Welcome to the documentation of the EUROSENTIMENT project¶
The aim of the EUROSENTIMENT project is to create a pool of multilingual language resources and services for Sentiment Analysis.
With this pool, both language resource owners and sentiment analysis service developers can share their knowledge and experience with the Sentiment Analysis community. It also enables content providers to benefit from sentiment analysis with little effort.
You might want to check the EUROSENTIMENT portal to start enjoying the language resource pool (LRP). It contains a list of all the existing language resources and services provided by the project partners. You can also register and contribute your own language resources and services.
If you’re still not sure about joining, please check our demonstrator to see what EUROSENTIMENT can offer you, and get an example of our language resources and services.
This is the reference documentation for the Language Resource Pool and its services, as well as the API used to access other information.
Getting started¶
Getting Access to the Platform¶
To consume the services provided by EUROSENTIMENT programmatically, you need to provide an access token in each request. It can be easily obtained following these simple steps:
Go to the EUROSENTIMENT LRP
Log in (or sign up if you don’t have an account already)
Visit your profile
Copy your Token
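Once you have your token, attach it to every request you make against the platform. The following sketch uses the Python requests library and shows the general pattern; the service URL is a placeholder, and the X-Eurosentiment-Token header is the one described later in the NIF API section:

import requests

# Minimal sketch: calling a EUROSENTIMENT service with your personal token.
# The service URL below is a placeholder -- use the access URL shown for the
# service you want to consume in the portal.
TOKEN = "paste-your-token-here"
SERVICE_URL = "https://portal.eurosentiment.eu/services/server/access/SentimentAnalysisExample"

response = requests.post(
    SERVICE_URL,
    headers={"X-Eurosentiment-Token": TOKEN, "Accept": "application/json"},
    data={"input": "I love EUROSENTIMENT", "informat": "text", "outformat": "json-ld"},
)
response.raise_for_status()
print(response.json())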
Managing Subscriptions¶
In this menu, users can check their account subscription, their activity as consumers/producers of services and resources, and their bills/reports.
My Subscription¶
By clicking on Subscription -> My Subscription, users are presented with this page:

In this page, users can see:
Their type of account (Language Resource Owner, Service Developer or Content Provider)
Their personal access token
The account registration start/end date, the day of their monthly bill calculation and the status of their account
Users can change the type of their account or reset the token at any time, using the Change modality and Reset token buttons.
My Consumptions¶
In the section Subscription -> My consumptions, users can see their own consumption statistics:

Every time a user accesses a language resource or an analysis service, the system tracks this access using the personal access token; once a day, the stats module reads the logs and calculates the statistics.
The section Subscription -> My consumers shows how other users have consumed the resources/services belonging to the current user.

Billing¶
In the section Subscription -> Billing, users can see their own bills:

The picture above shows the bills of a user; the tabs also allow the user to see revenue reports. In this case, the user hasn’t registered any services/resources, so the reports contain zero in every field. The total amount of fees is the sum of the subscription fee (200) and the consumption fees (0.5). The billing module uses the logs to calculate the bills. The calculation is performed once a month, on the same day of the month as the account registration date: for each account, the billing module calculates a bill and a revenue report.
Users can download their bills/reports:

The Eurosentiment Format¶
The Eurosentiment format is an extension of already established formats for Linguistic Linked Data (lemon, NIF, Onyx, Marl, etc.) for use in Sentiment Analysis. It covers the description of lexica and corpora on the language resource side, as well as the results returned by services.
In the following sections we will cover both cases individually.
The EUROSENTIMENT format for services and corpora¶
The Eurosentiment format is an extension of the NIF data model for use in Sentiment Analysis. It includes properties from Marl, Onyx and other ontologies that complement those in NIF for sentiment and emotion tagging. However, NIF and the Eurosentiment format differ in one respect: Eurosentiment sets JSON-LD as its primary serialisation format, whereas NIF defaults to RDF/XML or Turtle.
JSON-LD is a JSON-based format that makes it possible to embed semantic information in plain JSON objects. It retains full compatibility with JSON while adding useful information.
By using this serialisation format, Eurosentiment targets both semantic web developers and traditional developers alike.
Overview¶
{
  "@context": [
    "http://demos.gsi.dit.upm.es/eurosentiment/static/context.jsonld"
  ],
  "@id": {{ processID }},
  "analysis": [
    {
      "@id": {{ analysisID }},
      "@type": [
        {{ analysisType }}
      ],
      "prov:wasAssociatedWith": {{ agent }},
      "dc:language": {{ language }},
      "marl:maxPolarityValue": {{ maxValue }},
      "marl:minPolarityValue": {{ minValue }}
    }
    [...]
  ],
  "domain": {{ domain }},
  "entries": [
    {
      "@id": {{ entry_id }},
      "dc:subject": {{ topic }},
      "emotions": [
        {
          "prov:generatedBy": {{ analysisID }},
          "onyx:hasEmotion": [
            {
              "onyx:hasEmotionCategory": {{ emotions[i].category }},
              "onyx:intensity": {{ emotions[i].emotion_intensity }}
            },
            [...]
          ]
        }
        [...]
      ],
      "opinions": [
        {
          "prov:generatedBy": {{ analysisID }},
          "marl:polarityValue": {{ opinions[i].polarityValue }},
          "marl:hasPolarity": {{ opinions[i].polarity }},
          "marl:describesObject": {{ opinions[i].described_object }}
        },
        [...]
      ],
      "nif:isString": {{ string_representation }},
      "strings": [
        {
          "nif:anchorOf": {{ strings[i].value }},
          "itsrdf:taIdentRef": {{ strings[i].entity }},
          "nif:posTag": {{ strings[i].posTag }},
          "nif:lemma": {{ strings[i].lemma }}
        },
        [...]
      ]
    },
    [...]
  ]
}
- processID
- The ID of the process that gathered the results.
- domain
- Domain detected in the entries, or used by the analysis.
- analysis
A set of results can be produced by combining the results from several analysis processes. Each of them needs to be described here.
    - analysisID: Each analysis needs a unique URI so that the generated opinions/emotions can be linked to it. A set of results may aggregate the results of independent analyses (e.g. a sentiment analysis and an emotion analysis).
    - analysisType: The type of analysis, e.g. marl:SentimentAnalysis or onyx:EmotionAnalysis.
    - algorithm: [In marl] Algorithm that was used to generate the results.
    - agent: Responsible for or creator of the analysis.
- language
- Language that the analysis uses, e.g. “es”
- minValue
- [In marl opinions] Minimum possible value of the opinion polarity
- maxValue
- [In marl opinions] Maximum possible value of the opinion polarity
- domain
- Domain where the analysis was run. e.g. wnd:electronics
- entry_id
- Each entry must have a unique URI
- topic
- The subject or subjects of the entry. e.g. wnd:electronics
- emotions
The emotions found in the context. Depending on the theory of emotions used, emotions can be categorised and/or defined along different dimensions. This example represents the usual case, a model based on categories.
- category
- Category of the emotion. e.g. wna:Hatred
- emotion_intensity
- Intensity of the emotion as defined by the algorithm
- opinions
The opinions found in the context.
- polarity
- Polarity of the opinion. e.g. marl:Positive
- polarityValue
- Numerical value of the polarity, as a floating point
- described_object
- Object that the opinion is about
- string_representation
- Plain text representation
- strings
A NIF context can be subdivided into substrings, which have their own properties. This is usually done to associate a particular string with an entity in Named Entity Recognition.
- strings[i].value
- Text representation
- strings[i].entity
- Entity the string represents
- strings[i].posTag
- Part-of-speech tag
- strings[i].lemma
- Lemma of the word
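As a quick illustration of consuming this structure from a client, the sketch below reads the opinions of each entry with plain JSON access; no JSON-LD processing is needed for simple use cases, and the file name is a placeholder:

import json

# Sketch: read the opinions of each entry from a result document following the
# structure described above. "result.jsonld" is a placeholder file name.
with open("result.jsonld", encoding="utf-8") as f:
    result = json.load(f)

for entry in result.get("entries", []):
    text = entry.get("nif:isString", "")
    for opinion in entry.get("opinions", []):
        print(text, "->", opinion.get("marl:hasPolarity"), opinion.get("marl:polarityValue"))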
Context¶
The JSON-LD context contains semantic information about the properties in the JSON document, including convenient prefixes or namespaces. The Eurosentiment context would look like this:
{
  "@context": {
    "dc": "http://purl.org/dc/terms/",
    "dc:subject": {
      "@type": "@id"
    },
    "emotions": {
      "@container": "@list",
      "@id": "onyx:hasEmotionSet",
      "@type": "onyx:EmotionSet"
    },
    "marl": "http://www.gsi.dit.upm.es/ontologies/marl#",
    "nif": "http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core#",
    "onyx": "http://www.gsi.dit.upm.es/ontologies/onyx#",
    "opinions": {
      "@container": "@list",
      "@id": "marl:hasOpinion",
      "@type": "marl:Opinion"
    },
    "prov": "http://www.w3.org/ns/prov#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "analysis": {
      "@id": "prov:wasInformedBy"
    },
    "entries": {
      "@id": "prov:generated"
    },
    "strings": {
      "@reverse": "nif:hasContext",
      "@type": "nif:String"
    },
    "wnaffect": "http://www.gsi.dit.upm.es/ontologies/wnaffect#",
    "xsd": "http://www.w3.org/2001/XMLSchema#"
  }
}
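To see the effect of the context, a result document can be expanded with any JSON-LD processor. The sketch below assumes the third-party pyld library and that the context URL above is still reachable; the small document is purely illustrative:

from pyld import jsonld

# Illustrative sketch (assumes the "pyld" package and a reachable context URL):
# expanding a document against the Eurosentiment context resolves short names
# such as "marl:hasPolarity" to their full IRIs.
doc = {
    "@context": "http://demos.gsi.dit.upm.es/eurosentiment/static/context.jsonld",
    "opinions": [
        {"marl:hasPolarity": "marl:Positive", "marl:polarityValue": 9}
    ],
}
expanded = jsonld.expand(doc)
print(expanded)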
Examples¶
- Annotating one entry using a fictitious service (http://example.com/analyse) provided by http://example.com. Input: “My ipad is an awesome device”.
{
  "@context": [
    "http://demos.gsi.dit.upm.es/eurosentiment/static/context.jsonld"
  ],
  "results": {
    "analysis": [
      {
        "@id": "http://example.com/analyse",
        "@type": [
          "marl:SentimentAnalysis"
        ],
        "dc:language": "en",
        "marl:maxPolarityValue": 10.0,
        "marl:minPolarityValue": 0.0,
        "prov:wasAssociatedWith": "http://example.com"
      }
    ],
    "entries": [
      {
        "@id": "http://example.com/analyse?input=My%20ipad%20is%20an%20awesome%20device",
        "opinions": [
          {
            "marl:polarityValue": 9,
            "marl:hasPolarity": "marl:Positive",
            "marl:describesObject": "http://dbpedia.org/page/IPad",
            "prov:generatedBy": "http://example.com/analyse"
          }
        ],
        "nif:isString": "My ipad is an awesome device",
        "strings": [
          {
            "@id": "http://example.com/analyse?input=My%20ipad%20is%20an%20awesome%20device#char=3,6",
            "nif:anchorOf": "ipad",
            "itsrdf:taIdentRef": "http://dbpedia.org/page/IPad"
          }
        ]
      }
    ]
  }
}
- Annotating complex emotions in Spanish. Input: “Mi ipad me tiene harto”.
{
  "@context": [
    "http://demos.gsi.dit.upm.es/eurosentiment/static/context.jsonld"
  ],
  "results": {
    "analysis": [
      {
        "@id": "http://example.com/analyse",
        "@type": [
          "onyx:EmotionAnalysis"
        ],
        "dc:language": "es",
        "onyx:maxEmotionIntensity": 1.0,
        "onyx:minEmotionIntensity": 0.0,
        "prov:wasAssociatedWith": "http://example.com/"
      }
    ],
    "entries": [
      {
        "@id": "http://example.com/analyse?input=Mi%20ipad%20me%20tiene%20harto",
        "dc:language": "es",
        "opinions": [
        ],
        "emotions": [
          {
            "onyx:aboutObject": "http://dbpedia.org/page/IPad",
            "prov:generatedBy": "http://example.com/analyse",
            "onyx:hasEmotion": [
              {
                "onyx:hasEmotionCategory": "wna:dislike",
                "onyx:hasEmotionIntensity": 0.7
              },
              {
                "onyx:hasEmotionCategory": "wna:despair",
                "onyx:hasEmotionIntensity": 0.1
              }
            ]
          }
        ],
        "nif:isString": "Mi ipad me tiene harto",
        "prov:generatedBy": "http://example.com/analyse",
        "strings": [
          {
            "@id": "http://example.com/analyse?input=Mi%20ipad%20me%20tiene%20harto#char=3,6",
            "nif:anchorOf": "ipad",
            "itsrdf:taIdentRef": "http://dbpedia.org/page/IPad"
          }
        ]
      }
    ]
  }
}
Other serialisation formats¶
The Eurosentiment format is semantic, as is the NIF format. Although the preferred and most commonly used serialisation format is JSON-LD, other serialisation formats can be used as well.
For instance, it is particularly interesting to convert corpora to N-Triples for storage in a semantic server such as Virtuoso.
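As a sketch of such a conversion (assuming rdflib version 6.0 or later, which bundles JSON-LD support), a result or corpus document could be re-serialised as N-Triples like this:

from rdflib import Graph

# Sketch (assumes rdflib >= 6.0 with built-in JSON-LD support): parse a
# Eurosentiment JSON-LD document and re-serialise it as N-Triples, e.g. before
# loading a corpus into a triple store such as Virtuoso.
g = Graph()
g.parse("corpus.jsonld", format="json-ld")  # placeholder file name
print(g.serialize(format="nt"))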
Useful links¶
- NIF: http://persistence.uni-leipzig.org/nlp2rdf/
- JSON-LD: http://json-ld.org
The Eurosentiment Format for Lexica¶
The EUROSENTIMENT sentiment lexicons are represented in RDF using the lemon format, extended with some properties from the Marl vocabulary for representing sentiment information such as polarity and polarity value. Deliverable D4.3 (pp. 54-56) describes the model in detail.
The following figure shows an example of an RDF domain-specific lexicon.

In this figure we see snippets from two lexicons: a lexicon for the hotel domain in English (i.e. le:hotel) and a lexicon for the hotel domain in German (i.e. ld:hotel). The sentiment lexicons are composed of lexical entries: in our example lee:location and lee:pretty-good for the English lexicon and led:Lage for the German lexicon. Some of the lexical entries are domain aspects, such as location and Lage, and some are sentiment words, such as pretty good.
Each lexical entry is defined by a lemon canonicalForm, several lemon otherForm properties representing the morphological variations, several senses and part-of-speech information. The connection of the EUROSENTIMENT lexicons to other linguistic linked data datasets (i.e. DBpedia and WordNet) happens at the sense level. For instance, in our case, the sense of the lee:location lexical entry is linked to the http://dbpedia.org/page/Location entity and to the WordNet synset http://wordnet-rdf.princeton.edu/wn31/100027365-n.
In the case of sentiment lexical entries we use two more properties (i.e. marl:polarityValue and marl:hasPolarity) to represent the sentiment information. In our example, lee:pretty-good has a positive polarity of 0.75, where the most positive value is 1 and the most negative value is -1.
The domain-specific aspect of the EUROSENTIMENT lexicons is given by the lemon property context, which connects a sentiment word to a domain aspect. In the example, the lmn:context property signifies that the sentiment word pretty good has a positive value of 0.75 in the context of the location domain aspect.
We recall from D4.3 that we generate domain-specific lexicons in other languages based on the initial English lexicon. For example, in our case, the German lexicon for hotel was automatically generated from the English lexicon for hotel. This relation between the two lexicons is represented via the isocat:translationOf relation between the senses of the translated lexical entries.
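For orientation only, the following sketch builds one such sentiment lexical entry with rdflib. It uses the lemon and Marl namespaces that appear elsewhere in this documentation, but the entry URI and the exact shape of the published lexicons (defined in D4.3) are assumptions:

from rdflib import Graph, Namespace, Literal, URIRef, BNode

# Illustrative sketch only: a sentiment lexical entry using lemon and Marl.
# The entry URI and node structure are hypothetical; the authoritative model
# is described in Deliverable D4.3.
LEMON = Namespace("http://lemon-model.net/lemon#")
MARL = Namespace("http://www.gsi.dit.upm.es/ontologies/marl#")

g = Graph()
g.bind("lemon", LEMON)
g.bind("marl", MARL)

entry = URIRef("http://example.org/lexicon/hotel/en/pretty-good")  # hypothetical URI
form = BNode()
sense = BNode()

g.add((entry, LEMON.canonicalForm, form))
g.add((form, LEMON.writtenRep, Literal("pretty good", lang="en")))
g.add((entry, LEMON.sense, sense))
g.add((sense, MARL.hasPolarity, MARL.Positive))
g.add((sense, MARL.polarityValue, Literal(0.75)))

print(g.serialize(format="turtle"))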
Tools¶
Eurosentiment Corpus Converter¶
Introduction¶
The purpose of the EUROSENTIMENT Corpus Converter is to translate information from legacy or non-semantic formats to the semantic formats used in EUROSENTIMENT.
The Corpus Converter has been tested with the corpora for sentiment and emotion analysis from the members of the consortium (mainly from Expert System and Paradigma Tecnológico). These corpora have been transformed to JSON-LD, using the Marl and Onyx vocabularies. Nevertheless, the Corpus Converter was designed as a generic tool to translate from and to a wide range of formats and vocabularies or ontologies.
Translating a document does not require any technical qualification. The translation of documents can be done through a web portal. This is especially useful for demonstration and testing purposes or quick translations. However, in a real-life scenario there will be a large amount of information and files to be translated. Also, it should be possible to integrate our Corpus Converter in a pipeline or automated process, so that new content can be automatically converted. For these reasons, the Corpus Converter also exposes a REST API. It only takes a POST request to translate a document.
The Corpus Converter has been designed to be extensible and to separate the technical aspects from the content and formats being translated. Our Corpus Converter itself is a convenient platform, but the actual translation is performed following a set of “Translation Templates”. These templates have access to the data in the original file, and determine the result of the translation.
An administrative web interface has been developed to make it easy to add new formats, improve translation templates or access the usage statistics.
Architecture¶
As can be seen in the figure below, the Corpus Converter is made up of a Web Server, which proxies all user requests to the Request Processor, and the Request Processor itself. The processor is a Django application with two roles: providing a REST interface and an administrative interface. On the administrative side it deals with authorisation and authentication of users, and it stores the translation templates provided by the staff in the database. On the REST side, it forwards all valid user translation requests to the Translator module. This module opens the original file in any of the supported formats, applies the requested translation template to its content, and then returns the resulting document.

Translation Templates¶
For convenience, we chose an easy-to-use templating language and engine for the Corpus Converter: Jinja2. Despite its user-friendliness, it is a really powerful language. It features advanced loops, conditional clauses, functions and filters, and is essentially a stripped-down subset of Python.
The file contents can be accessed from within the template like a stream or iterator. The Translator reads the original document, feeding it line by line (for plain text) or row by row (in spreadsheet files). Then, using Jinja2, it is easy to iterate over it and extract all the relevant information.
In addition to all the basic filters in Jinja2, the EUROSENTIMENT templates can also use a set of specific filters that make it easier to tokenise the input.
The annex contains a complete template.
Supported Formats¶
As of this writing, the EUROSENTIMENT Corpus Converter accepts corpora in the following formats:
- Paradigma Tecnológico’s Human Annotated Corpora (Tab-Separated-Values)
- Paradigma Tecnológico’s Machine Annotated Corpora (Tab-Separated-Values)
- Paradigma Tecnológico’s Synset-Aligned Corpora (Tab-Separated-Values)
- Expert System’s Machine Annotated Corpora (Microsoft Excel)
- Expert System’s Human Annotated Corpora (Microsoft Excel)
- Expert System’s Corpora with Emotions (Microsoft Excel)
- TripAdvisor corpora (Raw text with custom tags)
Usage¶
Translating a document¶
Documents can be translated via the Web Interface or using the REST interface. The web form is simply a convenient way of accessing the REST interface: it shows all the available templates and a field to upload the desired file.

Translating a document through the web
The Corpus Converter endpoint takes the following parameters:
- input (i): The original file to be translated
- informat (f): The format of the original file
- intype (t) [Optional]:
- direct (default)
- url
- file
- outformat (o):
- json-ld
- rdfxml
- turtle (default, to comply with NIF)
- ntriples
- trix
- base URI (u) [Optional]: base URI to use for the corpus
- prefix (p) [Optional]: prefix to replace the base URI
- template (t) [Optional]: ID of the template to use. If it is omitted, a template to convert from informat to outformat will be used, or a template from informat to another format (e.g. json-ld), with automatic conversion.
- toFile [Optional]: Whether the result should be sent in the response (default) or written to a file. For convenience, this value defaults to False when using the Web Form.
Using the command line tool curl, a request can be made like this:
curl http://demos.gsi.dit.upm.es/eurosentiment/marlgenerator/process -F"intype=file" -F"informat=Example" -F"outformat=jsonld" -F"input=@input-file.csv" > result.jsonld
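The same request can be made from Python. The sketch below mirrors the curl call above; the endpoint and form fields are taken from that command, and the use of the requests library is an assumption:

import requests

# Sketch: the same conversion request as the curl example above, from Python.
with open("input-file.csv", "rb") as f:
    r = requests.post(
        "http://demos.gsi.dit.upm.es/eurosentiment/marlgenerator/process",
        data={"intype": "file", "informat": "Example", "outformat": "jsonld"},
        files={"input": f},
    )
r.raise_for_status()
with open("result.jsonld", "w", encoding="utf-8") as out:
    out.write(r.text)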
Adding a template¶
Editing a template is simple. First, visit the administration URL. If it is your first login or if your session expired, you will be greeted by a login screen:

Login prompt
Just enter your username and password, and the administration interface should appear.

Administration Interface

Editing a template
It is also possible to add a format from this menu, clicking on the “Plus” icon:

Adding a format on the fly
Checking usage statistics¶
Once logged in as a superuser, you can also add new users and check the requests that have been made for each format.

Superuser panel
To check the requests, click on “Translation Requests” in the administration panel.

Log of requests
In addition to simply checking the requests, it is also possible to filter the requests using different parameters. This feature is especially useful if you want to study the popularity of a format, or to compare different templates for the same formats.

Filtering requests
Example Template¶
{
  "@context": [
    "http://demos.gsi.dit.upm.es/eurosentiment/static/context.jsonld"
  ],
  "@id": "{{ linesplit(f.name,"/")[-1] }}",
  "analysis": [
    {
      "@id": "{{ linesplit(f.name,"/")[-1] }}#MachineAnnotated",
      "@type": [
        "marl:SentimentAnalysis"
      ],
      {% if language %}
      "dc:language": "{{ language }}",
      {% endif %}
      "marl:maxPolarityValue": 10.0,
      "marl:minPolarityValue": 0.0,
      "prov:wasAssociatedWith": "pt:agent"
    }
  ],
  "entries": [
    {% for line in f %}
    {% set i = linesplit(line, "\t") %}
    {% set node = "_:BlankNode%s" % loop.index %}
    {% set text = i[0] %}
    {% set syntax = linesplit(i[1][1:-1], ",") %}
    {% set pol = i[2] | float %}
    {
      "@id": "{{ node }}",
      "opinions": [
        {
          {% if pol %}
          "marl:polarityValue": {{ pol }},
          {% if pol > 5 %}
          "marl:hasPolarity": "marl:Positive"
          {% elif pol < 5 %}
          "marl:hasPolarity": "marl:Negative"
          {% else %}
          "marl:hasPolarity": "marl:Neutral"
          {% endif %}
          {% endif %}
        }
      ],
      "nif:isString": {{ text | escapejs }},
      "strings": [
        {% for s in syntax %}
        {% set parts = linesplit(s, ";;") %}
        {
          "nif:anchorOf": {{ parts[0] | escapejs }},
          "nif:posTag": "pt:{{ parts[1] }}",
          "nif:lemma": {{ parts[2] | escapejs }}
        }{% if not loop.last %}, {% endif %}
        {% endfor %}
      ]
    }{% if not loop.last %}, {% endif %}
    {% endfor %}
  ]
}
Eurosentiment Playground¶
EUROSENTIMENT provides services and resources for Sentiment Analysis in several languages. Several utilities, code snippets and instructions on how to make use of the platform are publicly available. However, all of them require installing a third-party tool or using a programming language to consume the API. The EUROSENTIMENT Playground solves this problem by providing an easy-to-use web interface to make API calls. Read our simple instructions and start using EUROSENTIMENT today!
The playground is available here: http://demos.gsi.dit.upm.es/eurosentiment-playground/
Language Resource Adaptation Pipeline¶
In this section we describe the various components of the Language Resource Adaptation Pipeline and provide links to the source code for these components.
The Language Resource Adaptation Pipeline (a.k.a. LRAP) implements a methodology for legacy language resource adaptation that generates domain-specific sentiment lexicons organized around domain entities, described with lexical information, and sentiment words described in the context of these entities.
The outcomes of the Language Resource Adaptation Pipeline are annotated corpora represented in the NIF/Marl format and domain-specific sentiment lexicons represented in RDF using the lemon/Marl format. The legacy language resources are enriched with semantics and additional linguistic information from resources like DBpedia and BabelNet.
There are four main steps in the LRAP, as shown in the figure below:

- Corpus Conversion: normalizes the different language resources to a common schema based on Marl and NIF. The corpus converter tool was described earlier in a separate section.
- Semantic Analysis: extracts the domain-specific entity classes and named entities and identifies links between these entities and concepts from the LLOD Cloud. The Semantic Analysis step consists of: Domain Modeller (DM), Entity Extraction (EE), Entity Linking (EL) and Synset Identification (SI) components.
- Sentiment Analysis: extracts contextual sentiments and identifies SentiWordNet synsets corresponding to these contextual sentiment words. The Sentiment Analysis step consists of: Domain-Specific Sentiment Polarity Analysis (DSSA) and Sentiment Synset Identification (SSI) components.
- Lexicon Generator: uses the results of the previous steps, enhances them with multilingual and morphosyntactic information and converts the results into a lexicon based on the lemon and Marl formats. The Lexicon Generator step consists of: MorphoSyntactic Enrichment (ME), Machine Translation (T) and lemon/Marl Generator (LG) components.
Different language resources are processed with variations of the given adaptation pipeline. For more details on the LRAP and the domain-specific lexicons it generates please check our dissemination material:
- Presentation at the 5th International Workshop on EMOTION, SOCIAL SIGNALS, SENTIMENT & LINKED OPEN DATA: “Generating Linked-Data based Domain-Specific Sentiment Lexicons from Legacy Language and Semantic Resources” - Gabriela Vulcu, Paul Buitelaar, Sapna Negi, Bianca Pereira, Mihael Arcan, Barry Coughlan, Fernando J. Sanchez and Carlos A. Iglesias
- Poster at the Data Challenge at the 3rd Workshop on Linked Data in Linguistics “Linked-Data based Domain-Specific Sentiment Lexicons” - Gabriela Vulcu, Raul Lario Monje, Mario Munoz, Paul Buitelaar and Carlos A. Iglesias
Services and Resources¶
Services¶
The Eurosentiment Portal offers a series of services that are useful for Sentiment Analysis such as entity recognition or domain detection, as well as sentiment and emotion analysis themselves.
How to add a new service¶
This section describes how a new service is added to the EUROSENTIMENT LRP.
- Service upload steps:
Precondition: the user is logged in at https://portal.eurosentiment.eu/ with their username and password
Click on the ‘Services’ tab -> ‘Add a service’
- Fill in the ‘Service creation form’
- Name -> give a name for your service
- Description -> add a detailed description of your service
- HTTP method -> select the HTTP method and fill in the service’s access URI
- Credentials -> if the service endpoint is not public, please provide an access token
- Request cost -> fill in the cost per request to your service
- Request limit -> fill in the limit of requests a user of the service should not exceed
- Language -> select from the language drop-down list
- Domain -> select from the drop-down list
- Fill in the Fact sheet with similar details and send an email to eurosentimentpt@gmail.com
Click on the ‘Create’ button
NIF API¶
- GET /services/server/access/(service_id)¶
Access the service at service_url. The service_id can be retrieved from the service page in the Portal. Since the requests to the server are likely to be long, POST /services/server/access/(service_id) is recommended.
Example request:
GET /service/access/SentimentAnalysisExample?input=I%20love%20EUROSENTIMENT HTTP/1.1
Host: eurosentiment.eu
Accept: application/json, text/javascript
Example response:
HTTP/1.1 200 OK
Vary: Accept
Content-Type: text/javascript

{
  "@context": [
    "http://eurosentiment.eu/context.jsonld",
    {
      "@base": "http://eurosentiment.eu/service/access/SentimentAnalysisExample#"
    }
  ],
  "analysis": [
    {
      "@id": "SentimentAnalysisExample",
      "@type": "marl:SentimentAnalysis",
      "dc:language": "en",
      "marl:maxPolarityValue": 10.0,
      "marl:minPolarityValue": 0.0
    }
  ],
  "domain": "wndomains:electronics",
  "entries": [
    {
      "opinions": [
        {
          "prov:generatedBy": "SentimentAnalysisExample",
          "marl:polarityValue": 7.8,
          "marl:hasPolarity": "marl:Positive",
          "marl:describesObject": "http://eurosentiment.eu"
        }
      ],
      "nif:isString": "I love EUROSENTIMENT",
      "strings": [
        {
          "nif:anchorOf": "EUROSENTIMENT",
          "nif:taIdentRef": "http://eurosentiment.eu"
        }
      ]
    }
  ]
}
Query Parameters: - input (i) – No default. Depends on informat and intype
- informat (f) – one of turtle (default), text, json-ld
- intype (t) – one of direct (default), url
- outformat (o) – one of turtle (default), text, json-ld
- prefix (p) – prefix for the URIs
Request Headers: - Accept – the response content type depends on Accept header
- X-Eurosentiment-Token – optional OAuth token to authenticate
Response Headers: - Content-Type – this depends on Accept header of request
Status Codes: - 200 OK – no error
- 404 Not Found – service not found
- POST /services/server/access/(service_id)¶
The same as the previous method. This is the recommended method.
Form Parameters: - i/input – No default. Depends on informat and intype
- f/informat – one of turtle (default), text, json-ld
- t/intype – one of direct (default), url
- o/outformat – one of turtle (default), text, json-ld
- p/prefix – prefix for the URIs
Request Headers: - Accept – the response content type depends on Accept header
- X-Eurosentiment-Token – optional OAuth token to authenticate
Response Headers: - Content-Type – this depends on Accept header of request
Status Codes: - 200 OK – no error
- 404 Not Found – service not found
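As a sketch of the recommended POST call from Python (the host is illustrative; use the access URL listed for the service in the portal and your own token):

import requests

# Sketch: POST /services/server/access/(service_id) with the form parameters
# documented above. Host, service id and token are placeholders.
resp = requests.post(
    "https://portal.eurosentiment.eu/services/server/access/SentimentAnalysisExample",
    headers={"X-Eurosentiment-Token": "your-token", "Accept": "application/json"},
    data={
        "input": "I love EUROSENTIMENT",
        "informat": "text",
        "intype": "direct",
        "outformat": "json-ld",
    },
)
print(resp.status_code)
print(resp.text)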
Create your own services¶
If, instead of using any of the provided services, you want to roll your own, you can also contribute to the Eurosentiment pool by publishing your service. The Eurosentiment github repository contains a series of tutorials that will help you get started with the process, including, as of this writing, complete examples of sentiment analysis services in different programming languages.
Resources¶
How to add a new language resource¶
This section describes how a new language resource is added to the EUROSENTIMENT LRP. You can also watch a video (https://www.dropbox.com/s/86dqo4u4k9gf0i6/EUROSENTIMENT-adding-a-new-LR.mp4?dl=0) that describes the process step by step.
- Resource upload steps:
Precondition: the user is logged in at https://portal.eurosentiment.eu/ with their username and password
Click on the ‘Resources’ tab -> ‘Add a resource’
- Fill in the ‘Resource creation form’
- Name -> give a name for your LR
- Graph URI prefix -> the platform will suggest a unique graph URI (if you are happy with it, leave it as it is)
- Short Name -> give a short name for your LR. It will be used to dynamically generate the graph URI
- Language -> select from the language drop-down list
- Application Domain -> select from the drop-down list
- Description -> add a detailed description of your language resource.
- Resource type -> choose from the drop-down list the resource type you want to add.
- Access control -> select the type of license under which you want to publish the LR
- Upload the LR file
- Fill in the Fact sheet with similar details and send an email to eurosentimentpt@gmail.com
Click on the ‘Create’ button
- What will happen behind the scenes:
- an email is sent to the EUROSENTIMENT LRP
- you are notified by email that your LR was submitted and will be reviewed before being added to the LRP as soon as possible
- an administrator will carefully read the resource submission and will decide if the language resource will be added to the language resource pool
- if the LR is provided in an existing format then move to 4.6
- if the LR is provided in a new format not known previously to the platform, the administrator will develop a specific language resource adaptation pipeline
- the administrator runs the language resource adaptation pipeline; the provided LR is processed, converted to RDF and linked data, and the result is uploaded to the SPARQL endpoint
- you are notified by email that your language resource was processed
Click on the ‘Resources’ -> ‘List Own resources’ and see your newly added LR that can be used by you or other users of the EUROSENTIMENT LRP
Resources API¶
- POST /sparql¶
Run a SPARQL query against the language resource pool. The query and the desired result format are sent in the request body as JSON.
Example request:
POST /sparql HTTP/1.1
Host: eurosentiment.eu
Content-Length: 199
x-eurosentiment-token: 23aee871-d18d-4afa-c2e3-283f8ae9232ca
Accept-Encoding: gzip, deflate, compress
Accept: */*
content-type: application/json

{
  "query": "PREFIX lemon: <http://lemon-model.net/lemon#>\nSELECT * FROM <http://www.eurosentiment.eu/dataset/hotel/es/gabvul/9/lexicon>\nWHERE {?s lemon:sense ?sense }",
  "format": "application/json"
}
Example response:
HTTP/1.1 200 OK
date: Mon, 07 Jul 2014 16:46:21 GMT
content-length: 596
content-type: application/json
server: Jetty(8.1.10.v20130312)

{
  "head": {
    "link": [],
    "vars": [ "s", "sense" ]
  },
  "results": {
    "bindings": [
      {
        "s": {
          "type": "uri",
          "value": "http://www.eurosentiment.eu/dataset/hotel/es/gabvul/9/lexicalentry/H2O"
        },
        "sense": {
          "type": "uri",
          "value": "http://www.eurosentiment.eu/dataset/hotel/es/gabvul/9/lexicalentry/sense/H2O_0"
        }
      },
      {
        "s": {
          "type": "uri",
          "value": "http://www.eurosentiment.eu/dataset/hotel/es/gabvul/9/lexicalentry/a_gusto"
        },
        "sense": {
          "type": "uri",
          "value": "http://www.eurosentiment.eu/dataset/hotel/es/gabvul/9/lexicalentry/sense/a_gusto_0"
        }
      }
    ],
    "distinct": false,
    "ordered": true
  }
}
Request Headers: - Accept – the response content type depends on Accept header
- Authorization – optional OAuth token to authenticate
Response Headers: - Content-Type – this depends on Accept header of request
Status Codes: - 200 OK – no error
- 404 Not Found – resource not found
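The request above can be reproduced from Python as in the following sketch (host, token and the dataset graph URI are placeholders; use your own token and graph):

import requests

# Sketch: POST a SPARQL query to the /sparql endpoint. Host, token and the
# dataset graph URI are placeholders.
query = (
    "PREFIX lemon: <http://lemon-model.net/lemon#>\n"
    "SELECT * FROM <http://www.eurosentiment.eu/dataset/hotel/es/gabvul/9/lexicon>\n"
    "WHERE { ?s lemon:sense ?sense }"
)

resp = requests.post(
    "http://eurosentiment.eu/sparql",
    headers={"x-eurosentiment-token": "your-token"},
    json={"query": query, "format": "application/json"},
)
for binding in resp.json()["results"]["bindings"]:
    print(binding["s"]["value"], "->", binding["sense"]["value"])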
Language Resource Types¶
This section describes the types of language resources that the EUROSENTIMENT LRP supports. Based on the initial list of language resources (see here) from the project, we identified the following supported formats. Please note that the numbers in brackets refer to the language resources addressed by each listed format.
1. domain-specific-lexicon-TSV (5): Tab-separated-values file that describes sentiments in the context of domain aspects (e.g. myFile.tsv); a short parsing sketch for this format follows the list. The header of the TSV file should have the following columns:
entityWNid entityPOS entity sentiWNid sentiPOS sentiment sentiScore
where:
- entityWNid: WordNet30 synset ID of the domain aspect (e.g. 02671062).
- entityPOS: part of speech of the domain aspect (i.e. n, a, v, r).
- entity: domain aspect as string (e.g. “access”).
- sentiWNid: WordNet30 synset ID of the sentiment associated with the domain aspect (00979366).
- sentiPOS: part of speech of the sentiment word (i.e. n, a, v, r).
- sentiment: sentiment word as string (e.g. “quick”)
- sentiScore: the polarity value of the sentiment (a rational number between -1 and 1, where -1 is very negative, 0 is neutral and 1 is very positive).
e.g.
02671062 n access 00979366 a quick 0.5
2. sentiment-lexicon-CSV (45 (needs to be converted), 54): Comma-separated-values file that describes the sentiment words and their polarities from a domain (e.g. myFile.csv). The header of the CSV file should have the following columns:
sentiment,sentimentPOS,sentimentScore,morphosyntacticVariations
where:
- sentiment: the sentiment word in the domain.
- sentimentPOS: the part-of-speech of the sentiment word.
- sentimentScore: the polarity value of the sentiment (a rational number between -1 and 1, where -1 is very negative, 0 is neutral and 1 is very positive).
- morphosyntacticVariations: the sentiment word’s morphosyntactic variations.
e.g.
Besserung|NN 0.40 Besserungen,Besserungnen
3. review-corpus-no-polarity (36): A file containing one review per line.
e.g.
The location was great and the staff friendly. I like it!
The room was a bit too small.
…
4. review-corpus-overall-polarity-TSV (3, 31, 34): A tab-separated-values file with reviews and overall polarity. The header of the TSV file should have the following columns:
reviewText overallPolarity
where:
- reviewText: A string that contains the review (no tabs in the string).
- overallPolarity: The overall polarity of the given review text (rational number between -1 and 1, where -1 is very negative, 0 is neutral and 1 is very positive).
e.g.
Rien à dire. Très bon produit de qualité. 1.0
5. review-corpus-pos-lemma-wn-overall-polarity-TSV (2, 4): A tab-separated-values file with columns of the following form:
reviewText annotation overallPolarity
where:
- reviewText: A string that contains the review (no tabs in the string).
- annotation: A list with comma-separated values for each word in the text review containing: word, part-of-speech, lemma and a list of possible WordNet30 synset IDs.
- overallPolarity: The overall polarity of the given review text (rational number between -1 and 1, where -1 is very negative, 0 is neutral and 1 is very positive).
e.g.
Excellent location. [Excellent;;JJ;;excellent;;[], location;;NN;;location;;[n#01051331]] 0.8
6. review-corpus-pos-lemma-wn-overall-polarity-Excel (7,8,9,10,11,12,13,14,17,18,20,21,22,50,51,52,53): A .gz compressed folder containing Excel files with several reviews per file. Each review in the Excel file is spread over several lines. The header of the Excel file is: TEXT LEMMA WN_POS WN_SYNSET DOMAIN SENTIMENT EMOTION.
where:
- TEXT: the full review text in the first line; subsequent lines have one word of the review per line.
- LEMMA: nothing in the first line; in subsequent lines it describes the lemma value of an individual word from the review.
- WN_POS: nothing in the first line; in subsequent lines it describes the part-of-speech value of an individual word from the review.
- WN_SYNSET: nothing in the first line; in subsequent lines it describes the WordNet30 synset ID value of an individual word from the review.
- DOMAIN: the domain of the review (only in the first line)
- SENTIMENT: (only in the first line)
- EMOTION: (only in the first line)
7. dataset-rdf (25, 26, 27, 29, 37 (needs conversion to RDF), 42, 55): RDF dump (*.nt.gz) of a linked data dataset such as WordNet, DBpedia or BabelNet.
8. aspects-review-corpus-TripAdvisor (49): A file with annotated reviews and aspect ratings. Each review in the file is spread over several lines, where each line starts with a dedicated tag as in the example below.
e.g.
<Author>IndieLady
<Content>Lovely hotel, unique decor, friendly front desk staff […]
<Date>Nov 13, 2008
<No. Reader>-1
<No. Helpful>-1
<Overall>4
<Value>5
<Rooms>4
<Location>5
<Cleanliness>4
<Check in / front desk>5
<Service>5
<Business service>-1
9. aspects-review-corpus-Amazon (44): A file that consists of plain text reviews for products, with custom rating annotations that span several lines. The marker for a new review is [t], whereas the numbers in brackets stand for the rating of a certain aspect in the review. See the example below:
e.g.
[ t ] the best 4mp compact digital available camera[+2]## this camera is perfect for an enthusiastic amateur photographer . picture [+3] ,
macro[+3]## the pictures are razor sharp , even in macro . . .
10. Opener lexicon: Semicolon-separated-values file with the following columns:
wordnetSynsetID; POStag; polarity; confidence; lemmas; manualReviewFlag
where:
- wordnetSynsetID: WordNet30 synset ID.
- POStag: part-of-speech tag
- polarity: sentiment polarity, which can be -1, 0 or 1 for negative, neutral and positive respectively.
- confidence: confidence assigned by the propagation algorithm
- lemmas: lemmas of this synset in WordNet, separated by commas
- manualReviewFlag: -1 if no manual review has been done and +1 if it has been reviewed.
e.g.
eng-30-09366317-n;n;positive;0.3125;natural_elevation,elevation;-1
eng-30-07961016-n;n;neutral;0.3125;clod,glob,ball,chunk,clump,lump;-1
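As a quick illustration of working with these formats, the sketch below reads a domain-specific-lexicon TSV file (format 1 above). It assumes the first row is the documented header; the file name is the placeholder used in that description:

import csv

# Sketch: read the domain-specific-lexicon TSV format (format 1 above).
# Assumes the first row is the documented header; "myFile.tsv" is a placeholder.
with open("myFile.tsv", newline="", encoding="utf-8") as f:
    reader = csv.DictReader(f, delimiter="\t")
    for row in reader:
        print(row["entity"], row["sentiment"], float(row["sentiScore"]))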
Language Resource Pool Management¶
The LRP Management Application API¶
It is possible to retrieve management information from the LRP such as listing a user’s own services or the subscription details.
The basic endpoint is: http://217.26.90.243:8080/EuroSentimentServices/services/server
- GET /listAll¶
Get a list of all the services in the platform.
Example request:
GET /listAll
Host: portal.eurosentiment.eu
Accept: application/json, text/javascript
Example response:
HTTP/1.1 200 OK
Vary: Accept
Content-Type: text/javascript

[
  {
    "credentials": "",
    "lastModification": "2014-06-30 10:24:47.0",
    "request_limit": 10000,
    "serviceMethod": "POST",
    "serviceUrl": "http://54.187.254.3:8000/language_detector",
    "sid": "sptdl0407",
    "state": "enabled",
    "url": "http://217.26.90.243:8080/EuroSentimentServices/services/server/access/sptdl0407"
  }
]
Status Codes: - 200 OK – no error
- 404 Not Found – resource not found
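A sketch of the same call from Python, using the base endpoint documented above (the response keys are those shown in the example; add the token header if your deployment requires authentication):

import requests

# Sketch: list all services registered in the platform. Add the
# X-Eurosentiment-Token header if the deployment requires authentication.
BASE = "http://217.26.90.243:8080/EuroSentimentServices/services/server"
resp = requests.get(BASE + "/listAll", headers={"Accept": "application/json"})
for service in resp.json():
    print(service["sid"], service["serviceMethod"], service["serviceUrl"], service["state"])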
Todo
Describe the LRPMA API in detail
FAQ¶
Frequently Asked Questions¶
- What is EUROSENTIMENT?
EUROSENTIMENT is a platform for the distribution of multilingual sentiment analysis resources available online for the first time.
- How does it work?
The EUROSENTIMENT platform works as a hub where different players can sell and buy sentiment analysis resources and services.
- What are the benefits of using EUROSENTIMENT?
EUROSENTIMENT is a comprehensive marketplace for implementing a fully featured service.
- Which kinds of roles does the platform support?
The platform supports Service Providers (anybody selling a service), Resource Providers (anybody selling a linguistic resource), Content Providers (anybody selling content) and Consumers (anybody buying services, resources and content).
- How can EUROSENTIMENT help me develop my business?
EUROSENTIMENT is the common place where demand and supply meet to provide the critical mass for boosting your activity.
- How can I earn from EUROSENTIMENT?
As a sentiment analysis solution implementer, you can sell your services and also the resources they are based on.
- What if I’m a researcher?
EUROSENTIMENT foresees licensing and access facilitation for academic use.
- Will I benefit from using the EUROSENTIMENT resources in my research?
EUROSENTIMENT aims to distribute the best resources available on the market, providing your research with state-of-the-art tools to validate and extend your results.
- Which types of billing does the EUROSENTIMENT platform support?
Accessed resources are billed by download volume, and services by processing volume.
- Does the platform support an API?
The EUROSENTIMENT platform provides a fully featured REST-based API.
- Which domains are supported?
Currently we support the Electronics and Hotels domains.
- Which languages are supported?
Currently we support English, Spanish, Italian, French, Catalan and Portuguese.
- Can I test the EUROSENTIMENT services?
The EUROSENTIMENT platform provides a demo area that allows you to test all available resources.
- Which kinds of license restrictions are supported/applied?
The licenses are defined by the resource owners and will follow their specific policies.
- Which security standards does it offer?
The platform provides standard HTTPS certificate-based security. Registered users are restricted by configurable group operational grants.
- What can I do as a Service Provider?
As a Service Provider, you can register your already implemented service with the platform, which provides user accounting, billing and tracking features.
- What can I do as a Language Resource Owner?
As a Language Resource Owner, you can upload your resources for public download, subject to payment and licence acceptance.
- What can I do as a Content Provider?
As a Content Provider, you can upload your content for public download, subject to payment and licence acceptance.
To-Do¶
Todo
Describe the LRPMA API in detail