OLAC Record
oai:www.clarin.si:11356/1054

Metadata
Title:Twitter sentiment for 15 European languages
Bibliographic Citation:http://hdl.handle.net/11356/1054
Creator:Mozetič, Igor
Grčar, Miha
Smailović, Jasmina
Date (W3CDTF):2016-02-23T10:08:53Z
Date Available:2016-04-25T21:45:18Z
Description:The dataset contains over 1.6 million tweets (tweet IDs), labeled with sentiment by human annotators. There are 15 Twitter corpora for the corresponding 15 European languages. The data can be used to train and evaluate Twitter sentiment classifiers, to compute annotator agreement, or to study the differences between language usage on Twitter. The data analysis is described in the paper: I. Mozetič, M. Grčar, J. Smailović. Multilingual Twitter sentiment classification: The role of human annotators, PLoS ONE 11(5): e0155036, doi: 10.1371/journal.pone.0155036, 2016. (http://dx.doi.org/10.1371/journal.pone.0155036)
Identifier (URI):http://hdl.handle.net/11356/1054
Language:Albanian
Bosnian
Bulgarian
Croatian
English
German
Hungarian
Polish
Portuguese
Serbian
Russian
Slovak
Slovenian
Spanish
Swedish
Language (ISO639):sqi
bos
bul
hrv
eng
deu
hun
pol
por
srp
rus
slk
slv
spa
swe
Publisher:Jožef Stefan Institute
Rights:Creative Commons - Attribution-ShareAlike 4.0 International (CC BY-SA 4.0)
https://creativecommons.org/licenses/by-sa/4.0/
Subject:sentiment classification
Twitter
inter-annotator agreement
annotator self-agreement
Type:corpus
Type (DCMI):Text
Type (OLAC):primary_text

OLAC Info

Archive:  Slovenian language resource repository CLARIN.SI
Description:  http://www.language-archives.org/archive/clarin.si
GetRecord:  OAI-PMH request for OLAC format
GetRecord:  Pre-generated XML file

OAI Info

OaiIdentifier:  oai:www.clarin.si:11356/1054
DateStamp:  2017-06-27
GetRecord:  OAI-PMH request for simple DC format
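
The GetRecord entries listed in this record correspond to standard OAI-PMH requests. As a rough sketch of how such a request URL could be assembled: the base URL below is a placeholder (this record does not state the repository's OAI-PMH endpoint), and the `olac` metadata prefix is an assumption based on common OLAC practice; `oai_dc` is the prefix the OAI-PMH specification defines for simple Dublin Core.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: the record does not give the repository's
# OAI-PMH base URL, so this placeholder must be replaced.
BASE_URL = "https://example.org/oai"

# Identifier taken from the OaiIdentifier field of this record.
OAI_IDENTIFIER = "oai:www.clarin.si:11356/1054"


def get_record_url(identifier, metadata_prefix, base_url=BASE_URL):
    """Build an OAI-PMH GetRecord request URL for one record."""
    query = urlencode({
        "verb": "GetRecord",
        "identifier": identifier,
        "metadataPrefix": metadata_prefix,
    })
    return f"{base_url}?{query}"


# Simple Dublin Core (prefix defined by the OAI-PMH specification).
dc_url = get_record_url(OAI_IDENTIFIER, "oai_dc")

# OLAC format (prefix "olac" is assumed here, not confirmed by the record).
olac_url = get_record_url(OAI_IDENTIFIER, "olac")

print(dc_url)
print(olac_url)
```

The colons and slash in the identifier are percent-encoded by `urlencode`, which OAI-PMH requires for special characters in request arguments.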

Search Info

Citation: Mozetič, Igor; Grčar, Miha; Smailović, Jasmina. 2016. Twitter sentiment for 15 European languages. Jožef Stefan Institute.
Terms: area_Europe country_BA country_BG country_DE country_ES country_GB country_HR country_HU country_PL country_PT country_RS country_RU country_SE country_SI country_SK dcmi_Text iso639_bos iso639_bul iso639_deu iso639_eng iso639_hrv iso639_hun iso639_pol iso639_por iso639_rus iso639_slk iso639_slv iso639_spa iso639_sqi iso639_srp iso639_swe olac_primary_text


http://www.language-archives.org/item.php/oai:www.clarin.si:11356/1054
Up-to-date as of: Sun Sep 24 1:36:19 EDT 2017