OLAC Record
oai:www.clarin.si:11356/1096

Metadata
Title: Serbian Twitter training corpus ReLDI-NormTag-sr 1.0
Bibliographic Citation: http://hdl.handle.net/11356/1096
Creator: Ljubešić, Nikola
Farkaš, Daša
Klubička, Filip
Erjavec, Tomaž
Miličević, Maja
Vuković, Teodora
Date (W3CDTF): 2017-04-04T09:10:17Z
Date Available: 2017-04-04T09:10:17Z
Description: ReLDI-NormTag-sr 1.0 is a manually annotated corpus of Serbian tweets. It is intended as a gold-standard training and testing dataset for tokenisation, sentence segmentation, word normalisation, morphosyntactic tagging and lemmatisation of non-standard Serbian. Each tweet also carries automatically assigned standardness levels (T = technical standardness, L = linguistic standardness). The corpus construction is (partially) described in: Miličević, Maja; Ljubešić, Nikola. Tviterasi, tviteraši or twitteraši? Producing and analysing a normalised dataset of Croatian and Serbian tweets. Slovenščina 2.0: empirical, applied and interdisciplinary research, 4/2, 2016. ISSN 2335-2736. http://dx.doi.org/10.4312/slo2.0.2016.2.156-188
Identifier (URI): http://hdl.handle.net/11356/1096
Is Replaced By (URI): http://hdl.handle.net/11356/1120
Language: Serbian
Language (ISO639): srp
Publisher: Jožef Stefan Institute
Rights: Creative Commons - Attribution 4.0 International (CC BY 4.0)
https://creativecommons.org/licenses/by/4.0/
Subject: computer-mediated communication
tokenisation
word normalisation
tagging
lemmatisation
manual annotation
TEI
Type: corpus
Type (DCMI): Text
Type (OLAC): primary_text

OLAC Info

Archive:  Slovenian language resource repository CLARIN.SI
Description:  http://www.language-archives.org/archive/clarin.si
GetRecord:  OAI-PMH request for OLAC format
GetRecord:  Pre-generated XML file

OAI Info

OaiIdentifier:  oai:www.clarin.si:11356/1096
DateStamp:  2018-10-18
GetRecord:  OAI-PMH request for simple DC format
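
The GetRecord entries above correspond to OAI-PMH requests, which are ordinary HTTP queries carrying a verb, an identifier and a metadataPrefix. The following is a minimal sketch of how such request URLs could be assembled for this record. The base URL is a placeholder, since the record does not state the endpoint, and the helper name get_record_url is hypothetical. The prefix "oai_dc" is the one OAI-PMH defines for simple Dublin Core; "olac" is assumed here as the prefix for the OLAC format.

```python
from urllib.parse import urlencode

# Placeholder endpoint (assumption): the record does not give the
# repository's actual OAI-PMH base URL.
BASE_URL = "https://example.org/oai"

# Identifier taken from the OaiIdentifier field of this record.
OAI_IDENTIFIER = "oai:www.clarin.si:11356/1096"


def get_record_url(identifier: str, metadata_prefix: str,
                   base_url: str = BASE_URL) -> str:
    """Build an OAI-PMH GetRecord request URL (hypothetical helper)."""
    query = urlencode({
        "verb": "GetRecord",                # OAI-PMH verb for a single record
        "identifier": identifier,           # percent-encoded by urlencode
        "metadataPrefix": metadata_prefix,  # requested metadata format
    })
    return f"{base_url}?{query}"


# One request per metadata format listed in this record.
olac_url = get_record_url(OAI_IDENTIFIER, "olac")
dc_url = get_record_url(OAI_IDENTIFIER, "oai_dc")

print(olac_url)
print(dc_url)
```

Replacing BASE_URL with the real endpoint of the archive or aggregator should be the only change needed; the response is an XML document wrapping the record.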

Search Info

Citation: Ljubešić, Nikola; Farkaš, Daša; Klubička, Filip; Erjavec, Tomaž; Miličević, Maja; Vuković, Teodora. 2017. Serbian Twitter training corpus ReLDI-NormTag-sr 1.0. Jožef Stefan Institute.
Terms: area_Europe country_RS dcmi_Text iso639_srp olac_primary_text


http://www.language-archives.org/item.php/oai:www.clarin.si:11356/1096
Up-to-date as of: Mon Jun 10 9:21:40 EDT 2019