OLAC Record

Title:SoNaR New Media Corpus
Bibliographic Citation:http://hdl.handle.net/11372/LRT-1502
Creator:Radboud University, CLST
Tilburg University, ILK
University of Twente, HMI
University College Ghent, Faculty of Translation Studies
KU Leuven, CCL
Utrecht University, UiL OTS
Date (W3CDTF):2015-06-29T13:24:41Z
Date Available:2015-06-29T13:24:41Z
Description:The SoNaR New Media Corpus contains approximately 35 million words and consists of tweets, chats and SMS messages (the SoNaR text categories WR-P-E-L_tweets, WR-U-E-A_chats and WR-U-E-D_sms). All texts have been automatically tokenized, tagged for part of speech and lemmatized.
Identifier (URI):http://hdl.handle.net/11372/LRT-1502
Language (ISO639):nld
Publisher:Dutch-Flemish HLT Agency
Subject:monolingual corpus
annotated corpus
new media
Type (DCMI):Text
Type (OLAC):primary_text


Archive:  LINDAT/CLARIN digital library at the Institute of Formal and Applied Linguistics (ÚFAL), Faculty of Mathematics and Physics, Charles University
Description:  http://www.language-archives.org/archive/lindat.mff.cuni.cz
GetRecord:  OAI-PMH request for OLAC format
GetRecord:  Pre-generated XML file

OAI Info

OaiIdentifier:  oai:lindat.mff.cuni.cz:11372/LRT-1502
DateStamp:  2016-04-06
GetRecord:  OAI-PMH request for simple DC format
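
The two GetRecord requests listed in this record (OLAC format and simple DC format) can be expressed as OAI-PMH query URLs. The sketch below is illustrative only: the record does not state the repository's OAI-PMH endpoint, so BASE_URL is a hypothetical placeholder, and get_record_url is a helper name introduced here. The query parameters (verb, identifier, metadataPrefix) are the standard OAI-PMH ones, with "olac" assumed as the prefix for the OLAC format and "oai_dc" for simple Dublin Core.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: the record does not give the real base URL.
BASE_URL = "https://example.org/oai/request"

# Identifier taken from the OaiIdentifier field of this record.
OAI_IDENTIFIER = "oai:lindat.mff.cuni.cz:11372/LRT-1502"


def get_record_url(identifier: str, metadata_prefix: str,
                   base_url: str = BASE_URL) -> str:
    """Build an OAI-PMH GetRecord request URL for one record."""
    query = urlencode({
        "verb": "GetRecord",
        "identifier": identifier,
        "metadataPrefix": metadata_prefix,
    })
    return f"{base_url}?{query}"


# One request per metadata format named in the record.
olac_url = get_record_url(OAI_IDENTIFIER, "olac")
dc_url = get_record_url(OAI_IDENTIFIER, "oai_dc")

print(olac_url)
print(dc_url)
```

Replacing BASE_URL with the archive's actual OAI-PMH endpoint should yield requests equivalent to the two GetRecord links above.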

Search Info

Citation: Radboud University, CLST; Tilburg University, ILK; University of Twente, HMI; University College Ghent, Faculty of Translation Studies; KU Leuven, CCL; Utrecht University, UiL OTS. 2015. SoNaR New Media Corpus. Dutch-Flemish HLT Agency.
Terms: area_Europe country_NL dcmi_Text iso639_nld olac_primary_text

Up-to-date as of: Sun Jul 22 0:24:53 EDT 2018