OLAC Record
oai:www.clarin.si:11356/1043

Metadata
Title:MULTEXT-East "1984" annotated corpus 4.0
Bibliographic Citation:http://hdl.handle.net/11356/1043
Creator:Erjavec, Tomaž
Barbu, Ana-Maria
Derzhanski, Ivan
Dimitrova, Ludmila
Garabík, Radovan
Ide, Nancy
Kaalep, Heiki-Jaan
Kotsyba, Natalia
Krstev, Cvetana
Oravecz, Csaba
Petkevič, Vladimír
Priest-Dorman, Greg
QasemiZadeh, Behrang
Radziszewski, Adam
Simov, Kiril
Tufiş, Dan
Zdravkova, Katerina
Date (W3CDTF):2015-06-15T08:51:55Z
Date Available:2015-06-15T08:51:55Z
Description:The novel "1984" by George Orwell is the central component of the MULTEXT-East corpus. This parallel, sentence-aligned corpus contains the novel in the English original (about 100,000 words in length) and its translations into a number of languages. This version of the corpus contains the linguistically annotated texts, with each word tagged with its lemma and its MULTEXT(-East) morphosyntactic description (MSD, i.e., a fine-grained, feature-structure-based PoS tag). The structurally annotated texts are a separate submission (http://hdl.handle.net/11356/1044), which covers a somewhat different set of languages.
Identifier (URI):http://hdl.handle.net/11356/1043
Language:Bulgarian
Czech
English
Estonian
Persian
Hungarian
Macedonian
Polish
Romanian
Slovak
Slovenian
Serbian
Language (ISO639):bul
ces
eng
est
fas
hun
mkd
pol
ron
slk
slv
srp
Publisher:Jožef Stefan Institute
Replaces (URI):http://hdl.handle.net/11372/LRT-675
Rights:Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0)
https://creativecommons.org/licenses/by-nc-sa/4.0/
Subject:parallel corpus
tagging
multilingual
Slavic languages
manual annotation
TEI
Type:corpus
Type (DCMI):Text
Type (OLAC):primary_text

OLAC Info

Archive:  Slovenian language resource repository CLARIN.SI
Description:  http://www.language-archives.org/archive/clarin.si
GetRecord:  OAI-PMH request for OLAC format
GetRecord:  Pre-generated XML file

OAI Info

OaiIdentifier:  oai:www.clarin.si:11356/1043
DateStamp:  2017-09-29
GetRecord:  OAI-PMH request for simple DC format
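
The GetRecord entries above refer to OAI-PMH requests for this record in two metadata formats. As a minimal sketch of how such request URLs are typically formed: the base URL below is a placeholder (the record does not state the repository's OAI-PMH endpoint), the helper name get_record_url is hypothetical, and "olac" is assumed to be the metadata prefix the repository uses for OLAC format. "oai_dc" is the prefix OAI-PMH defines for simple Dublin Core.

```python
from urllib.parse import urlencode

# Placeholder endpoint: the record does not give the repository's OAI-PMH base URL.
BASE_URL = "https://example.org/oai/request"

RECORD_ID = "oai:www.clarin.si:11356/1043"  # OaiIdentifier from this record


def get_record_url(identifier: str, metadata_prefix: str,
                   base_url: str = BASE_URL) -> str:
    """Build an OAI-PMH GetRecord request URL (hypothetical helper)."""
    query = urlencode({
        "verb": "GetRecord",                # OAI-PMH verb for fetching one record
        "identifier": identifier,           # the record's OAI identifier
        "metadataPrefix": metadata_prefix,  # requested metadata format
    })
    return f"{base_url}?{query}"


# "olac" is an assumed prefix for OLAC format; "oai_dc" is simple Dublin Core.
olac_url = get_record_url(RECORD_ID, "olac")
dc_url = get_record_url(RECORD_ID, "oai_dc")

print(olac_url)
print(dc_url)
```

The identifier is percent-encoded by urlencode, so the colons and the slash in the OAI identifier do not interfere with the query string.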

Search Info

Citation: Erjavec, Tomaž; Barbu, Ana-Maria; Derzhanski, Ivan; Dimitrova, Ludmila; Garabík, Radovan; Ide, Nancy; Kaalep, Heiki-Jaan; Kotsyba, Natalia; Krstev, Cvetana; Oravecz, Csaba; Petkevič, Vladimír; Priest-Dorman, Greg; QasemiZadeh, Behrang; Radziszewski, Adam; Simov, Kiril; Tufiş, Dan; Zdravkova, Katerina. 2015. MULTEXT-East "1984" annotated corpus 4.0. Jožef Stefan Institute.
Terms: area_Europe country_BG country_CZ country_GB country_HU country_MK country_PL country_RO country_RS country_SI country_SK dcmi_Text iso639_bul iso639_ces iso639_eng iso639_est iso639_fas iso639_hun iso639_mkd iso639_pol iso639_ron iso639_slk iso639_slv iso639_srp olac_primary_text


http://www.language-archives.org/item.php/oai:www.clarin.si:11356/1043
Up-to-date as of: Tue Aug 20 10:26:53 EDT 2019