OLAC Record
oai:catalogue.elra.info:ELRA-W0054

Metadata
Title:Persian 1984 corpus (Multext-East framework)
Abstract:This corpus contains the Persian (Farsi) translation of part of the novel "1984" (G. Orwell), annotated in the Multext-East framework (Multilingual Text Tools and Corpora for Eastern and Central European Languages). The corpus contains approximately 100,000 words (6,604 sentences, 13,247 lemmas), with extensive headers and markup for document structure, sentences, and various sub-sentence annotations in XML format following the TEI guidelines. Annotation includes POS (part-of-speech) tags and lemmas.
Access Rights:Rights available for: Research Use, Commercial Use
Date Available (W3CDTF):2010-09-27
Date Issued (W3CDTF):2010-09-27
Date Modified (W3CDTF):2010-09-27
Description:Written Corpora
This corpus contains the Persian (Farsi) translation of part of the novel "1984" (G. Orwell), annotated in the Multext-East framework (Multilingual Text Tools and Corpora for Eastern and Central European Languages). The aim of the Multext-East project was to develop standardized language resources. The package comprises: (i) the specifications for morphosyntactic encoding of the Persian language, based on the EAGLES/MULTEXT model and specific resources of MULTEXT-East, and (ii) the annotated Persian version of Orwell's "1984". The corpus contains extensive headers and markup for document structure, sentences, and various sub-sentence annotations in XML format following the TEI guidelines. Annotation includes POS (part-of-speech) tags and lemmas. The corpus contains approximately 100,000 words (6,604 sentences, 13,247 lemmas) and can easily be aligned with other corpora in the MULTEXT-East framework.
Identifier:ELRA-W0054
http://catalog.elra.info/product_info.php?products_id=1124
Language:Persian
Language (ISO639):fas
Publisher:ELRA (European Language Resources Association)
Type (DCMI):Text
Type (OLAC):primary_text

OLAC Info

Archive:  ELRA Catalogue of Language Resources
Description:  http://www.language-archives.org/archive/catalogue.elra.info
GetRecord:  OAI-PMH request for OLAC format
GetRecord:  Pre-generated XML file

OAI Info

OaiIdentifier:  oai:catalogue.elra.info:ELRA-W0054
DateStamp:  2010-09-27
GetRecord:  OAI-PMH request for simple DC format

Search Info

Citation: n.a. 2010. ELRA (European Language Resources Association).
Terms: dcmi_Text iso639_fas olac_primary_text


http://www.language-archives.org/item.php/oai:catalogue.elra.info:ELRA-W0054
Up-to-date as of: Mon Oct 9 1:53:10 EDT 2017