OLAC Record
oai:catalogue.elra.info:ELRA-W0022

Metadata
Title:ILSP/ELEFTHEROTYPIA Corpus (Greek corpus)
Abstract:This corpus contains approximately 3 million words from the daily newspaper ELEFTHEROTYPIA, classified and annotated according to the common core PAROLE encoding standard. The format of the corpus is SGML files. A subset of the corpus (250,000 words) is morpho-syntactically tagged; all the words are also lemmatised and checked.
Access Rights:Rights available for: Research Use
Date Available (W3CDTF):2000-03-09
Date Issued (W3CDTF):2004-05-12
Date Modified (W3CDTF):2004-05-12
Description:Written Corpora
The ILSP/ELEFTHEROTYPIA Corpus contains approximately 3 million words classified and annotated according to the common core PAROLE encoding standard. Each file is classified according to the parameters of Medium, Topic and Genre, and structurally annotated at paragraph level (CES Level 1). The format of the corpus is SGML files. The source of the files is the Greek daily newspaper ELEFTHEROTYPIA. A subset of the corpus (250,000 words) is morpho-syntactically tagged; all the words are also lemmatised and checked. The morphosyntactic annotation of the corpus followed a four-step procedure: automatic morphosyntactic annotation, automatic disambiguation, manual disambiguation and checking, and conversion to the format required by PAROLE. In certain texts, some passages are written in "katharevoussa", an older form of Greek; these passages are marked as "distinct" and have not been morpho-syntactically annotated. The tagset used for the morphological annotation of the corpus is presented in the "Addendum to TA - Encoding features and values for the morphological layer in the lexicon Merged Tags" (P-WP1.1.-MEMO-ERLI-5). More information about the PAROLE project: http://www.elda.org/catalogue/fr/text/doc/parole.html
Identifier:ELRA-W0022
http://catalog.elra.info/product_info.php?products_id=763
Language:Modern Greek (1453-); Greek, Modern (1453-)
Language (ISO639):ell
Publisher:ELRA (European Language Resources Association)
Type (DCMI):Text
Type (OLAC):primary_text

OLAC Info

Archive:  ELRA Catalogue of Language Resources
Description:  http://www.language-archives.org/archive/catalogue.elra.info
GetRecord:  OAI-PMH request for OLAC format
GetRecord:  Pre-generated XML file

OAI Info

OaiIdentifier:  oai:catalogue.elra.info:ELRA-W0022
DateStamp:  2000-03-09
GetRecord:  OAI-PMH request for simple DC format
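
The GetRecord entries listed above correspond to OAI-PMH requests for this record in two metadata formats. As a minimal sketch, such a request URL could be assembled as follows. The base URL is a placeholder, since this record does not state the repository endpoint; the function name get_record_url is hypothetical, and the metadata prefixes "olac" and "oai_dc" are assumed to be the ones the repository advertises for the OLAC and simple DC formats.

```python
from urllib.parse import urlencode

# Placeholder endpoint (assumption): the record does not give the real base URL.
BASE_URL = "https://example.org/oai"

RECORD_ID = "oai:catalogue.elra.info:ELRA-W0022"


def get_record_url(identifier: str, metadata_prefix: str,
                   base_url: str = BASE_URL) -> str:
    """Build an OAI-PMH GetRecord request URL for one record and one format."""
    query = urlencode({
        "verb": "GetRecord",                # OAI-PMH verb for fetching a single record
        "identifier": identifier,           # the OaiIdentifier shown in this record
        "metadataPrefix": metadata_prefix,  # requested metadata format
    })
    return f"{base_url}?{query}"


# Assumed prefixes: "olac" for OLAC format, "oai_dc" for simple Dublin Core.
olac_url = get_record_url(RECORD_ID, "olac")
dc_url = get_record_url(RECORD_ID, "oai_dc")
print(olac_url)
print(dc_url)
```

Replacing BASE_URL with the repository's actual endpoint yields requests that can be sent with any HTTP client; the response is an XML document containing the record.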

Search Info

Citation: n.a. 2004. ELRA (European Language Resources Association).
Terms: area_Europe country_GR dcmi_Text iso639_ell olac_primary_text


http://www.language-archives.org/item.php/oai:catalogue.elra.info:ELRA-W0022
Up-to-date as of: Mon Oct 9 1:51:44 EDT 2017