OLAC Record
oai:catalogue.elra.info:ELRA-W0078

Metadata
Title:NE3L named entities Arabic corpus
Abstract:The Arabic corpus contains 103,363 words from articles published in 2004 in the "Le Monde Diplomatique" newspaper. Two named entity categories were annotated: Time and Amount.
Access Rights:Rights available for: Commercial Use, Research Use
Date Available (W3CDTF):2014-09-29
Date Created (W3CDTF):2014-08-01
Date Issued (W3CDTF):2014-09-29
Date Modified (W3CDTF):2014-09-29
Description:Written Corpora
The NE3L project (Named Entities 3 Languages) annotated corpora in several languages with named entities. The text data were extracted from newspapers and cover various topics. Three languages were annotated: Arabic, Chinese and Russian. The project considered five named entity categories: Person, Place, Organisation, Time and Amount. Each language was annotated with only a subset of these categories: Arabic and Russian were marked up with Time and Amount tags, whereas Chinese was marked up with Person, Place and Organisation tags. The Arabic corpus contains 103,363 words from articles published in 2004 in the "Le Monde Diplomatique" newspaper.
Identifier:ELRA-W0078
http://catalog.elra.info/product_info.php?products_id=1226
Language:Arabic
Language (ISO639):ara
Publisher:ELRA (European Language Resources Association)
Type (DCMI):Text
Type (OLAC):primary_text

OLAC Info

Archive:  ELRA Catalogue of Language Resources
Description:  http://www.language-archives.org/archive/catalogue.elra.info
GetRecord:  OAI-PMH request for OLAC format
GetRecord:  Pre-generated XML file

OAI Info

OaiIdentifier:  oai:catalogue.elra.info:ELRA-W0078
DateStamp:  2014-09-29
GetRecord:  OAI-PMH request for simple DC format

Search Info

Citation: n.a. 2014. ELRA (European Language Resources Association).
Terms: dcmi_Text iso639_ara olac_primary_text


http://www.language-archives.org/item.php/oai:catalogue.elra.info:ELRA-W0078
Up-to-date as of: Mon Oct 9 1:53:53 EDT 2017