OLAC Record
oai:catalogue.elra.info:ELRA-W0074

Metadata
Title:Amharic-English bilingual corpus
Abstract:The Amharic-English bilingual corpus contains parallel text from the legal and news domains in Amharic script, in transliterated form, and in English. The corpus comprises 232,653 words in Amharic and 291,701 words in English.
Access Rights:Rights available for: Commercial Use, Research Use
Date Available (W3CDTF):2013-12-17
Date Issued (W3CDTF):2013-12-17
Date Modified (W3CDTF):2013-12-17
Description:Written Corpora
The Amharic-English bilingual corpus contains parallel text from the legal and news domains in Amharic script, in transliterated form, and in English. The corpus comprises 232,653 words in Amharic and 291,701 words in English. This parallel corpus contains documents from two domains, legal and news, in English and Amharic. The two domains are processed separately. The Amharic documents were prepared in the language's own script, which differs from the Latin alphabet. For ease of use and processing, and for normalization purposes, the Amharic documents are transliterated and the English documents are converted to lower case. In addition, clean documents were prepared without treating the two domains separately. Amharic is a Semitic language spoken in Ethiopia.
Identifier:ELRA-W0074
http://catalog.elra.info/product_info.php?products_id=1215
Language:Amharic
English
Language (ISO639):amh
eng
Publisher:ELRA (European Language Resources Association)
Type (DCMI):Text
Type (OLAC):primary_text

OLAC Info

Archive:  ELRA Catalogue of Language Resources
Description:  http://www.language-archives.org/archive/catalogue.elra.info
GetRecord:  OAI-PMH request for OLAC format
GetRecord:  Pre-generated XML file

OAI Info

OaiIdentifier:  oai:catalogue.elra.info:ELRA-W0074
DateStamp:  2013-12-17
GetRecord:  OAI-PMH request for simple DC format
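
The GetRecord entries above refer to OAI-PMH requests. As a minimal sketch of how such a request URL is typically formed, the Python snippet below builds one for this record's OaiIdentifier without contacting any server. The base URL is a hypothetical placeholder, since this record does not state the repository endpoint. The prefix "oai_dc" is the simple Dublin Core prefix defined by OAI-PMH; "olac" is assumed here to be the prefix for the OLAC format.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: the record does not give the repository's base URL.
BASE_URL = "https://example.org/oai"

# Identifier taken from the OaiIdentifier field of this record.
OAI_IDENTIFIER = "oai:catalogue.elra.info:ELRA-W0074"


def get_record_url(identifier, metadata_prefix, base_url=BASE_URL):
    """Build an OAI-PMH GetRecord request URL (no network access)."""
    query = urlencode({
        "verb": "GetRecord",
        "identifier": identifier,
        "metadataPrefix": metadata_prefix,
    })
    return f"{base_url}?{query}"


# Simple Dublin Core format (prefix defined by the OAI-PMH specification).
dc_url = get_record_url(OAI_IDENTIFIER, "oai_dc")

# OLAC format (prefix "olac" is an assumption, not stated in this record).
olac_url = get_record_url(OAI_IDENTIFIER, "olac")

print(dc_url)
print(olac_url)
```

Replacing BASE_URL with the actual repository endpoint would turn either URL into a request that returns the record as XML.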

Search Info

Citation: n.a. 2013. ELRA (European Language Resources Association).
Terms: area_Africa area_Europe country_ET country_GB dcmi_Text iso639_amh iso639_eng olac_primary_text


http://www.language-archives.org/item.php/oai:catalogue.elra.info:ELRA-W0074
Up-to-date as of: Sun Nov 12 1:45:34 EST 2017