OLAC Record
oai:catalogue.elra.info:ELRA-E0040

Metadata
Title:MEDAR Evaluation Package
Abstract:The MEDAR Evaluation Package was produced within the project MEDAR (MEDiterranean ARabic language and speech technology), supported by the European Commission's ICT programme. It aims to enable the evaluation of SLT/MT (Machine Translation) systems for translation tasks in the English-to-Arabic direction.
Access Rights:Rights available for: Evaluation Use
Date Available (W3CDTF):2012-03-28
Date Issued (W3CDTF):2012-03-28
Date Modified (W3CDTF):2012-03-29
Description:Written Corpora
The MEDAR Evaluation Package was produced within the project MEDAR (MEDiterranean ARabic language and speech technology), which was supported by the European Commission's ICT programme and ran from February 1st 2008 until July 31st 2010. The project addressed International Cooperation between the European Union and the Mediterranean region on Speech and Language Technologies (SLT) for Arabic.
This evaluation package aims to enable the evaluation of SLT/MT (Machine Translation) systems for translation tasks in the English-to-Arabic direction. The package consists of two SMT baseline systems and all necessary resources for the evaluation of machine translation for the English-to-Arabic direction. It reflects the outcome of the dry-run and evaluation campaign carried out in February and July 2010. It contains the training and test data, the reference translations, documentation and tools for scoring system output.
Tools are split into four categories:
- An alignment package including Hunalign, Champollion Tool Kit (CTK) and formatting scripts
- Evaluation metrics to evaluate MT output against reference translations: BLEU/NIST and WER
- Formatting scripts to convert XML files to raw text while keeping tag information, so that the conversion can be reversed
- Two MT baseline systems that use MOSES (see http://www.statmt.org/moses for more information about MOSES)
The data comprise:
- the results of the morphosyntactic disambiguation and sentence and word alignment on the English-Arabic parallel corpus of the dry-run
- the source corpora and their reference translations used for the MEDAR dry-run and evaluation campaign
- the monolingual and parallel training data to train MT systems
- the judge assessments of the dry-run and campaign human evaluations
The full package is stored on 1 DVD.
Identifier:ELRA-E0040
http://catalog.elra.info/product_info.php?products_id=1166
Language:English
Arabic
Language (ISO639):eng
ara
Publisher:ELRA (European Language Resources Association)
Type (DCMI):Text
Type (OLAC):primary_text

OLAC Info

Archive:  ELRA Catalogue of Language Resources
Description:  http://www.language-archives.org/archive/catalogue.elra.info
GetRecord:  OAI-PMH request for OLAC format
GetRecord:  Pre-generated XML file

OAI Info

OaiIdentifier:  oai:catalogue.elra.info:ELRA-E0040
DateStamp:  2012-03-28
GetRecord:  OAI-PMH request for simple DC format

Search Info

Citation: n.a. 2012. ELRA (European Language Resources Association).
Terms: area_Europe country_GB dcmi_Text iso639_ara iso639_eng olac_primary_text


http://www.language-archives.org/item.php/oai:catalogue.elra.info:ELRA-E0040
Up-to-date as of: Mon Feb 27 0:31:42 EST 2017