OLAC Record
oai:catalogue.elra.info:ELRA-W0058

Metadata
Title:  PANACEA English-French and English-Greek parallel corpus acquired for Labour Legislation domain
Abstract:  This package consists of an English-French and English-Greek sentence-aligned parallel corpus from the Labour Legislation domain automatically acquired from the web during 2010 and 2011. It was acquired in the framework of the PANACEA project. Data and language pairs are split into training, test and development test sets.
Access Rights:  Rights available for: Research Use
Date Available (W3CDTF):  2013-01-30
Date Issued (W3CDTF):  2012-10-31
Date Modified (W3CDTF):  2013-01-30
Description:  Written Corpora
The PANACEA English-French and English-Greek parallel corpus was acquired in the framework of the PANACEA project (Platform for Automatic, Normalized Annotation and Cost-Effective Acquisition of Language Resources for Human Language Technologies), under the European Commission's Seventh Framework Programme. This package consists of an English-French and English-Greek sentence-aligned parallel corpus from the Labour Legislation domain automatically acquired from the web during 2010 and 2011. Data and language pairs are split into training, test and development test sets as follows:
filename            sentences  tokens  vocabulary
lab.en-el.dev.el          506   16089        3719
lab.en-el.dev.en          506   15129        2705
lab.en-el.test.el        2000   66770        8014
lab.en-el.test.en        2000   62953        5145
lab.en-el.train.el       7064  244396       17250
lab.en-el.train.en       7064  233145       10249
lab.en-fr.dev.en         1411   52156        5775
lab.en-fr.dev.fr         1411   61191        6429
lab.en-fr.test.en        2000   71688        6984
lab.en-fr.test.fr        2000   84399        7833
lab.en-fr.train.en      20261  709943       19925
lab.en-fr.train.fr      20261  836684       22349
All corpus files are provided as plain text in UTF-8 character encoding, one sentence per line, with line numbers identifying parallel sentences.
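As an illustration of the one-sentence-per-line layout, the following minimal Python sketch pairs the two sides of a language pair. It assumes that "line numbers identifying parallel sentences" means that line n of one file corresponds to line n of the other; if the files instead carry an explicit number prefix on each line, that prefix would need to be stripped first. The function name read_parallel is hypothetical and not part of the package.

```python
from itertools import zip_longest
from typing import Iterator, Tuple


def read_parallel(src_path: str, tgt_path: str) -> Iterator[Tuple[str, str]]:
    """Yield (source, target) sentence pairs from two line-aligned UTF-8 files.

    Assumption: sentences are aligned by line position.
    """
    with open(src_path, encoding="utf-8") as src, \
         open(tgt_path, encoding="utf-8") as tgt:
        # zip_longest pads the shorter file with None, so a length
        # mismatch is detected instead of being silently truncated.
        for n, (s, t) in enumerate(zip_longest(src, tgt), start=1):
            if s is None or t is None:
                raise ValueError(f"files differ in length at line {n}")
            yield s.rstrip("\n"), t.rstrip("\n")
```

For example, list(read_parallel("lab.en-fr.dev.en", "lab.en-fr.dev.fr")) would be expected to return one pair per line of the development set, assuming both files are in the working directory.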
Identifier:  ELRA-W0058
http://catalog.elra.info/product_info.php?products_id=1183
Language:  English
Modern Greek (1453-); Greek, Modern (1453-)
Language (ISO639):  eng
ell
Publisher:  ELRA (European Language Resources Association)
Type (DCMI):  Text
Type (OLAC):  primary_text

OLAC Info

Archive:  ELRA Catalogue of Language Resources
Description:  http://www.language-archives.org/archive/catalogue.elra.info
GetRecord:  OAI-PMH request for OLAC format
GetRecord:  Pre-generated XML file

OAI Info

OaiIdentifier:  oai:catalogue.elra.info:ELRA-W0058
DateStamp:  2013-01-30
GetRecord:  OAI-PMH request for simple DC format
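
A GetRecord request of the kind listed above can be sketched in Python as follows. The verb, identifier and metadataPrefix parameters come from the OAI-PMH protocol, and oai_dc is the protocol's prefix for simple Dublin Core. The base URL below is a placeholder, not the archive's real endpoint, and the function name get_record_url is hypothetical; the prefix "olac" for the OLAC format is likewise an assumption about how the repository names it.

```python
from urllib.parse import urlencode


def get_record_url(base_url: str, identifier: str,
                   metadata_prefix: str = "oai_dc") -> str:
    """Build an OAI-PMH GetRecord request URL for one record."""
    query = urlencode({
        "verb": "GetRecord",
        "identifier": identifier,
        "metadataPrefix": metadata_prefix,
    })
    return f"{base_url}?{query}"


# Placeholder base URL: replace with the repository's actual OAI-PMH endpoint.
url = get_record_url("https://example.org/oai",
                     "oai:catalogue.elra.info:ELRA-W0058")
print(url)
```

Passing metadata_prefix="olac" would request the OLAC format instead, provided the repository exposes it under that name.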

Search Info

Citation: n.a. 2012. ELRA (European Language Resources Association).
Terms: area_Europe country_GB country_GR dcmi_Text iso639_ell iso639_eng olac_primary_text


http://www.language-archives.org/item.php/oai:catalogue.elra.info:ELRA-W0058
Up-to-date as of: Fri Jun 23 1:06:23 EDT 2017