OLAC Record
oai:www.ldc.upenn.edu:LDC2018T13

Metadata
Title:TRAD Arabic-French Parallel Text -- Newsgroup
Access Rights:Licensing Instructions for Subscription & Standard Members, and Non-Members: http://www.ldc.upenn.edu/language-resources/data/obtaining
Bibliographic Citation:Linguistic Data Consortium, and ELDA. TRAD Arabic-French Parallel Text -- Newsgroup LDC2018T13. Web Download. Philadelphia: Linguistic Data Consortium, 2018
Contributor:Linguistic Data Consortium
ELDA
Date (W3CDTF):2018
Date Issued (W3CDTF):2018-04-16
Description:*Introduction*
TRAD Arabic-French Parallel Text -- Newsgroup was developed by ELDA as part of the PEA-TRAD project. It contains French translations of a subset of approximately 10,000 Arabic words from GALE Phase 1 Arabic Newsgroup Parallel Text - Part 1 (LDC2009T03).
The PEA-TRAD project (Translation as a Support for Document Analysis) was supported by the French Ministry of Defense (DGA). Its purpose was to develop speech-to-speech translation technology for multiple languages (e.g., Arabic, Chinese, Pashto) from a variety of domains. ELDA developed several corpora for this effort.
The Linguistic Data Consortium (LDC) has also released the following TRAD corpora:
* TRAD Chinese-French Parallel Text -- Blog (LDC2018T02)
* TRAD Chinese-French Parallel Text -- Broadcast News (LDC2018T17)
* TRAD Arabic-French Parallel Text -- Newswire (LDC2018T21)
*Data*
This release consists of 398 segments (translation units) from 17 documents. The source data is Arabic newsgroup text collected and translated into English by the Linguistic Data Consortium for the DARPA GALE (Global Autonomous Language Exploitation) program. Information about the ELDA translation team, translation guidelines and validation results is contained in the documentation accompanying this release.
The Arabic source file contains 10,706 words and the French reference translation contains 15,843 words. The data is presented in two Unicode-encoded XML files along with an associated DTD.
*Samples*
Please view this source sample and reference sample.
*Updates*
None at this time.
Extent:Corpus size: 1136 KB
Identifier:LDC2018T13
https://catalog.ldc.upenn.edu/LDC2018T13
ISBN: 1-58563-841-2
ISLRN: 582-339-053-329-9
Language:Arabic
Standard Arabic
French
Language (ISO639):ara
arb
fra
License:TRAD Arabic-French Parallel Text – Newsgroup Agreement (For-Profit): https://catalog.ldc.upenn.edu/license/trad-arabic-french-parallel-text-newsgroup-agreement-for-profit.pdf
TRAD Arabic-French Parallel Text – Newsgroup Agreement (Not-For-Profit): https://catalog.ldc.upenn.edu/license/trad-arabic-french-parallel-text-newsgroup-agreement-not-for-profit.pdf
TRAD Arabic-French Parallel Text – Newsgroup Agreement (Non-Member): https://catalog.ldc.upenn.edu/license/trad-arabic-french-parallel-text-newsgroup-agreement-non-member.pdf
Medium:Distribution: Web Download
Publisher:Linguistic Data Consortium
Publisher (URI):https://www.ldc.upenn.edu
Relation (URI):https://catalog.ldc.upenn.edu/docs/LDC2018T13
Rights Holder:Portions © 2018 ELDA, © 2005-2007, 2009, 2018 Trustees of the University of Pennsylvania
Type (DCMI):Text
Type (OLAC):primary_text

OLAC Info

Archive:  The LDC Corpus Catalog
Description:  http://www.language-archives.org/archive/www.ldc.upenn.edu
GetRecord:  OAI-PMH request for OLAC format
GetRecord:  Pre-generated XML file

OAI Info

OaiIdentifier:  oai:www.ldc.upenn.edu:LDC2018T13
DateStamp:  2019-01-03
GetRecord:  OAI-PMH request for simple DC format

Search Info

Citation: Linguistic Data Consortium; ELDA. 2018. Linguistic Data Consortium.
Terms: area_Asia area_Europe country_FR country_SA dcmi_Text iso639_ara iso639_arb iso639_fra olac_primary_text


http://www.language-archives.org/item.php/oai:www.ldc.upenn.edu:LDC2018T13
Up-to-date as of: Sun Sep 1 18:19:42 EDT 2019