OLAC Record
oai:www.ldc.upenn.edu:LDC2017L01

Metadata
Title:Arabic Speech Recognition Pronunciation Dictionary
Access Rights:Licensing Instructions for Subscription & Standard Members, and Non-Members: http://www.ldc.upenn.edu/language-resources/data/obtaining
Bibliographic Citation:Ali, Ahmed. Arabic Speech Recognition Pronunciation Dictionary LDC2017L01. Web Download. Philadelphia: Linguistic Data Consortium, 2017
Contributor:Ali, Ahmed
Date (W3CDTF):2017
Date Issued (W3CDTF):2017-01-19
Description:*Introduction* Arabic Speech Recognition Pronunciation Dictionary was developed by the Qatar Computing Research Institute. It contains approximately two million pronunciation entries for 526,000 Modern Standard Arabic words, an average of 3.84 pronunciations per grapheme word. *Data* The dictionary was built from news archive resources, including the Arabic news website Aljazeera.net. Only words that occurred more than once in the news collection were selected. The text was processed using MADA (Morphological Analysis and Disambiguation for Arabic). The dictionary is distributed as a single UTF-8 plain text file. *Samples* Please view this sample. *Updates* None at this time.
Extent:Corpus size: 53272 KB
Identifier:LDC2017L01
https://catalog.ldc.upenn.edu/LDC2017L01
ISBN: 1-58563-783-1
ISLRN: 445-866-322-325-6
DOI: 10.35111/9abp-k222
Language:Arabic
Standard Arabic
Language (ISO639):ara
arb
License:LDC User Agreement for Non-Members: https://catalog.ldc.upenn.edu/license/ldc-non-members-agreement.pdf
Medium:Distribution: Web Download
Publisher:Linguistic Data Consortium
Publisher (URI):https://www.ldc.upenn.edu
Relation (URI):https://catalog.ldc.upenn.edu/docs/LDC2017L01
Rights Holder:Portions © 2017 Qatar Computing Research Institute, © 2017 Trustees of the University of Pennsylvania
Subject:Arabic language
Standard Arabic language
Subject (ISO639):ara
arb
Type (DCMI):Text
Type (OLAC):lexicon

OLAC Info

Archive:  The LDC Corpus Catalog
Description:  http://www.language-archives.org/archive/www.ldc.upenn.edu
GetRecord:  OAI-PMH request for OLAC format
GetRecord:  Pre-generated XML file

OAI Info

OaiIdentifier:  oai:www.ldc.upenn.edu:LDC2017L01
DateStamp:  2020-11-30
GetRecord:  OAI-PMH request for simple DC format
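
The two GetRecord entries in this record correspond to standard OAI-PMH requests that differ only in their metadataPrefix. The following is a minimal sketch of how such request URLs are typically formed. The `verb`, `identifier` and `metadataPrefix` parameters come from the OAI-PMH specification; the base URL is a placeholder, since the archive's actual endpoint is not given in this record, and `build_get_record_url` is a hypothetical helper name.

```python
from urllib.parse import urlencode

# Placeholder endpoint (assumption): the real base URL is not stated in this record.
BASE_URL = "https://example.org/oai"

# Identifier taken from the OaiIdentifier field of this record.
IDENTIFIER = "oai:www.ldc.upenn.edu:LDC2017L01"


def build_get_record_url(base_url: str, identifier: str, metadata_prefix: str) -> str:
    """Build an OAI-PMH GetRecord request URL (hypothetical helper)."""
    query = urlencode(
        {
            "verb": "GetRecord",
            "identifier": identifier,
            "metadataPrefix": metadata_prefix,
        }
    )
    return f"{base_url}?{query}"


# "olac" requests the OLAC format; "oai_dc" requests simple Dublin Core.
olac_url = build_get_record_url(BASE_URL, IDENTIFIER, "olac")
dc_url = build_get_record_url(BASE_URL, IDENTIFIER, "oai_dc")

print(olac_url)
print(dc_url)
```

Percent-encoding the identifier is done by `urlencode`, so the colons in the OAI identifier are safe to pass in the query string.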

Search Info

Citation: Ali, Ahmed. 2017. Linguistic Data Consortium.
Terms: area_Asia country_SA dcmi_Text iso639_ara iso639_arb olac_lexicon

Inferred Metadata

Country: Saudi Arabia
Area: Asia


http://www.language-archives.org/item.php/oai:www.ldc.upenn.edu:LDC2017L01
Up-to-date as of: Mon Mar 25 7:20:52 EDT 2024