Resources in and about the Standard Arabic language


ISO 639-3: arb

The combined catalog of all OLAC participants contains the following resources that are relevant to this language:

Other known names and dialect names: Al-'Arabiyya, Al-Fusha, Arabic, Standard, Classical Arabic, Koranic Arabic, Literary Arabic, Modern Literary Arabic, Modern Standard Arabic, Quranic Arabic

Primary texts

  1. Arabic. Laycock, Don (recorder). 1960. Pacific And Regional Archive for Digital Sources in Endangered Cultures (PARADISEC). oai:paradisec.org.au:DL1-038
  2. Arabic, Standard Genesis Translation. The Long Now Foundation. 1992. The Rosetta Project: A Long Now Foundation Library of Human Language. oai:rosettaproject.org:rosettaproject_arb_gen-1
  3. Arabic Newswire Part 1. David Graff and Kevin Walker. 2001. The LDC Corpus Catalog. oai:www.ldc.upenn.edu:LDC2001T55
  4. Arabic Treebank: Part 1 v 2.0. Mohamed Maamouri, Ann Bies, Hubert Jin, and Tim Buckwalter. 2003. The LDC Corpus Catalog. oai:www.ldc.upenn.edu:LDC2003T06
  5. Arabic Treebank: Part 1 - 10K-word English Translation. Moussa Bamba, Mohamed Maamouri, Hubert Jin, Ann Bies, and Xiaoyi Ma. 2003. The LDC Corpus Catalog. oai:www.ldc.upenn.edu:LDC2003T07
  6. Arabic Gigaword. David Graff. 2003. The LDC Corpus Catalog. oai:www.ldc.upenn.edu:LDC2003T12
  7. Multiple-Translation Arabic (MTA) Part 1. Kevin Walker, Moussa Bamba, David Miller, Xiaoyi Ma, Chris Cieri, and George Doddington. 2003. The LDC Corpus Catalog. oai:www.ldc.upenn.edu:LDC2003T18
  8. Arabic Treebank: Part 2 v 2.0. Mohamed Maamouri, Ann Bies, Tim Buckwalter, and Hubert Jin. 2004. The LDC Corpus Catalog. oai:www.ldc.upenn.edu:LDC2004T02
  9. TIDES Extraction (ACE) 2003 Multilingual Training Data. Alexis Mitchell, Stephanie Strassel, Mark Przybocki, JK Davis, George Doddington, Ralph Grishman, Adam Meyers, Ada Brunstein, Lisa Ferro, and Beth Sundheim. 2004. The LDC Corpus Catalog. oai:www.ldc.upenn.edu:LDC2004T09
  10. Arabic Treebank: Part 3 v 1.0. Mohamed Maamouri, Ann Bies, Tim Buckwalter, and Hubert Jin. 2004. The LDC Corpus Catalog. oai:www.ldc.upenn.edu:LDC2004T11
  11. Arabic News Translation Text Part 1. Xiaoyi Ma, Dalal Zakhary, and Moussa Bamba. 2004. The LDC Corpus Catalog. oai:www.ldc.upenn.edu:LDC2004T17
  12. Arabic English Parallel News Part 1. Several. 2004. The LDC Corpus Catalog. oai:www.ldc.upenn.edu:LDC2004T18
  13. Prague Arabic Dependency Treebank 1.0. Jan Hajic, Otakar Smrz, Petr Zemanek, Petr Pajas, Jan Snaidauf, Emanuel Beska, Jakub Kracmar, and Kamila Hassanova. 2004. The LDC Corpus Catalog. oai:www.ldc.upenn.edu:LDC2004T23
  14. TDT4 Multilingual Broadcast News Speech Corpus. Junbo Kong and David Graff. 2005. The LDC Corpus Catalog. oai:www.ldc.upenn.edu:LDC2005S11
  15. Arabic Treebank: Part 1 v 3.0 (POS with full vocalization + syntactic analysis). Mohamed Maamouri (project head), Ann Bies, Tim Buckwalter, and Hubert Jin. 2005. The LDC Corpus Catalog. oai:www.ldc.upenn.edu:LDC2005T02
  16. Multiple-Translation Arabic (MTA) Part 2. Xiaoyi Ma. 2005. The LDC Corpus Catalog. oai:www.ldc.upenn.edu:LDC2005T05
  17. ACE 2004 Multilingual Training Corpus. Alexis Mitchell, Stephanie Strassel, Shudong Huang, and Ramez Zakhary. 2005. The LDC Corpus Catalog. oai:www.ldc.upenn.edu:LDC2005T09
  18. TDT4 Multilingual Text and Annotations. Stephanie Strassel, Junbo Kong, and David Graff. 2005. The LDC Corpus Catalog. oai:www.ldc.upenn.edu:LDC2005T16
  19. Arabic Treebank: Part 3 (full corpus) v 2.0 (MPG + Syntactic Analysis). Mohamed Maamouri (project head), Ann Bies, Tim Buckwalter, Hubert Jin, and Wigdan Mekki. 2005. The LDC Corpus Catalog. oai:www.ldc.upenn.edu:LDC2005T20
  20. Arabic Treebank: Part 4 v 1.0 (MPG Annotation). Mohamed Maamouri (Project head), Ann Bies, Tim Buckwalter, Hubert Jin, and Wigdan Mekki. 2005. The LDC Corpus Catalog. oai:www.ldc.upenn.edu:LDC2005T30
  21. Arabic Broadcast News Speech. Mohamed Maamouri, David Graff, Christopher Cieri. 2006. The LDC Corpus Catalog. oai:www.ldc.upenn.edu:LDC2006S46
  22. Arabic Gigaword Second Edition. David Graff, Ke Chen, Junbo Kong, and Kazuaki Maeda. 2006. The LDC Corpus Catalog. oai:www.ldc.upenn.edu:LDC2006T02
  23. ACE 2005 Multilingual Training Corpus. Christopher Walker, Stephanie Strassel, Julie Medero, and Kazuaki Maeda. 2006. The LDC Corpus Catalog. oai:www.ldc.upenn.edu:LDC2006T06
  24. TDT5 Multilingual Text. David Graff, Junbo Kong, Kazuaki Maeda, Stephanie Strassel. 2006. The LDC Corpus Catalog. oai:www.ldc.upenn.edu:LDC2006T18
  25. TDT5 Topics and Annotations. Meghan Glenn, Stephanie Strassel, Junbo Kong, Kazuaki Maeda. 2006. The LDC Corpus Catalog. oai:www.ldc.upenn.edu:LDC2006T19
  26. Arabic Broadcast News Transcripts. Mohamed Maamouri, David Graff, Christopher Cieri. 2006. The LDC Corpus Catalog. oai:www.ldc.upenn.edu:LDC2006T20
  27. 2003 NIST Rich Transcription Evaluation Data. Jonathan Fiscus, George Doddington, Audrey Le, Greg Sanders, Mark Przybocki, David Pallett. 2007. The LDC Corpus Catalog. oai:www.ldc.upenn.edu:LDC2007S10
  28. GALE Phase 1 Distillation Training. Olga Babko-Malaya, Zhiyi Song, Ramez Zakhary, Julie Medero, Kazuaki Maeda, Stephanie Strassel. 2007. The LDC Corpus Catalog. oai:www.ldc.upenn.edu:LDC2007T20
  29. GALE Phase 1 Arabic Broadcast News Parallel Text - Part 1. Xiaoyi Ma, Dalal Zakhary. 2007. The LDC Corpus Catalog. oai:www.ldc.upenn.edu:LDC2007T24
  30. Arabic Gigaword Third Edition. David Graff. 2007. The LDC Corpus Catalog. oai:www.ldc.upenn.edu:LDC2007T40
  31. TRECVID 2005 Keyframes & Transcripts. Peter Wilkins, Christian Petersohn, Kevin Walker. 2007. The LDC Corpus Catalog. oai:www.ldc.upenn.edu:LDC2007V01
  32. GALE Phase 1 Arabic Blog Parallel Text. Xiaoyi Ma, Dalal Zakhary. 2008. The LDC Corpus Catalog. oai:www.ldc.upenn.edu:LDC2008T02
  33. GALE Phase 1 Arabic Broadcast News Parallel Text - Part 2. Xiaoyi Ma, Dalal Zakhary. 2008. The LDC Corpus Catalog. oai:www.ldc.upenn.edu:LDC2008T09
  34. 2007 NIST Language Recognition Evaluation Test Set. Alvin Martin, Audrey Le. 2009. The LDC Corpus Catalog. oai:www.ldc.upenn.edu:LDC2009S04
  35. GALE Phase 1 Arabic Newsgroup Parallel Text - Part 1. Xiaoyi Ma, Dalal Zakhary and Stephanie Strassel. 2009. The LDC Corpus Catalog. oai:www.ldc.upenn.edu:LDC2009T03
  36. 2008 NIST Metrics for Machine Translation (MetricsMATR08) Development Data. Mark Przybocki, Kay Peterson, Sébastien Bronsart. 2009. The LDC Corpus Catalog. oai:www.ldc.upenn.edu:LDC2009T05
  37. Unified Linguistic Annotation Text Collection. Linguistic Data Consortium. 2009. The LDC Corpus Catalog. oai:www.ldc.upenn.edu:LDC2009T07
  38. Language Understanding Annotation Corpus. Mona Diab, Bonnie Dorr, Lori Levin, Teruko Mitamura, Rebecca Passonneau, Owen Rambow, Lance Ramshaw. 2009. The LDC Corpus Catalog. oai:www.ldc.upenn.edu:LDC2009T10
  39. REFLEX Entity Translation Training/DevTest. Christopher Walker, Zhiyi Song, Stephanie Strassel, Julie Medero and Kazuaki Maeda. 2009. The LDC Corpus Catalog. oai:www.ldc.upenn.edu:LDC2009T11
  40. Arabic Newswire English Translation Collection. Xiaoyi Ma, Dalal Zakhary. 2009. The LDC Corpus Catalog. oai:www.ldc.upenn.edu:LDC2009T22
  41. Arabic Gigaword Fourth Edition. Robert Parker, David Graff, Ke Chen, Junbo Kong, and Kazuaki Maeda. 2009. The LDC Corpus Catalog. oai:www.ldc.upenn.edu:LDC2009T30
  42. NIST Open Machine Translation 2008 Evaluation (MT08) Selected Reference and System Translations. NIST Multimodal Information Group. 2010. The LDC Corpus Catalog. oai:www.ldc.upenn.edu:LDC2010T01
  43. Arabic Treebank: Part 3 v 3.2. Mohamed Maamouri, Ann Bies, Seth Kulick, Sondos Krouna, Fatma Gaddeche, Wajdi Zaghouani. 2010. The LDC Corpus Catalog. oai:www.ldc.upenn.edu:LDC2010T08
  44. NIST 2002 Open Machine Translation (OpenMT) Evaluation. NIST Multimodal Information Group. 2010. The LDC Corpus Catalog. oai:www.ldc.upenn.edu:LDC2010T10
  45. NIST 2003 Open Machine Translation (OpenMT) Evaluation. NIST Multimodal Information Group. 2010. The LDC Corpus Catalog. oai:www.ldc.upenn.edu:LDC2010T11
  46. NIST 2004 Open Machine Translation (OpenMT) Evaluation. NIST Multimodal Information Group. 2010. The LDC Corpus Catalog. oai:www.ldc.upenn.edu:LDC2010T12

Lexical resources

  1. Buckwalter Arabic Morphological Analyzer Version 1.0. Tim Buckwalter. 2002. The LDC Corpus Catalog. oai:www.ldc.upenn.edu:LDC2002L49
  2. Buckwalter Arabic Morphological Analyzer Version 2.0. Tim Buckwalter. 2004. The LDC Corpus Catalog. oai:www.ldc.upenn.edu:LDC2004L02
  3. LDC Standard Arabic Morphological Analyzer (SAMA) Version 3.1. Mohamed Maamouri, Dave Graff, Basma Bouziri, Sondos Krouna, Ann Bies, Seth Kulick. 2010. The LDC Corpus Catalog. oai:www.ldc.upenn.edu:LDC2010L01

Language descriptions

  1. Standard ARABIC - statements. Amina CHENTIR. 2009. Laboratoire parole et langage (LPL, Aix-en-Provence FR). oai:crdo.fr:crdo000745
  2. A Grammar of the Arabic Language, translated from the German of Caspari, and edited with numerous additions and corrections, 3rd Ed. Wright, W. 1874. Beirut: Librairie du Liban. oai:rosettaproject.org:rosettaproject_arb_morsyn-1
  3. Arabic, Standard Orthography. The Long Now Foundation. n.d. The Rosetta Project: A Long Now Foundation Library of Human Language. oai:rosettaproject.org:rosettaproject_arb_ortho-2
  4. Arabic, Standard Orthography. The Long Now Foundation. n.d. The Rosetta Project: A Long Now Foundation Library of Human Language. oai:rosettaproject.org:rosettaproject_arb_ortho-3
  5. The World's Writing Systems. Daniels, Peter T.; Bright, William. 1996. New York: Oxford University Press. oai:rosettaproject.org:rosettaproject_arb_ortho-4
  6. WALS Online Resources for Arabic (Modern Standard). Haspelmath, Martin (editor); Dryer, Matthew S. (editor); Gil, David (editor); Comrie, Bernard (editor). 2008-05-01. Max Planck Digital Library (http://mpdl.mpg.de/). oai:wals.info:languoid/ams

Other resources about the language

  1. Arabic, Standard: a language of Saudi Arabia. Lewis, M. Paul (editor). 2009. SIL International (www.sil.org). oai:ethnologue.com:arb
  2. English-Arabic Treebank v 1.0. Ann Bies. 2006. The LDC Corpus Catalog. oai:www.ldc.upenn.edu:LDC2006T10
  3. ISI Arabic-English Automatically Extracted Parallel Text. Dragos Stefan Munteanu, Daniel Marcu. 2007. The LDC Corpus Catalog. oai:www.ldc.upenn.edu:LDC2007T08
  4. OntoNotes Release 3.0. Ralph Weischedel, Sameer Pradhan, Lance Ramshaw, Kaufman, Michelle Franchini, Mohammed El-Bachouti, Nianwen Xue, Martha Palmer, Mitchell Marcus, Ann Taylor, Craig Greenberg, Eduard Hovy, Robert Belvin, Ann Houston. 2009. The LDC Corpus Catalog. oai:www.ldc.upenn.edu:LDC2009T24
  5. NIST 2005 Open Machine Translation (OpenMT) Evaluation. NIST Multimodal Information Group. 2010. The LDC Corpus Catalog. oai:www.ldc.upenn.edu:LDC2010T14
  6. An Introduction to Modern Literary Arabic. Cowan, David. 1958. University Press. oai:refdb.wals.info:2245
  7. A New Arabic Grammar of the Written Language. Haywood, J. A.; Nahmad, H. M. 1965. Lund Humphries. oai:refdb.wals.info:2400
  8. The structure of Arabic: from sound to sentence. Nasr, Raja T. 1967. Librairie du Liban. oai:refdb.wals.info:5189
  9. Universal Declaration of Human Rights. The Long Now Foundation. 1948. The Rosetta Project: A Long Now Foundation Library of Human Language. oai:rosettaproject.org:rosettaproject_arb_undec-1
  10. Surrey Syncretisms Database. Baerman, Matthew; Brown, Dunstan; Corbett, Greville. 2002. University of Surrey. oai:surrey.smg.surrey.ac.uk:syncretism
  11. LINGUIST List Resources for Arabic, Standard. Anthony Aristar, Director of Linguist List (editor); Helen Aristar-Dry, Director of Linguist List (editor). 2010-09-02. The LINGUIST List (www.linguistlist.org). oai:linguistlist.org:lang_arb

Other search terms: dialect, vernacular, discourse, documentation, lexicon, dictionary, vocabulary, wordlist, phrase book, grammar, syntax, morphology, phonology, orthography