OLAC Record

Title:Buckwalter Arabic Morphological Analyzer Version 2.0
Access Rights:Licensing Instructions for Subscription & Standard Members, and Non-Members: http://www.ldc.upenn.edu/language-resources/data/obtaining
Bibliographic Citation:Buckwalter, Tim. Buckwalter Arabic Morphological Analyzer Version 2.0 LDC2004L02. Web Download. Philadelphia: Linguistic Data Consortium, 2004
Contributor:Buckwalter, Tim
Date (W3CDTF):2004
Date Issued (W3CDTF):2004-12-15
Description:*Introduction* This file contains documentation on the Buckwalter Arabic Morphological Analyzer Version 2.0. *Data* The data consists primarily of three Arabic-English lexicon files: prefixes (299 entries), suffixes (618 entries), and stems (82158 entries representing 38600 lemmas). The lexicons are supplemented by three morphological compatibility tables used for controlling prefix-stem combinations (1648 entries), stem-suffix combinations (1285 entries), and prefix-suffix combinations (598 entries). The actual code for morphology analysis and POS tagging is contained in a Perl script. The documentation consists of a readme file with a description of the lexicon files, the morphological compatibility tables, the morphology analysis algorithm, a summary of stem morphological categories, and a table with the author's Arabic transliteration system. *Samples* To see an example of the analyzer's output, please examine this sample. *Additional Licensing Instructions* This 'members-only' corpus is available to current members who can request the data at the listed reduced-license fee. Contact ldc@ldc.upenn.edu for information about becoming a member.
Extent:Corpus size: 9216 KB
ISBN: 1-58563-324-0
ISLRN: 694-194-540-336-4
DOI: 10.35111/050q-5r95
Language:Standard Arabic
Language (ISO639):arb
License:BAMA Agreement: https://catalog.ldc.upenn.edu/license/buckwalter-arabic-morphological-analyzer.pdf
Medium:Distribution: Web Download
Publisher:Linguistic Data Consortium
Publisher (URI):https://www.ldc.upenn.edu
Relation (URI):https://catalog.ldc.upenn.edu/docs/LDC2004L02
Subject:Standard Arabic language
Subject (ISO639):arb
Type (DCMI):Sound
Type (OLAC):lexicon


Archive:  The LDC Corpus Catalog
Description:  http://www.language-archives.org/archive/www.ldc.upenn.edu
GetRecord:  OAI-PMH request for OLAC format
GetRecord:  Pre-generated XML file

OAI Info

OaiIdentifier:  oai:www.ldc.upenn.edu:LDC2004L02
DateStamp:  2020-11-30
GetRecord:  OAI-PMH request for simple DC format

Search Info

Citation: Buckwalter, Tim. 2004. Linguistic Data Consortium.
Terms: area_Asia area_Europe country_GB country_SA dcmi_Sound iso639_arb iso639_eng olac_lexicon

Inferred Metadata

Country: Saudi Arabia
Area: Asia

Up-to-date as of: Sun May 8 9:21:07 EDT 2022