OLAC Record

Title:PAROLE English lexicon
Abstract:The PAROLE English lexicon consists of 22 000 morphological units extracted from the CRL-LKB and COBUILD dictionaries: 12998 are common nouns, 40 proper nouns, 4195 verbs, 3208 adjectives, 606 adverbs, 71 adpositions, 2 articles, 21 conjunctions, 25 determiners and 53 pronouns.
Access Rights:Rights available for: Research Use, Commercial Use
Date Available (W3CDTF):2001-04-06
Date Issued (W3CDTF):2004-09-14
Date Modified (W3CDTF):2007-02-22
Description:Monolingual Lexicons
The English PAROLE Lexicon was compiled by two partners, Sheffield University and the Corpus Linguistic Group (CLG) at Birmingham University. It was compiled from two existing resources: CRL-LKB and the COBUILD dictionary database. Both resources have restricted availability and contain extensive syntactic, semantic and morphological information.
The lexicon contains 22,000 morphological units, of which 12998 are common nouns, 40 proper nouns, 4195 verbs, 3208 adjectives, 606 adverbs, 71 adpositions, 2 articles, 21 conjunctions, 25 determiners and 53 pronouns.
The English PAROLE lexicon comprises the following information:
- morphological encoding for all nouns, verbs, adverbs, adjectives and function words;
- syntactic encoding of all verbs, nouns, adjectives and adverbs.
The organizational procedure was as follows:
1. Selection: lemmata were mostly selected on the basis of frequency from the COBUILD corpus. Most proper nouns were deselected, and some verbs were added because of the decision to encode deverbal nominalisations and compound information.
2. Coverage: the headword list was checked against the resources to make sure there was adequate coverage of syntactic and morphological information.
3. Composition: the nominal lemmata were checked for derivations and compounds. These were extracted and analyzed into their constituent parts, and compounds were checked for lexicalisation. Components were flagged with their base forms and grammatical class.
4. Conversion: morphosyntactic information was either transferred directly from existing resources or, in the case of inflectional information and subcategorisation patterns, extracted by purpose-written programs and converted into the PAROLE format.
5. Cross-reference: all components contained in nominal derivations and compounds were cross-referenced with their base PoS. Integrity checks were made and the lexicon was parsed using nsgmls.
Language (ISO639):eng
Publisher:ELRA (European Language Resources Association)
Type (DCMI):Text
Type (OLAC):primary_text


Archive:  ELRA Catalogue of Language Resources
Description:  http://www.language-archives.org/archive/catalogue.elra.info
GetRecord:  OAI-PMH request for OLAC format
GetRecord:  Pre-generated XML file

OAI Info

OaiIdentifier:  oai:catalogue.elra.info:ELRA-L0043
DateStamp:  2001-04-06
GetRecord:  OAI-PMH request for simple DC format
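
The GetRecord entries above refer to OAI-PMH requests for this record. As a minimal sketch, such a request URL can be assembled from the OaiIdentifier and a metadata prefix. The base URL below is a placeholder, since this record does not state the repository endpoint, and the function name is hypothetical. The prefix "oai_dc" is the one OAI-PMH defines for simple Dublin Core; "olac" is assumed here to be the prefix for the OLAC format.

```python
from urllib.parse import urlencode

# Placeholder endpoint: the record does not give the repository's base URL.
BASE_URL = "https://example.org/oai"

# Identifier taken from the OaiIdentifier field of this record.
OAI_IDENTIFIER = "oai:catalogue.elra.info:ELRA-L0043"


def get_record_url(identifier, metadata_prefix, base_url=BASE_URL):
    """Build an OAI-PMH GetRecord request URL (hypothetical helper)."""
    query = urlencode({
        "verb": "GetRecord",
        "identifier": identifier,
        "metadataPrefix": metadata_prefix,
    })
    return f"{base_url}?{query}"


# One request per format listed in the record.
olac_url = get_record_url(OAI_IDENTIFIER, "olac")    # OLAC format (assumed prefix)
dc_url = get_record_url(OAI_IDENTIFIER, "oai_dc")    # simple DC format
print(olac_url)
print(dc_url)
```

Fetching either URL from a real endpoint would return the record as XML in the requested format.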

Search Info

Citation: n.a. 2004. ELRA (European Language Resources Association).
Terms: area_Europe country_GB dcmi_Text iso639_eng olac_primary_text

Up-to-date as of: Sun Nov 12 1:43:47 EST 2017