OLAC Record
oai:catalogue.elra.info:ELRA-W0008-01

Metadata
Title:MTP Annotated German corpus - untagged version
Abstract:A 500,000-word German corpus of SGML-formatted texts from two German newspapers, the Frankfurter Allgemeine Zeitung and Die Zeit, for the years 1990 to 1992.
Access Rights:Rights available for: Research Use, Commercial Use
Date Available (W3CDTF):1996-09-01
Date Issued (W3CDTF):2004-09-14
Date Modified (W3CDTF):2004-05-12
Description:Written Corpora
This morphosyntactically annotated 500,000-word German corpus was developed as part of the Münster Tagging Project (MTP). It comprises a collection of SGML-formatted texts from two German newspapers, "Die Frankfurter Allgemeine Zeitung" and "Die Zeit", for the years 1990 to 1992. The articles reflect the typical distribution of newspaper topics, including economics, regional, national and international politics, the arts, sport, literature, history, science and modern life. The text was segmented into sentence units and word tokens, and tagged with morphosyntactic part-of-speech (POS) markers. Two tagsets were used, comprising 137 and 52 tags respectively; they differ mainly in the granularity of the noun and verb tags. Users may obtain annotated versions using either set, each of which comes with documentation and an instruction manual for tag application. A suite of tools, including the MTP taggers and the Xlex workbench for text handling, textual analysis and lexicography, is also available.
Identifier:ELRA-W0008-01
http://catalog.elra.info/product_info.php?products_id=47
Language:German
Language (ISO639):deu
Medium:CD-ROM
Publisher:ELRA (European Language Resources Association)
Type (DCMI):Text
Type (OLAC):primary_text

OLAC Info

Archive:  ELRA Catalogue of Language Resources
Description:  http://www.language-archives.org/archive/catalogue.elra.info
GetRecord:  OAI-PMH request for OLAC format
GetRecord:  Pre-generated XML file

OAI Info

OaiIdentifier:  oai:catalogue.elra.info:ELRA-W0008-01
DateStamp:  1996-09-01
GetRecord:  OAI-PMH request for simple DC format

Search Info

Citation: n.a. 2004. ELRA (European Language Resources Association).
Terms: area_Europe country_DE dcmi_Text iso639_deu olac_primary_text


http://www.language-archives.org/item.php/oai:catalogue.elra.info:ELRA-W0008-01
Up-to-date as of: Wed Mar 29 3:47:37 EDT 2017