OLAC Record
oai:catalogue.elra.info:ELRA-W0015

Metadata
Title:Text corpus of "Le Monde"
Abstract:Corpus from "Le Monde" newspaper. Each month amounts to some 10 Mbytes of data (circa 120 Mbytes per year). Data from 1987 to 2012 are available.
Access Rights:Rights available for: Research Use
Date Available (W3CDTF):1997-09-15
Date Issued (W3CDTF):2004-09-14
Date Modified (W3CDTF):2016-03-04
Description:Written Corpora
Electronic archiving of "Le Monde" articles started on 1 January 1987. Some 200 articles are added every day, and as of October 1997 the database contains more than 500,000 articles, making it the biggest of its kind for all French daily newspapers.
Years 1987 to 2002 are available in an ASCII text format. Years 2003 to 2007 are available in .XML format. Each month consists of some 10 MB of data (circa 120 MB per year).
The number of words available since 2005 is given below:
- 2005: 19 million words
- 2006: 17 million words
- 2007: 21 million words
Years 2008 to 2012 are also available, in an ASCII text format, with no markup. Data ranging from 1987 until 2012 are available through ELRA.
The approx. number of articles available per year is as follows:
- 1987: 39742 articles
- 1988: 40190 articles
- 1989: 39784 articles
- 1990: 38680 articles
- 1991: 39127 articles
- 1992: 40661 articles
- 1993: 42664 articles
- 1994: 44013 articles
- 1995: 47646 articles
- 1996: 49557 articles
- 1997: 63161 articles
- 1998: 56431 articles
- 1999: 59630 articles
- 2000: 61977 articles
- 2001: 61480 articles
- 2002: 60148 articles
- 2003: 48900 articles
- 2004: 43448 articles
- 2005: 40169 articles
- 2006: 36142 articles
- 2007: 44290 articles
- 2008: 40075 articles
- 2009: 39912 articles
- 2010: 40816 articles
- 2011: 40290 articles
- 2012: 40210 articles
TOTAL: 1,199,143 articles
Identifier:ELRA-W0015
http://catalog.elra.info/product_info.php?products_id=438
Language:French
Language (ISO639):fra
Medium:Downloadable
Publisher:ELRA (European Language Resources Association)
Type (DCMI):Text
Type (OLAC):primary_text
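
The per-year article counts in the Description field can be cross-checked against the stated total with a few lines of Python. This is only an illustrative sketch: the names `ARTICLES_PER_YEAR` and `STATED_TOTAL` are made up for the example, and the figures are copied by hand from the record, which itself calls them approximate.

```python
# Approximate article counts per year, copied from the Description field.
ARTICLES_PER_YEAR = {
    1987: 39742, 1988: 40190, 1989: 39784, 1990: 38680, 1991: 39127,
    1992: 40661, 1993: 42664, 1994: 44013, 1995: 47646, 1996: 49557,
    1997: 63161, 1998: 56431, 1999: 59630, 2000: 61977, 2001: 61480,
    2002: 60148, 2003: 48900, 2004: 43448, 2005: 40169, 2006: 36142,
    2007: 44290, 2008: 40075, 2009: 39912, 2010: 40816, 2011: 40290,
    2012: 40210,
}

# Total as printed in the record ("TOTAL: 1,199,143 articles").
STATED_TOTAL = 1_199_143

computed_total = sum(ARTICLES_PER_YEAR.values())
peak_year = max(ARTICLES_PER_YEAR, key=ARTICLES_PER_YEAR.get)

print(f"years covered : {min(ARTICLES_PER_YEAR)}-{max(ARTICLES_PER_YEAR)}")
print(f"computed total: {computed_total:,}")
print(f"matches record: {computed_total == STATED_TOTAL}")
print(f"peak year     : {peak_year} ({ARTICLES_PER_YEAR[peak_year]:,} articles)")
```

The listed counts do sum to the stated total, and 1997 is the year with the most articles.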

OLAC Info

Archive:  ELRA Catalogue of Language Resources
Description:  http://www.language-archives.org/archive/catalogue.elra.info
GetRecord:  OAI-PMH request for OLAC format
GetRecord:  Pre-generated XML file

OAI Info

OaiIdentifier:  oai:catalogue.elra.info:ELRA-W0015
DateStamp:  1997-09-15
GetRecord:  OAI-PMH request for simple DC format
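
The GetRecord entries above refer to OAI-PMH requests, which are ordinary HTTP queries built from a `verb`, an `identifier` and a `metadataPrefix`. The sketch below shows how such a URL might be assembled. The base URL is a hypothetical placeholder, since the record does not give the endpoint, and `get_record_url` is a helper invented for this example. `oai_dc` is the prefix OAI-PMH defines for simple Dublin Core; using `olac` for the OLAC format is an assumption about how the repository is configured.

```python
from urllib.parse import urlencode

# Hypothetical placeholder: the real OAI-PMH endpoint is not given in this record.
BASE_URL = "https://example.org/oai"

# Identifier taken from the OaiIdentifier field of this record.
OAI_IDENTIFIER = "oai:catalogue.elra.info:ELRA-W0015"


def get_record_url(identifier, metadata_prefix, base_url=BASE_URL):
    """Build an OAI-PMH GetRecord request URL for one record."""
    query = urlencode({
        "verb": "GetRecord",
        "identifier": identifier,
        "metadataPrefix": metadata_prefix,
    })
    return f"{base_url}?{query}"


# Simple Dublin Core request (oai_dc is the prefix every OAI-PMH repository supports).
dc_url = get_record_url(OAI_IDENTIFIER, "oai_dc")

# OLAC-format request ("olac" as the prefix is an assumption, see above).
olac_url = get_record_url(OAI_IDENTIFIER, "olac")

print(dc_url)
print(olac_url)
```

Only the URL is constructed here; no request is sent, so the snippet runs without network access.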

Search Info

Citation: n.a. 2004. ELRA (European Language Resources Association).
Terms: area_Europe country_FR dcmi_Text iso639_fra olac_primary_text


http://www.language-archives.org/item.php/oai:catalogue.elra.info:ELRA-W0015
Up-to-date as of: Sun Jun 17 0:44:37 EDT 2018