OLAC Record

Title: DEFT'08 Evaluation Package
Access Rights: Rights available for: evaluationUse
Date Available (W3CDTF): 2012-03-28
Date Issued (W3CDTF): 2012-03-28
Date Modified (W3CDTF): 2012-03-28
Description: DEFT (DEfi Fouille de Texte, or Text Mining Challenge) organizes evaluation campaigns in the field of text mining. The 2008 edition of DEFT addresses the classification of texts by topic and genre. Automatic classification has many applications in text mining, and many application fields have been explored, from email routing to strategic or scientific monitoring. In recent years, a new research question concerning text genre classification has emerged. Beyond recognizing the topic of a document, recognizing its genre helps determine how the document will be used. Questions that can be raised are: How can we recognize both document topic and genre? Can a difference in genre influence the recognition of a document's topical category, and conversely, can a difference in topic influence the recognition of a document's genre? To evaluate classification software on these questions, the DEFT’08 Evaluation Package makes it possible to compare two corpora of different genres (a corpus of newspaper articles extracted from the newspaper Le Monde and a corpus of encyclopaedic articles extracted from the free online encyclopaedia Wikipedia) on the basis of the same set of pre-defined categories. Although a newspaper article highlights news whereas an encyclopaedic article disseminates knowledge, the two share a number of general topical categories, called “column” for the former and “category” for the latter. The task consists of testing on those corpora, on the one hand, the robustness of a topical classification model subjected to variations in text genre and, on the other hand, possible improvements to topical classification through the recognition of text genre.
ISLRN: 161-881-080-899-5
Identifier (URI): http://catalog.elra.info/en-us/repository/browse/ELRA-E0035/
Language (ISO639): fra
Medium: Not specified
Publisher: ELRA (European Language Resources Association)
Type (DCMI): Text
Type (OLAC): primary_text


Archive:  ELRA Catalogue of Language Resources
Description:  http://www.language-archives.org/archive/catalogue.elra.info
GetRecord:  OAI-PMH request for OLAC format
GetRecord:  Pre-generated XML file

OAI Info

OaiIdentifier:  oai:catalogue.elra.info:ELRA-E0035
DateStamp:  2012-03-28
GetRecord:  OAI-PMH request for simple DC format

Search Info

Citation: n.a. 2012. ELRA (European Language Resources Association).
Terms: area_Europe country_FR dcmi_Text iso639_fra olac_primary_text

Up-to-date as of: Wed Nov 17 9:14:03 EST 2021