OLAC Record: ESTER Evaluation Package

OLAC Record
oai:catalogue.elra.info:ELRA-E0021

Metadata

Title: ESTER Evaluation Package

Access Rights: Rights available for: evaluationUse

Date Available (W3CDTF): 2007-06-28

Date Issued (W3CDTF): 2007-06-28

Date Modified (W3CDTF): 2012-03-29

Description: The ESTER Evaluation Package was produced within the French national project ESTER (Evaluation of Broadcast News enriched transcription systems), as part of the Technolangue programme funded by the French Ministry of Research and New Technologies (MRNT). The ESTER project made it possible to carry out a campaign for the evaluation of Broadcast News enriched transcription systems using French data. This project is an extension of the only campaign ever carried out for French in this field, within the AUPELF campaigns (Actions de recherche Concertées, 1996-1999).

This package includes the material that was used for the ESTER evaluation campaign: resources, protocols, scoring tools, results of the campaign, etc., that were used or produced during the campaign. The aim of these evaluation packages is to enable external players to evaluate their own systems and compare their results with those obtained during the campaign itself.

The campaign is divided into three actions:

1) Orthographic transcription: producing an orthographic transcription of radio broadcast news, the quality of which is measured by word error rate. There are two distinct tasks, one with and one without a computation time constraint.
2) Segmentation: the segmentation tasks consist of sound event segmentation, speaker tracking and speaker segmentation. The sound event segmentation task consists of tracking the parts which contain music (with or without speech) and the parts which contain speech (with or without music). The speaker tracking task consists of detecting the parts of the document that correspond to a given speaker. The speaker segmentation task consists of segmenting the document by speaker and grouping the parts spoken by the same speaker.
3) Information extraction: an exploratory task on named entity tracking. The objective was to set up and test an evaluation protocol rather than to measure performance. The systems must detect eight classes of entities (person, place, data, organisation, geo-political entity, amount, building and unknown) from the automatic transcription or the manual transcription.

The ESTER evaluation package contains the following data and tools:

1) About 100 hours of orthographically transcribed broadcast news, including annotations of named entities.
2) The textual resources distributed within the ESTER campaign, mainly based on the archives of the Le Monde newspaper 1987-2003 (ELRA-W0015) and the debates of the European Parliament (ELRA-W0023).
3) Evaluation tools for each of the tasks defined above.
4) Two guides and manuals, provided in the package distributed by ELDA:
   o Guide for the annotation of named entities
   o Specifications and evaluation protocol

A description of the project is available (in French) at the following address: http://www.technolangue.net/article.php3?id_article=60

An extra corpus of 1,700 hours of non-transcribed radio broadcast news recordings can also be provided upon request, on hard disk, as an addition to this package at a cost of 100 Euro (plus shipping fee).

For research or commercial use, please refer to ELRA-S0241 ESTER Corpus.
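
The description states that transcription quality is measured by word error rate. As an illustration only, here is a minimal Python sketch of the usual definition (word-level edit distance divided by the number of reference words). It is not the scoring tool shipped in the package. The function name `word_error_rate` and the whitespace tokenisation are assumptions made for this example, and the campaign's own tools may normalise text differently.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Illustrative WER: (substitutions + deletions + insertions) / reference length.

    Assumption: words are separated by whitespace and compared verbatim,
    with no case folding or punctuation normalisation.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # aligning i reference words with an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr

    return prev[-1] / len(ref)
```

For example, scoring the hypothesis "le chien dort sur lit" against the reference "le chat dort sur le lit" counts one substitution and one deletion over six reference words.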

Identifier: ELRA-E0021

ISLRN: 110-079-844-983-7

Identifier (URI): https://catalog.elra.info/en-us/repository/browse/ELRA-E0021/

Language: French

Language (ISO639): fra

Medium: Not specified

Publisher: ELRA (European Language Resources Association)

Type (DCMI): Sound

Type (OLAC): primary_text

OLAC Info

Archive: ELRA Catalogue of Language Resources

Description: http://www.language-archives.org/archive/catalogue.elra.info

GetRecord: OAI-PMH request for OLAC format

GetRecord: Pre-generated XML file

OAI Info

OaiIdentifier: oai:catalogue.elra.info:ELRA-E0021

DateStamp: 2007-06-28

GetRecord: OAI-PMH request for simple DC format

Search Info
Citation: n.a. 2007. ELRA (European Language Resources Association).
Terms: area_Europe country_FR dcmi_Sound iso639_fra olac_primary_text

http://www.language-archives.org/item.php/oai:catalogue.elra.info:ELRA-E0021
Up-to-date as of: Wed Oct 1 0:55:49 EDT 2025
