OLAC Record
oai:catalogue.elra.info:ELRA-S0228-16

Metadata
Title:Mandarin Chinese Desktop Speech Recognition Corpus - Digit String (120 people)
Abstract:This corpus comprises 1,500 entries uttered by 120 speakers (59 males and 61 females) of different dialects, ages, and educational levels, recorded through a head-mounted noise-canceling microphone. The database comprises 3,600 digit strings. Speech samples are stored as 16-bit, 22.05 kHz WAV files, for a total of 6.2 hours of speech. The total size of the data is 945 MB. Each speaker read 120-150 items. Text files are stored in Unicode format. All data have been proofread manually. The transcriptions include non-speech markers (background noise, background speech, speaker sounds) as well as markers for mispronunciations, channel distortions, omitted words, and duplicates. The corpus is intended for testing and for telephone natural speech recognition systems.
Access Rights:Rights available for: Research Use, Commercial Use
Date Available (W3CDTF):2007-01-17
Date Issued (W3CDTF):2006-12-20
Date Modified (W3CDTF):2009-09-24
Description:Desktop/Microphone
Identifier:ELRA-S0228-16
http://catalog.elra.info/product_info.php?products_id=907
Language:Chinese
Language (ISO639):zho
Publisher:ELRA (European Language Resources Association)
Type (DCMI):Sound
Type (OLAC):primary_text
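
The audio figures in the abstract (16-bit samples, 22.05 kHz, 6.2 hours) can be cross-checked against the stated data size with a short calculation. The sketch below is only a rough estimate: it assumes single-channel (mono) uncompressed PCM and ignores WAV headers and the text files, none of which the record specifies. The function name and constants are illustrative, not part of the record.

```python
# Rough size estimate for uncompressed PCM audio.
# Assumption (not stated in the record): recordings are mono.
SAMPLE_RATE_HZ = 22_050   # 22.05 kHz, from the abstract
BYTES_PER_SAMPLE = 2      # 16-bit samples, from the abstract
CHANNELS = 1              # assumed mono
DURATION_HOURS = 6.2      # from the abstract


def pcm_size_bytes(hours, sample_rate_hz=SAMPLE_RATE_HZ,
                   bytes_per_sample=BYTES_PER_SAMPLE, channels=CHANNELS):
    """Return the approximate payload size in bytes of raw PCM audio."""
    seconds = hours * 3600
    return round(seconds * sample_rate_hz * bytes_per_sample * channels)


size = pcm_size_bytes(DURATION_HOURS)
print(f"{size} bytes, about {size / 2**20:.0f} MiB")
```

Under these assumptions the estimate comes to roughly 984 million bytes (about 939 MiB), which is in the same range as the stated 945; the remaining gap could plausibly come from rounding of the 6.2-hour figure, file headers, and the transcription files.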

OLAC Info

Archive:  ELRA Catalogue of Language Resources
Description:  http://www.language-archives.org/archive/catalogue.elra.info
GetRecord:  OAI-PMH request for OLAC format
GetRecord:  Pre-generated XML file

OAI Info

OaiIdentifier:  oai:catalogue.elra.info:ELRA-S0228-16
DateStamp:  2007-01-17
GetRecord:  OAI-PMH request for simple DC format
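
The GetRecord entries above refer to OAI-PMH requests for this record. As a minimal sketch, such a request is an HTTP GET with the `verb`, `identifier`, and `metadataPrefix` parameters defined by the OAI-PMH protocol. The base URL below is a placeholder, since the record does not state the endpoint, and the helper name `get_record_url` is hypothetical. The prefix `oai_dc` is the standard one for simple Dublin Core; `olac` is assumed here to be the prefix for the OLAC format.

```python
from urllib.parse import urlencode

# Placeholder endpoint: the actual OAI-PMH base URL is not given in this record.
BASE_URL = "https://example.org/oai"
IDENTIFIER = "oai:catalogue.elra.info:ELRA-S0228-16"  # OaiIdentifier from the record


def get_record_url(base_url, identifier, metadata_prefix):
    """Build an OAI-PMH GetRecord request URL (hypothetical helper)."""
    query = urlencode({
        "verb": "GetRecord",
        "identifier": identifier,
        "metadataPrefix": metadata_prefix,
    })
    return f"{base_url}?{query}"


dc_url = get_record_url(BASE_URL, IDENTIFIER, "oai_dc")   # simple DC format
olac_url = get_record_url(BASE_URL, IDENTIFIER, "olac")   # OLAC format (assumed prefix)
print(dc_url)
print(olac_url)
```

The colons in the identifier are percent-encoded by `urlencode`, which is what a query string expects; the response to such a request would be an XML document containing the record's metadata.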

Search Info

Citation: n.a. 2006. ELRA (European Language Resources Association).
Terms: dcmi_Sound iso639_zho olac_primary_text


http://www.language-archives.org/item.php/oai:catalogue.elra.info:ELRA-S0228-16
Up-to-date as of: Fri Jun 23 1:05:37 EDT 2017