OLAC Record
oai:www.ldc.upenn.edu:LDC2006S35

Metadata
Title:CSLU: Multilanguage Telephone Speech Version 1.2
Access Rights:Licensing Instructions for Subscription & Standard Members, and Non-Members: http://www.ldc.upenn.edu/language-resources/data/obtaining
Bibliographic Citation:Muthusamy, Yeshwant, Ronald Cole, and Beatrice Oshika. CSLU: Multilanguage Telephone Speech Version 1.2 LDC2006S35. DVD. Philadelphia: Linguistic Data Consortium, 2006
Contributor:Muthusamy, Yeshwant
Cole, Ronald
Oshika, Beatrice
Date (W3CDTF):2006
Date Issued (W3CDTF):2006-06-15
Description:*Introduction* The Multilanguage Telephone Speech corpus consists of telephone speech from 11 languages: English, Farsi, French, German, Hindi, Japanese, Korean, Mandarin, Spanish, Tamil, and Vietnamese. The corpus contains fixed-vocabulary utterances (e.g., days of the week) as well as fluent continuous speech. The current release includes recorded utterances from about 2,052 speakers, for a total of about 38.5 hours of speech. Time-aligned phonetic transcriptions for 619 of the utterances are also included. *Data* Each subject called the CSLU data collection system by dialing a toll-free number. An analog telephone line was connected to a Gradient Technologies box, which recorded the data from incoming calls. The sampling rate was 8 kHz, and the files were stored in 16-bit linear format on a UNIX file system. Each utterance was recorded as a separate file. *Samples* For an example of the data in this corpus, please listen to these audio samples in Tamil and English.
Extent:Corpus size: 2202009 KB
Format:Sampling Rate: 8000
Sampling Format: pcm
Identifier:LDC2006S35
https://catalog.ldc.upenn.edu/LDC2006S35
ISBN: 1-58563-390-9
ISLRN: 871-936-811-171-7
Language:Vietnamese
Tamil
Spanish
Iranian Persian
Korean
Japanese
Hindi
French
English
German
Mandarin Chinese
Language (ISO639):vie
tam
spa
pes
kor
jpn
hin
fra
eng
deu
cmn
License:CSLU Agreement: https://catalog.ldc.upenn.edu/license/cslu-corpora-non-commercial-research-only.pdf
Medium:Distribution: DVD
Publisher:Linguistic Data Consortium
Publisher (URI):https://www.ldc.upenn.edu
Relation (URI):https://catalog.ldc.upenn.edu/docs/LDC2006S35
Rights Holder:Portions © 1992, 2000, 2002 Center for Spoken Language Understanding, Oregon Health & Science University, © 2006 Trustees of the University of Pennsylvania
Type (DCMI):Sound
Type (OLAC):primary_text
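
The Description and Format fields above give an 8 kHz sampling rate and 16-bit linear PCM storage, which is enough to relate recording duration to data size. The following is a minimal sketch of that arithmetic, not part of the record itself. It assumes single-channel audio and ignores any file headers; neither detail is stated in the record, and the function names are illustrative.

```python
# Rough size/duration arithmetic for 8 kHz, 16-bit linear PCM audio.
SAMPLE_RATE_HZ = 8000   # from the Format field (Sampling Rate: 8000)
BYTES_PER_SAMPLE = 2    # 16-bit linear format, per the Description
CHANNELS = 1            # assumption: mono telephone audio (not stated in the record)

BYTES_PER_SECOND = SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * CHANNELS


def pcm_bytes(duration_seconds: float) -> int:
    """Return the raw sample-data size in bytes for a given duration."""
    return int(duration_seconds * BYTES_PER_SECOND)


def pcm_duration_seconds(num_bytes: int) -> float:
    """Return the duration in seconds of a given amount of raw sample data."""
    return num_bytes / BYTES_PER_SECOND


if __name__ == "__main__":
    # About 38.5 hours of speech, as stated in the Description.
    estimated = pcm_bytes(38.5 * 3600)
    print(f"Estimated raw audio size: {estimated} bytes")
```

Under these assumptions, 38.5 hours of speech comes to roughly 2.2 billion bytes of sample data, which is within a few percent of the 2202009 KB given in the Extent field whether KB is read as 1000 or 1024 bytes.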

OLAC Info

Archive:  The LDC Corpus Catalog
Description:  http://www.language-archives.org/archive/www.ldc.upenn.edu
GetRecord:  OAI-PMH request for OLAC format
GetRecord:  Pre-generated XML file

OAI Info

OaiIdentifier:  oai:www.ldc.upenn.edu:LDC2006S35
DateStamp:  2014-07-17
GetRecord:  OAI-PMH request for simple DC format
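
The GetRecord entries above refer to OAI-PMH requests for this record. As a hedged sketch of how such a request URL is formed, the snippet below builds a GetRecord query from the OaiIdentifier shown in this section. The base URL is a hypothetical placeholder, since the record does not list the repository endpoint; `oai_dc` is the standard OAI-PMH prefix for simple Dublin Core, and `olac` is assumed here to be the prefix for the OLAC format.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: the actual repository base URL is not given in this record.
BASE_URL = "https://example.org/oai"

# Identifier taken from the OaiIdentifier field of this record.
OAI_IDENTIFIER = "oai:www.ldc.upenn.edu:LDC2006S35"


def get_record_url(base_url: str, identifier: str, metadata_prefix: str) -> str:
    """Build an OAI-PMH GetRecord request URL with a percent-encoded query string."""
    query = urlencode(
        {
            "verb": "GetRecord",
            "identifier": identifier,
            "metadataPrefix": metadata_prefix,
        }
    )
    return f"{base_url}?{query}"


if __name__ == "__main__":
    # Simple Dublin Core request (standard prefix).
    print(get_record_url(BASE_URL, OAI_IDENTIFIER, "oai_dc"))
    # OLAC-format request (prefix assumed to be "olac").
    print(get_record_url(BASE_URL, OAI_IDENTIFIER, "olac"))
```

A repository answers such a request with an XML document containing the record's header (identifier and datestamp) and its metadata in the requested format.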

Search Info

Citation: Muthusamy, Yeshwant; Cole, Ronald; Oshika, Beatrice. 2006. Linguistic Data Consortium.
Terms: area_Asia area_Europe country_CN country_DE country_ES country_FR country_GB country_IN country_IR country_JP country_KR country_VN dcmi_Sound iso639_cmn iso639_deu iso639_eng iso639_fra iso639_hin iso639_jpn iso639_kor iso639_pes iso639_spa iso639_tam iso639_vie olac_primary_text


http://www.language-archives.org/item.php/oai:www.ldc.upenn.edu:LDC2006S35
Up-to-date as of: Tue Apr 25 1:31:10 EDT 2017