OLAC Record
oai:www.ldc.upenn.edu:LDC2008S02

Metadata
Title:CSLU: National Cellular Telephone Speech Release 2.3
Access Rights:Licensing Instructions for Subscription & Standard Members, and Non-Members: http://www.ldc.upenn.edu/language-resources/data/obtaining
Bibliographic Citation:Cole, Ronald Allan, et al. CSLU: National Cellular Telephone Speech Release 2.3 LDC2008S02. Web Download. Philadelphia: Linguistic Data Consortium, 2008
Contributor:Cole, Ronald Allan
Noel, M
Lander, T.
Durham, T
Date (W3CDTF):2008
Date Issued (W3CDTF):2008-03-19
Description:*Introduction* This file contains documentation for CSLU: National Cellular Telephone Speech Release 2.3, Linguistic Data Consortium (LDC) catalog number LDC2008S02 and ISBN 1-58563-467-0. CSLU: National Cellular Telephone Speech Release 2.3 was created by the Center for Spoken Language Understanding (CSLU) at OGI School of Science and Engineering, Oregon Health and Science University, Beaverton, Oregon. It consists of cellular telephone speech and corresponding transcripts, specifically, approximately one minute of speech from each of 2336 speakers calling from locations throughout the United States. The data collection protocol used for this release is the same protocol used in CSLU: Portland Cellular Telephone Speech Version 1.3 (LDC2008S01). Speakers called the CSLU data collection system on cellular telephones and were asked a series of questions. Two prompt protocols were used: an In Vehicle Protocol for speakers calling from inside a vehicle and a Not in Vehicle Protocol for those calling from outside a vehicle. The protocols shared several questions, but each protocol contained distinct queries designed to probe the conditions of the caller's in-vehicle or not-in-vehicle surroundings.
*Recording Details* The data were collected with the CSLU T1 digital data collection system. The sampling rate was 8 kHz, and the files were stored in 8-bit mu-law format on a UNIX file system. In this release, the files are provided in 16-bit linearly encoded Windows WAV (RIFF) format.
*Transcription* The text transcriptions in this corpus were produced using the non-time-aligned word-level conventions described in The CSLU Labeling Guide, which is included in the documentation for this release. CSLU: National Cellular Telephone Speech Release 2.3 contains orthographic and phonetic transcriptions of the corresponding speech files. Non-time-aligned orthographic transcriptions provide quick access to the content of an utterance; they may contain markers for word boundaries to support access and retrieval at the lexical level. Phonetic/phonemic transcriptions represent the phonetic content of an utterance at a given level of detail that is made explicit by the use of diacritics. The phonetic phenomena transcribed include excessive nasalization, glottalization, frication on a stop, centralization, lateralization, rounding and palatalization.
*Samples* For an example of the data in this corpus, please listen to the following audio samples:
* speaker 1
* speaker 2
Extent:Corpus size: 3565158 KB
Format:Sampling Rate: 8000 Hz
Sampling Format: ulaw
Identifier:LDC2008S02
https://catalog.ldc.upenn.edu/LDC2008S02
ISBN: 1-58563-467-0
ISLRN: 571-537-588-741-2
DOI: 10.35111/wj80-ka72
Language:English
Language (ISO639):eng
License:CSLU Agreement: https://catalog.ldc.upenn.edu/license/cslu-corpora-non-commercial-research-only.pdf
Medium:Distribution: Web Download
Publisher:Linguistic Data Consortium
Publisher (URI):https://www.ldc.upenn.edu
Relation (URI):https://catalog.ldc.upenn.edu/docs/LDC2008S02
Rights Holder:Portions © 2000, 2002 Center for Spoken Language Understanding, Oregon Health & Science University, © 2008 Trustees of the University of Pennsylvania
Type (DCMI):Sound
Type (OLAC):primary_text
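
The Recording Details in the Description say the audio was originally stored as 8-bit mu-law at 8 kHz and is distributed as 16-bit linear WAV (RIFF). The sketch below illustrates how those two encodings relate, using the standard G.711 mu-law expansion and Python's standard library. It is not the conversion tool CSLU or LDC used; the function names `ulaw_byte_to_pcm16` and `ulaw_to_wav_bytes` are hypothetical, and mono audio is an assumption.

```python
import io
import struct
import wave


def ulaw_byte_to_pcm16(code: int) -> int:
    """Expand one 8-bit G.711 mu-law code to a signed 16-bit linear sample."""
    u = ~code & 0xFF              # mu-law codes are stored bit-inverted
    sign = u & 0x80               # top bit: 1 means negative
    exponent = (u >> 4) & 0x07    # 3-bit segment number
    mantissa = u & 0x0F           # 4-bit step within the segment
    magnitude = (((mantissa << 3) + 0x84) << exponent) - 0x84
    return -magnitude if sign else magnitude


def ulaw_to_wav_bytes(ulaw: bytes, sample_rate: int = 8000) -> bytes:
    """Wrap decoded samples in a mono 16-bit PCM WAV container (assumed mono)."""
    samples = [ulaw_byte_to_pcm16(b) for b in ulaw]
    pcm = struct.pack("<%dh" % len(samples), *samples)  # little-endian int16
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)          # 2 bytes = 16 bits per sample
        w.setframerate(sample_rate)
        w.writeframes(pcm)
    return buf.getvalue()


if __name__ == "__main__":
    # 0xFF encodes silence; 0x00 and 0x80 are the extreme negative/positive codes.
    wav = ulaw_to_wav_bytes(bytes([0xFF, 0x00, 0x80]))
    print(len(wav), "bytes of WAV data")
```

Because the distributed files are already 16-bit linear, a conversion like this would only be needed when working with raw mu-law data, for example to compare against the originally collected format.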

OLAC Info

Archive:  The LDC Corpus Catalog
Description:  http://www.language-archives.org/archive/www.ldc.upenn.edu
GetRecord:  OAI-PMH request for OLAC format
GetRecord:  Pre-generated XML file

OAI Info

OaiIdentifier:  oai:www.ldc.upenn.edu:LDC2008S02
DateStamp:  2022-01-20
GetRecord:  OAI-PMH request for simple DC format
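
The GetRecord entries above refer to OAI-PMH requests for this record. As a rough sketch, such a request is an HTTP GET with the `verb`, `identifier`, and `metadataPrefix` parameters defined by the OAI-PMH protocol. The base URL below is a placeholder, not the real endpoint, and the prefixes `olac` and `oai_dc` are assumed to correspond to the "OLAC format" and "simple DC format" links; `build_getrecord_url` is a hypothetical helper.

```python
from urllib.parse import urlencode

# Placeholder endpoint: substitute the repository's actual OAI-PMH base URL.
BASE_URL = "https://example.org/oai"


def build_getrecord_url(base_url: str, identifier: str, metadata_prefix: str) -> str:
    """Build an OAI-PMH GetRecord request URL for one record."""
    query = urlencode(
        {
            "verb": "GetRecord",
            "identifier": identifier,
            "metadataPrefix": metadata_prefix,  # assumed: "olac" or "oai_dc"
        }
    )
    return f"{base_url}?{query}"


if __name__ == "__main__":
    record_id = "oai:www.ldc.upenn.edu:LDC2008S02"
    print(build_getrecord_url(BASE_URL, record_id, "olac"))
    print(build_getrecord_url(BASE_URL, record_id, "oai_dc"))
```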

Search Info

Citation: Cole, Ronald Allan; Noel, M; Lander, T.; Durham, T. 2008. Linguistic Data Consortium.
Terms: area_Europe country_GB dcmi_Sound iso639_eng olac_primary_text


http://www.language-archives.org/item.php/oai:www.ldc.upenn.edu:LDC2008S02
Up-to-date as of: Mon Mar 25 7:20:17 EDT 2024