OLAC Record
oai:www.ldc.upenn.edu:LDC2009S01

Metadata
Title:CSLU: Numbers Version 1.3
Access Rights:Licensing Instructions for Subscription & Standard Members, and Non-Members: http://www.ldc.upenn.edu/language-resources/data/obtaining
Bibliographic Citation:Cole, Ronald Allan, et al. CSLU: Numbers Version 1.3 LDC2009S01. Web Download. Philadelphia: Linguistic Data Consortium, 2009
Contributor:Cole, Ronald Allan
Noel, M
Lander, T.
Durham, T
Date (W3CDTF):2009
Date Issued (W3CDTF):2009-01-16
Description:*Introduction:* CSLU: Numbers Version 1.3, Linguistic Data Consortium (LDC) catalog number LDC2009S01 and ISBN 1-58563-501-4, was created by the Center for Spoken Language Understanding (CSLU) at OGI School of Science and Engineering, Oregon Health and Science University, Beaverton, Oregon. It is a collection of naturally produced numbers taken from utterances in various CSLU telephone speech data collections. The corpus consists of approximately fifteen hours of speech and includes isolated digit strings, continuous digit strings, and ordinal/cardinal numbers. The numbers come from several sources, among them phone numbers, street addresses, and zip codes, and were uttered by 12618 speakers in a total of 23902 files. In most of CSLU's telephone data collections, callers were asked for their phone number, birthdate, or zip code. Callers also occasionally left numbers in the midst of another utterance; those numbers were extracted from the host utterance and added to the corpus. Additional information about this publication is available from the corpus web page at CSLU.
*Data:* The speech data was collected over analog and digital telephone lines. The analog data was recorded using a Gradient Technologies analog-to-digital conversion box; those files were recorded at 16-bit, 8 kHz and stored in a linear format. The digital data was recorded with the CSLU T1 digital data collection system; those files were sampled at 8 kHz, 8-bit and stored as μ-law (ulaw) files. All of the data in this release has been linearly encoded at 16 bits in the RIFF standard file format. Each file includes an orthographic transcription following the CSLU Labeling guidelines, which are included in the documentation for this publication. Many of the utterances have also been phonetically labeled.
*Statistics:* CSLU: Numbers Version 1.3 consists of approximately fifteen hours of speech. The following table gives a count of the number of files for each utterance type.
Type      Number
phone     2970
street    7079
zipcode   7076
other     6771
*Samples:* Sample audio files (wav) and labels are provided for the following spoken sequences:
* Street Address: one sixteen
* Zipcode: one oh three one four
Extent:Corpus size: 913303 KB
Format:Sampling Rate: 8000 Hz
Sampling Format: Signed 16-bit PCM, 1 channel
Identifier:LDC2009S01
https://catalog.ldc.upenn.edu/LDC2009S01
ISBN: 1-58563-501-4
ISLRN: 144-817-035-468-1
DOI: 10.35111/88h0-wp09
Language:English
Language (ISO639):eng
License:CSLU Agreement: https://catalog.ldc.upenn.edu/license/cslu-corpora-non-commercial-research-only.pdf
Medium:Distribution: Web Download
Publisher:Linguistic Data Consortium
Publisher (URI):https://www.ldc.upenn.edu
Relation (URI):https://catalog.ldc.upenn.edu/docs/LDC2009S01
Rights Holder: Portions © 1998, 2000, 2002 Center for Spoken Language Understanding, Oregon Health & Science University, © 2009 Trustees of the University of Pennsylvania
Type (DCMI):Sound
Type (OLAC):primary_text
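
The Format field and the Data notes above describe the release as 8000 Hz, signed 16-bit linear PCM, single channel, in RIFF files. As a minimal sketch, assuming the audio files can be read as standard RIFF WAV, a file could be checked against those values with Python's standard `wave` module. The names `describe_wav`, `matches_record`, `make_silence`, and `EXPECTED` are hypothetical helpers introduced for illustration; they are not part of the corpus or its documentation.

```python
import io
import wave

# Values taken from the Format field of this record (assumed to apply to every file).
EXPECTED = {"channels": 1, "sample_width_bytes": 2, "frame_rate_hz": 8000}


def describe_wav(source):
    """Read the header of a RIFF WAV file (path or binary file object)."""
    with wave.open(source, "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_width_bytes": wav.getsampwidth(),
            "frame_rate_hz": rate,
            "duration_s": frames / rate,
        }


def matches_record(info):
    """True if the header values agree with the catalog record."""
    return all(info[key] == value for key, value in EXPECTED.items())


def make_silence(seconds=1.0):
    """Build an in-memory WAV of silence in the expected format (stand-in for a corpus file)."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)      # 2 bytes = 16-bit samples
        wav.setframerate(8000)
        wav.writeframes(b"\x00\x00" * int(8000 * seconds))
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    info = describe_wav(make_silence(0.5))
    print(info, matches_record(info))
```

In practice `describe_wav` would be given the path of a downloaded corpus file instead of the synthetic buffer.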

OLAC Info

Archive:  The LDC Corpus Catalog
Description:  http://www.language-archives.org/archive/www.ldc.upenn.edu
GetRecord:  OAI-PMH request for OLAC format
GetRecord:  Pre-generated XML file

OAI Info

OaiIdentifier:  oai:www.ldc.upenn.edu:LDC2009S01
DateStamp:  2022-01-20
GetRecord:  OAI-PMH request for simple DC format
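
The GetRecord entries above refer to OAI-PMH requests for this record. A minimal sketch of how such a request URL is assembled follows. The `verb`, `identifier`, and `metadataPrefix` parameters come from the OAI-PMH protocol, and `oai_dc` is its standard prefix for simple Dublin Core. The base URL `https://example.org/oai` is a placeholder, not the archive's real endpoint, and `get_record_url` is a hypothetical helper name.

```python
from urllib.parse import urlencode


def get_record_url(base_url, identifier, metadata_prefix="oai_dc"):
    """Build an OAI-PMH GetRecord request URL for one record."""
    query = urlencode(
        {
            "verb": "GetRecord",
            "identifier": identifier,
            "metadataPrefix": metadata_prefix,
        }
    )
    return f"{base_url}?{query}"


if __name__ == "__main__":
    # Placeholder endpoint; substitute the repository's actual OAI-PMH base URL.
    print(get_record_url("https://example.org/oai", "oai:www.ldc.upenn.edu:LDC2009S01"))
```

A request for the OLAC format would pass a different `metadata_prefix`; the exact prefix string depends on what the repository advertises through the protocol's `ListMetadataFormats` verb.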

Search Info

Citation: Cole, Ronald Allan; Noel, M; Lander, T.; Durham, T. 2009. CSLU: Numbers Version 1.3. Linguistic Data Consortium.
Terms: area_Europe country_GB dcmi_Sound iso639_eng olac_primary_text


http://www.language-archives.org/item.php/oai:www.ldc.upenn.edu:LDC2009S01
Up-to-date as of: Mon Mar 25 7:20:20 EDT 2024