OLAC Record
oai:www.ldc.upenn.edu:LDC2007S05

Metadata
Title:CSLU: Yes/No Version 1.2
Access Rights:Licensing Instructions for Subscription & Standard Members, and Non-Members: http://www.ldc.upenn.edu/language-resources/data/obtaining
Bibliographic Citation:Noel, Mike. CSLU: Yes/No Version 1.2 LDC2007S05. Web Download. Philadelphia: Linguistic Data Consortium, 2007
Contributor:Noel, Mike
Date (W3CDTF):2007
Date Issued (W3CDTF):2007-07-17
Description:*Introduction* This file contains documentation for CSLU: Yes/No Version 1.2, Linguistic Data Consortium (LDC) catalog number LDC2007S05 and ISBN 1-58563-445-X. CSLU: Yes/No Version 1.2 is a collection of answers to yes/no questions from various telephone speech corpora created by the Center for Spoken Language Understanding, Oregon Health and Science University (CSLU). The corpus contains approximately 20,000 examples of roughly 18,000 speakers saying "yes" or "no" in response to various questions. Each speech file in the corpus has a corresponding orthographic transcription following the CSLU Labeling Conventions. In cases where a transcription did not already exist, the utterance was run through a speech recognizer to obtain the transcription automatically. The data were collected from both analog and digital phone lines. The analog data were recorded using a Gradient Technologies analog-to-digital conversion box. These files were recorded at 8 kHz, 16-bit, and stored in a linear format. The digital data were recorded with the CSLU T1 digital data collection system. These files were sampled at 8 kHz, 8-bit, and stored as µ-law (ulaw) files. All of the data use the RIFF standard file format. This file format is 16-bit linearly encoded. *Samples* For a sample of the audio in this corpus, please listen to this sample.
Extent:Corpus size: 591872 KB
Format:Sampling Rate: 8000
Sampling Format: pcm
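
The stated format (RIFF container, 8000 Hz sampling rate, 16-bit linear PCM) can be checked against a local file. The following is a minimal sketch using Python's standard `wave` module, which reads only uncompressed PCM. The helper names `describe_wav` and `matches_catalog_format` are hypothetical and are not part of the corpus distribution.

```python
import wave


def describe_wav(path):
    """Return basic header fields of an uncompressed PCM RIFF/WAVE file."""
    with wave.open(path, "rb") as wav:
        return {
            "channels": wav.getnchannels(),
            "sample_rate_hz": wav.getframerate(),
            "bits_per_sample": wav.getsampwidth() * 8,
            "duration_s": wav.getnframes() / wav.getframerate(),
        }


def matches_catalog_format(info):
    """Check the header against the values listed in this record:
    8000 Hz sampling rate and 16-bit linear samples."""
    return info["sample_rate_hz"] == 8000 and info["bits_per_sample"] == 16
```

Calling `matches_catalog_format(describe_wav(path))` on a speech file should return `True` if the file follows the description above. A file that `wave` cannot open is not uncompressed PCM.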
Identifier:LDC2007S05
https://catalog.ldc.upenn.edu/LDC2007S05
ISBN: 1-58563-445-X
ISLRN: 910-955-859-747-8
DOI: 10.35111/18ns-0a21
Language:English
Language (ISO639):eng
License:CSLU Agreement: https://catalog.ldc.upenn.edu/license/cslu-corpora-non-commercial-research-only.pdf
Medium:Distribution: Web Download
Publisher:Linguistic Data Consortium
Publisher (URI):https://www.ldc.upenn.edu
Relation (URI):https://catalog.ldc.upenn.edu/docs/LDC2007S05
Rights Holder:Portions © 1996, 1998, 2000, 2002 Center for Spoken Language Understanding, Oregon Health and Science University, © 2007 Trustees of the University of Pennsylvania
Subject (OLAC):computational_linguistics
Type (DCMI):Sound
Type (Discourse):formulaic
Type (OLAC):primary_text

OLAC Info

Archive:  The LDC Corpus Catalog
Description:  http://www.language-archives.org/archive/www.ldc.upenn.edu
GetRecord:  OAI-PMH request for OLAC format
GetRecord:  Pre-generated XML file

OAI Info

OaiIdentifier:  oai:www.ldc.upenn.edu:LDC2007S05
DateStamp:  2020-11-30
GetRecord:  OAI-PMH request for simple DC format
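
The GetRecord entries in this record refer to OAI-PMH requests. As a rough sketch, such a request is a URL with the `verb`, `identifier` and `metadataPrefix` query parameters defined by the OAI-PMH protocol. The base URL below is a placeholder assumption, not the archive's actual endpoint. The prefix `oai_dc` is the protocol's standard name for simple Dublin Core, and `olac` is assumed here to be the prefix for the OLAC format.

```python
from urllib.parse import urlencode

# Placeholder endpoint: replace with the archive's real OAI-PMH base URL.
BASE_URL = "https://example.org/oai"


def get_record_url(identifier, metadata_prefix, base_url=BASE_URL):
    """Build an OAI-PMH GetRecord request URL for one record."""
    query = urlencode({
        "verb": "GetRecord",
        "identifier": identifier,
        "metadataPrefix": metadata_prefix,
    })
    return f"{base_url}?{query}"


# Simple DC and (assumed) OLAC requests for this record's identifier.
dc_url = get_record_url("oai:www.ldc.upenn.edu:LDC2007S05", "oai_dc")
olac_url = get_record_url("oai:www.ldc.upenn.edu:LDC2007S05", "olac")
```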

Search Info

Citation: Noel, Mike. 2007. Linguistic Data Consortium.
Terms: area_Europe country_GB dcmi_Sound iso639_eng olac_computational_linguistics olac_formulaic olac_primary_text


http://www.language-archives.org/item.php/oai:www.ldc.upenn.edu:LDC2007S05
Up-to-date as of: Mon Mar 25 7:20:07 EDT 2024