OLAC Record

Title:RATS Keyword Spotting
Access Rights:Licensing Instructions for Subscription & Standard Members, and Non-Members: http://www.ldc.upenn.edu/language-resources/data/obtaining
Bibliographic Citation:Graff, David, et al. RATS Keyword Spotting LDC2017S20. Web Download. Philadelphia: Linguistic Data Consortium, 2017
Contributor:Graff, David
Ma, Xiaoyi
Strassel, Stephanie
Walker, Kevin
Jones, Karen
Date (W3CDTF):2017
Date Issued (W3CDTF):2017-10-18
Description:*Introduction* RATS Keyword Spotting was developed by the Linguistic Data Consortium (LDC) and comprises approximately 400 hours of Levantine Arabic and Farsi conversational telephone speech, with automatic and manual annotation of speech segments, transcripts, and keywords generated from transcript content. The audio was retransmitted over eight channels, yielding 3,100 hours of audio in total. The corpus was created to provide training, development and initial test sets for the keyword spotting (KWS) task in the DARPA RATS (Robust Automatic Transcription of Speech) program.
The goal of the RATS program was to develop human language technology systems capable of performing speech detection, language identification, speaker identification and keyword spotting on the severely degraded audio signals that are typical of radio communication channels, especially those employing handheld portable transceiver systems. To support that goal, LDC assembled a system for the transmission, reception and digital capture of audio data that allowed a single source audio signal to be distributed and recorded over eight distinct transceiver configurations simultaneously. Those configurations covered three frequency bands (high, very high and ultra high) variously combined with amplitude modulation, frequency hopping spread spectrum, narrow-band frequency modulation, single-sideband or wide-band frequency modulation. Annotations on the clear source audio signal, e.g., time boundaries for the duration of speech activity, were projected onto the corresponding eight channels recorded from the radio receivers.
*Data* The source audio consists of conversational telephone speech recordings collected by LDC: (1) data collected for the RATS program from Levantine Arabic and Farsi speakers; and (2) material from Levantine Arabic QT Training Data Set 5, Speech (LDC2006S29) and CALLFRIEND Farsi Second Edition Speech (LDC2014S01).
Annotation was performed in two steps. Transcripts of calls were either produced or already available from the source corpora. For the CALLFRIEND Farsi calls, transcripts were updated by native Farsi speakers. Potential target keywords were selected from the transcripts on the basis of overall word frequencies, so as to fall within a given range of target-word likelihood per hour of speech. The selected words were then reviewed by native speakers to confirm that each selection was a regular word or multi-word expression of more than three syllables.
All audio files are presented as single-channel, 16-bit PCM at 16000 samples per second. Lossless FLAC compression is used on all files; when uncompressed, the files have typical "MS-WAV" (RIFF) file headers. The data is divided into training, initial development and initial evaluation sets (note that the initial evaluation used only Levantine Arabic data).
*Samples* An audio sample and an annotation sample are available on the LDC catalog page for this corpus.
*Updates* None at this time.
*Acknowledgment* This material is based upon work supported by the Defense Advanced Research Projects Agency (DARPA) under Contract No. D10PC20016. The content does not necessarily reflect the position or the policy of the Government, and no official endorsement should be inferred.
Extent:Corpus size: 175528280 KB
Format:Sampling Rate: 16000
Sampling Format: pcm
ISBN: 1-58563-817-X
ISLRN: 834-222-629-362-9
DOI: 10.35111/wsz3-nh04
Language:South Levantine Arabic
North Levantine Arabic
Language (ISO639):ajp
License:LDC User Agreement for Non-Members: https://catalog.ldc.upenn.edu/license/ldc-non-members-agreement.pdf
Medium:Distribution: Web Download
Publisher:Linguistic Data Consortium
Publisher (URI):https://www.ldc.upenn.edu
Relation (URI):https://catalog.ldc.upenn.edu/docs/LDC2017S20
Rights Holder:Portions © 1995-1996, 2003-2006, 2014, 2017 Trustees of the University of Pennsylvania
Type (DCMI):Sound
Type (OLAC):primary_text
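
The audio parameters in this record (single-channel, 16-bit PCM, 16000 samples per second) are enough to estimate how much storage the decompressed audio would need. The following is a minimal sketch of that arithmetic, not part of the LDC documentation; the function name `uncompressed_pcm_bytes` is hypothetical, and the calculation ignores RIFF header overhead and any non-audio files in the release.

```python
# Audio parameters as stated in this record's Description and Format fields.
SAMPLE_RATE_HZ = 16000   # samples per second
BITS_PER_SAMPLE = 16     # 16-bit PCM
CHANNELS = 1             # single-channel


def uncompressed_pcm_bytes(hours: float) -> int:
    """Raw PCM payload size in bytes for the given duration.

    Assumption: header bytes are ignored, so this is a lower bound
    on the size of the decompressed WAV files.
    """
    bytes_per_second = SAMPLE_RATE_HZ * (BITS_PER_SAMPLE // 8) * CHANNELS
    return round(hours * 3600 * bytes_per_second)


if __name__ == "__main__":
    total_hours = 3100  # total audio stated in the Description
    raw = uncompressed_pcm_bytes(total_hours)
    print(f"{raw / 1e9:.1f} GB uncompressed for {total_hours} hours")
```

The Extent field reports 175528280 KB, roughly half of this uncompressed estimate, which is plausible for FLAC-compressed speech. That comparison is only indicative: the record does not say whether KB means 1000 or 1024 bytes, and the extent presumably also covers annotations and documentation.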


Archive:  The LDC Corpus Catalog
Description:  http://www.language-archives.org/archive/www.ldc.upenn.edu
GetRecord:  OAI-PMH request for OLAC format
GetRecord:  Pre-generated XML file

OAI Info

OaiIdentifier:  oai:www.ldc.upenn.edu:LDC2017S20
DateStamp:  2021-10-15
GetRecord:  OAI-PMH request for simple DC format
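
The GetRecord entries above refer to OAI-PMH requests for this record. As a hedged illustration, the sketch below builds such a request URL from the OaiIdentifier shown here, using the standard OAI-PMH `verb`, `identifier` and `metadataPrefix` arguments. The base URL `https://example.org/oai` is a placeholder assumption, since the actual endpoint is not given in this record, and `get_record_url` is a hypothetical helper name.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Placeholder assumption: the real OAI-PMH endpoint is not stated in this record.
BASE_URL = "https://example.org/oai"


def get_record_url(base_url: str, identifier: str, metadata_prefix: str) -> str:
    """Build an OAI-PMH GetRecord request URL.

    metadata_prefix is "oai_dc" for simple Dublin Core; "olac" is the
    prefix conventionally used for the OLAC format.
    """
    query = urlencode(
        {
            "verb": "GetRecord",
            "identifier": identifier,
            "metadataPrefix": metadata_prefix,
        }
    )
    return f"{base_url}?{query}"


if __name__ == "__main__":
    url = get_record_url(BASE_URL, "oai:www.ldc.upenn.edu:LDC2017S20", "oai_dc")
    print(url)
    # Round-trip check: the query string decodes back to the original arguments.
    print(parse_qs(urlsplit(url).query))
```

Swapping `"oai_dc"` for `"olac"` would correspond to the OLAC-format request listed earlier in the record, assuming the repository exposes that prefix.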

Search Info

Citation: Graff, David; Ma, Xiaoyi; Strassel, Stephanie; Walker, Kevin; Jones, Karen. 2017. Linguistic Data Consortium.
Terms: area_Asia country_JO country_SY dcmi_Sound dcmi_Text iso639_ajp iso639_apc iso639_fas olac_primary_text

Up-to-date as of: Sun Jun 16 7:34:46 EDT 2024