OLAC Record: RM Isolated and Spelled Word Data

OLAC Record
oai:www.ldc.upenn.edu:LDC96S39

Metadata

Title: RM Isolated and Spelled Word Data

Access Rights: Licensing Instructions for Subscription & Standard Members, and Non-Members: http://www.ldc.upenn.edu/language-resources/data/obtaining

Bibliographic Citation: NIST Multimodal Information Group. RM Isolated and Spelled Word Data LDC96S39. Web Download. Philadelphia: Linguistic Data Consortium, 1996

Contributor: NIST Multimodal Information Group

Date (W3CDTF): 1996

Description: *Introduction* This release contains previously unreleased isolated-word and spell-mode (spelled-out words) speech data from the (D)ARPA Resource Management (RM1) Corpus. The data is based on a 600-word subset of the 991-word RM1 vocabulary and contains spoken and spelled words pertaining to the RM1 naval resource management task. It was collected at the same time as, and as part of, the RM1 Continuous Speech Corpus (NIST Speech Discs 2-1 through 2-4) and contains speech from the same sets of subjects used in RM1.

*Data* The speech data has been segmented into separate spelled-word and spoken-word waveform files for each subject-word utterance. Time-aligned word and phonetic transcriptions have been generated automatically using forced recognition and are included. The time-aligned transcriptions employ the same format and phone set as the TIMIT Acoustic-Phonetic Continuous Speech Corpus (NIST Speech Disc 1-1). See the TIMIT CD-ROM companion booklet, NISTIR 4930, pp. 29-31, for a description of the phone set. As with the continuous speech portion of RM1, this data is divided into speaker-independent and speaker-dependent partitions. These data sets are further partitioned into training, development-test and evaluation-test subsets. See the "readme.doc" file in the top-level directory for more information about the data. Texas Instruments recruited the subjects and collected the speech. The National Institute of Standards and Technology (NIST) segmented the waveforms, generated the time-aligned transcriptions and produced this release.

*Updates* RM Isolated and Spelled Word Data is no longer available as catalog number LDC96S39. It has been incorporated into Resource Management RM1 2.0 and is currently available in both Resource Management RM1 2.0 (LDC93S3B) and Resource Management Complete Set 2.0 (LDC93S3A).

Format: Sampling Rate: 16000

Format: Sampling Format: 1-channel pcm
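The description states that the time-aligned transcriptions use the TIMIT format, and the Format field gives a 16000 Hz sampling rate. A minimal sketch of reading such a transcription follows, assuming the usual TIMIT layout of one segment per line as `start_sample end_sample label`. The function name, the `Segment` type, and the sample lines are hypothetical and are not taken from the corpus; check the layout against the release's "readme.doc" before relying on it.

```python
from dataclasses import dataclass

SAMPLE_RATE_HZ = 16000  # from the Format field of this record


@dataclass
class Segment:
    start_s: float  # segment start, in seconds
    end_s: float    # segment end, in seconds
    label: str      # word or phone label


def parse_timit_alignment(text, sample_rate=SAMPLE_RATE_HZ):
    """Parse TIMIT-style lines ("start_sample end_sample label") into Segments.

    Assumes sample offsets are integers; lines with fewer than three
    fields are skipped.
    """
    segments = []
    for line in text.splitlines():
        parts = line.split(maxsplit=2)
        if len(parts) < 3:
            continue
        start, end, label = parts
        segments.append(
            Segment(int(start) / sample_rate, int(end) / sample_rate, label.strip())
        )
    return segments


# Made-up example input, for illustration only.
example = "0 2400 h#\n2400 4000 sh\n4000 8000 ih\n"
for seg in parse_timit_alignment(example):
    print(f"{seg.start_s:.3f} {seg.end_s:.3f} {seg.label}")
```

Dividing by the sampling rate converts sample offsets to seconds, which is convenient when comparing alignments against audio loaded at a different rate.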

Identifier: LDC96S39

https://catalog.ldc.upenn.edu/LDC96S39

ISBN: 1-58563-106-X

ISLRN: 819-670-687-754-1

DOI: 10.35111/0bnw-zb85

Language: English

Language (ISO639): eng

License: LDC User Agreement for Non-Members: https://catalog.ldc.upenn.edu/license/ldc-non-members-agreement.pdf

Medium: Distribution: Web Download

Publisher: Linguistic Data Consortium

Publisher (URI): https://www.ldc.upenn.edu

Relation (URI): https://catalog.ldc.upenn.edu/docs/LDC96S39

Type (DCMI): Sound

Type (OLAC): primary_text

OLAC Info

Archive: The LDC Corpus Catalog

Description: http://www.language-archives.org/archive/www.ldc.upenn.edu

GetRecord: OAI-PMH request for OLAC format

GetRecord: Pre-generated XML file

OAI Info

OaiIdentifier: oai:www.ldc.upenn.edu:LDC96S39

DateStamp: 2020-11-30

GetRecord: OAI-PMH request for simple DC format
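The GetRecord entries above refer to OAI-PMH requests. As a rough sketch, such a request is an HTTP GET carrying the `verb`, `identifier`, and `metadataPrefix` parameters defined by OAI-PMH. The base URL below is a placeholder and not the archive's actual endpoint, the helper name is hypothetical, and `olac` as the prefix for the OLAC format is an assumption (`oai_dc` is the standard prefix for simple Dublin Core).

```python
from urllib.parse import urlencode

# Placeholder endpoint: substitute the repository's real OAI-PMH base URL.
BASE_URL = "https://example.org/oai"

OAI_IDENTIFIER = "oai:www.ldc.upenn.edu:LDC96S39"  # OaiIdentifier of this record


def get_record_url(base_url, identifier, metadata_prefix):
    """Build an OAI-PMH GetRecord request URL for one record."""
    query = urlencode(
        {
            "verb": "GetRecord",
            "identifier": identifier,
            "metadataPrefix": metadata_prefix,
        }
    )
    return f"{base_url}?{query}"


# Simple Dublin Core, then the OLAC format (prefix assumed to be "olac").
print(get_record_url(BASE_URL, OAI_IDENTIFIER, "oai_dc"))
print(get_record_url(BASE_URL, OAI_IDENTIFIER, "olac"))
```

`urlencode` percent-encodes the colons in the identifier, so the resulting URL can be passed directly to an HTTP client.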

Search Info
Citation: NIST Multimodal Information Group. 1996. Linguistic Data Consortium.
Terms: area_Europe country_GB dcmi_Sound iso639_eng olac_primary_text

http://www.language-archives.org/item.php/oai:www.ldc.upenn.edu:LDC96S39
Up-to-date as of: Wed Oct 29 7:00:37 EDT 2025
