OLAC Record: Mixer 6 Speech

OLAC Record
oai:www.ldc.upenn.edu:LDC2013S03

Metadata

Title: Mixer 6 Speech

Access Rights: Licensing Instructions for Subscription & Standard Members, and Non-Members: http://www.ldc.upenn.edu/language-resources/data/obtaining

Bibliographic Citation: Brandschain, Linda, et al. Mixer 6 Speech LDC2013S03. Web Download. Philadelphia: Linguistic Data Consortium, 2013

Contributor: Brandschain, Linda

Graff, David

Walker, Kevin

Cieri, Christopher

Date (W3CDTF): 2013

Date Issued (W3CDTF): 2013-08-19

Description: *Introduction* Mixer 6 Speech was developed by the Linguistic Data Consortium (LDC) and comprises 15,863 hours of audio recordings of interviews, transcript readings and conversational telephone speech involving 594 distinct native English speakers. This material was collected by LDC in 2009 and 2010 as part of the Mixer project, specifically phase 6, which focused on native American English speakers local to the Philadelphia area. The speech data in this release was collected by LDC at its Human Subjects Collection facilities in Philadelphia.
The telephone collection protocol was similar to other LDC telephone studies (e.g., Switchboard-2 Phase III Audio - LDC2002S06): recruited speakers were connected through a robot operator to carry on casual conversations lasting up to 10 minutes, usually about a daily topic announced by the robot operator at the start of the call. The raw digital audio content for each call side was captured as a separate channel, and each full conversation was presented as a 2-channel interleaved audio file, with 8000 samples/second and u-law sample encoding. Each speaker was asked to complete 15 calls.
The multi-microphone portion of the collection used 14 distinct microphones installed identically in two multi-channel audio recording rooms at LDC. Each session was guided by collection staff using prompting and recording software to conduct the following activities: (1) repeat questions (less than one minute), (2) informal conversation (typically 15 minutes), (3) transcript reading (approximately 15 minutes) and (4) telephone call (generally 10 minutes). Speakers recorded up to three 45-minute sessions on distinct days. The 14 channels were recorded synchronously into separate single-channel files, using 16-bit PCM sample encoding at 16000 samples/second.
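The 2-channel interleaved u-law layout described for the telephone recordings can be illustrated with a minimal Python sketch. It assumes the input is the raw sample payload only (the SPHERE header already removed and no additional compression applied), that the two call sides alternate byte by byte, and that the encoding follows the standard ITU-T G.711 u-law expansion. The function names are hypothetical and are not part of the corpus documentation; in practice a tool such as SoX would normally do this conversion.

```python
def ulaw_to_pcm16(code):
    """Expand one 8-bit u-law code (G.711) to a signed linear sample."""
    u = ~code & 0xFF                      # u-law bytes are stored bit-inverted
    exponent = (u >> 4) & 0x07            # 3-bit segment number
    mantissa = u & 0x0F                   # 4-bit step within the segment
    magnitude = (((mantissa << 3) + 0x84) << exponent) - 0x84
    return -magnitude if u & 0x80 else magnitude


def split_call_sides(interleaved):
    """Split interleaved 2-channel u-law bytes into two decoded call sides.

    Assumption: even byte positions belong to the first channel and odd
    positions to the second, with no header bytes present.
    """
    side_a = [ulaw_to_pcm16(b) for b in interleaved[0::2]]
    side_b = [ulaw_to_pcm16(b) for b in interleaved[1::2]]
    return side_a, side_b


if __name__ == "__main__":
    # Four payload bytes = two sample frames of a 2-channel call.
    a, b = split_call_sides(bytes([0xFF, 0x80, 0x00, 0x7F]))
    print(a, b)
```

At 8000 samples/second, each decoded list holds 8000 values per second of audio for its call side.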
Certain demographic information about the speakers was collected, including date of birth, level of education, native language, other language capability, place of birth, place of residence and occupation. The recordings in this corpus were used in NIST Speaker Recognition Evaluation (SRE) test sets for 2010. Researchers interested in applying those benchmark test sets should consult the respective NIST Evaluation Plans for guidelines on allowable training data for those tests.
*Data* The collection contains 4,410 recordings made via the public telephone network and 1,425 sessions of multiple microphone recordings in office-room settings. The telephone recordings are presented as 8-kHz 2-channel NIST SPHERE files, and the microphone recordings are 16-kHz 1-channel flac/ms-wav files. All audio file names indicate the date and time when the recording began, along with other identifying information, as follows:
Telephone: {yyyymmdd}_{hrmnsc}_{callid}.sph
Microphone: {yyyymmdd}_{hrmnsc}_{room}_{subjid}_CH{nn}.flac
* yyyymmdd is the year, month and day of recording
* hrmnsc is the hour, minute and second when recording began
* callid is a unique, incremental number assigned to each call
* room is either LDC or HRM, indicating which office was used
* subjid is a numeric identifier assigned to the speaker
When the flac files are uncompressed, they become ms-wav/RIFF files (flac compression does not presently support the SPHERE file format). The telephone audio is presented in SPHERE format because (a) this is consistent with other telephone audio releases from LDC, and (b) flac does not support u-law sample encoding. The current release of the open-source SoX utility is able to handle both formats as input. Other utilities are available for both flac and SPHERE formats.
*Samples* Please listen to this audio sample.
*Updates* None at this time.
*Additional Licensing Instructions* This 'members-only' corpus is available to current members who can request the data at the listed reduced-license fee. Contact ldc@ldc.upenn.edu for information about becoming a member.
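The file-naming scheme given under *Data* lends itself to a small parser. The Python sketch below is illustrative only: the function name `parse_audio_name` is hypothetical, the digit counts for callid and subjid are not stated in the record (so any run of digits is accepted), reading `CH{nn}` as a channel number is an assumption, and the sample file names are invented rather than taken from the corpus.

```python
import re
from datetime import datetime

# Telephone: {yyyymmdd}_{hrmnsc}_{callid}.sph
TELEPHONE = re.compile(r"^(?P<date>\d{8})_(?P<time>\d{6})_(?P<callid>\d+)\.sph$")

# Microphone: {yyyymmdd}_{hrmnsc}_{room}_{subjid}_CH{nn}.flac
MICROPHONE = re.compile(
    r"^(?P<date>\d{8})_(?P<time>\d{6})_(?P<room>LDC|HRM)"
    r"_(?P<subjid>\d+)_CH(?P<channel>\d+)\.flac$"
)


def parse_audio_name(name):
    """Return the fields encoded in a telephone or microphone file name."""
    for kind, pattern in (("telephone", TELEPHONE), ("microphone", MICROPHONE)):
        match = pattern.match(name)
        if match:
            fields = match.groupdict()
            # Combine the date and time parts into one recording start time.
            started = datetime.strptime(
                fields.pop("date") + fields.pop("time"), "%Y%m%d%H%M%S"
            )
            return {"kind": kind, "started": started, **fields}
    raise ValueError(f"unrecognized file name: {name!r}")


if __name__ == "__main__":
    # Invented example names that follow the two patterns.
    print(parse_audio_name("20100115_143000_12345.sph"))
    print(parse_audio_name("20091208_091618_HRM_120313_CH02.flac"))
```

Identifiers are kept as strings so that any leading zeros in callid, subjid or the channel suffix survive a round trip.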

Extent: Corpus size: 562968440 KB

Format: Sampling Rate: 16000

Sampling Format: 1-channel pcm

Identifier: LDC2013S03

https://catalog.ldc.upenn.edu/LDC2013S03

ISBN: 1-58563-652-5

ISLRN: 067-355-674-551-6

DOI: 10.35111/s1w9-y411

Language: English

Language (ISO639): eng

Medium: Distribution: Web Download

Publisher: Linguistic Data Consortium

Publisher (URI): https://www.ldc.upenn.edu

Relation (URI): https://catalog.ldc.upenn.edu/docs/LDC2013S03

Rights Holder: Portions © 2009-2010, 2013 Trustees of the University of Pennsylvania

Type (DCMI): Sound

Type (OLAC): primary_text

OLAC Info

Archive: The LDC Corpus Catalog

Description: http://www.language-archives.org/archive/www.ldc.upenn.edu

GetRecord: OAI-PMH request for OLAC format

GetRecord: Pre-generated XML file

OAI Info

OaiIdentifier: oai:www.ldc.upenn.edu:LDC2013S03

DateStamp: 2023-07-14

GetRecord: OAI-PMH request for simple DC format
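
The GetRecord entries above correspond to OAI-PMH requests, which take the arguments `verb`, `identifier` and `metadataPrefix`. The Python sketch below only shows how such a request URL could be assembled. The base URL is a placeholder, not the archive's real endpoint, and the function name is hypothetical; `oai_dc` is the simple Dublin Core prefix defined by OAI-PMH, while `olac` as the prefix for the OLAC format is an assumption about this repository.

```python
from urllib.parse import urlencode

# Placeholder endpoint: the real base URL of the repository is not given here.
BASE_URL = "https://example.org/oai"


def get_record_url(identifier, metadata_prefix, base_url=BASE_URL):
    """Build an OAI-PMH GetRecord request URL for one record."""
    query = urlencode(
        {
            "verb": "GetRecord",
            "identifier": identifier,
            "metadataPrefix": metadata_prefix,
        }
    )
    return f"{base_url}?{query}"


if __name__ == "__main__":
    oai_id = "oai:www.ldc.upenn.edu:LDC2013S03"
    print(get_record_url(oai_id, "oai_dc"))  # simple DC format
    print(get_record_url(oai_id, "olac"))    # OLAC format (assumed prefix)
```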

Search Info
Citation: Brandschain, Linda; Graff, David; Walker, Kevin; Cieri, Christopher. 2013. Linguistic Data Consortium.
Terms: area_Europe country_GB dcmi_Sound iso639_eng olac_primary_text

http://www.language-archives.org/item.php/oai:www.ldc.upenn.edu:LDC2013S03
Up-to-date as of: Wed Oct 29 7:01:23 EDT 2025
