OLAC Record: CSLU: Voices

OLAC Record
oai:www.ldc.upenn.edu:LDC2006S01

Metadata

Title: CSLU: Voices

Access Rights: Licensing Instructions for Subscription & Standard Members, and Non-Members: http://www.ldc.upenn.edu/language-resources/data/obtaining

Bibliographic Citation: Kain, Alexander. CSLU: Voices LDC2006S01. Web Download. Philadelphia: Linguistic Data Consortium, 2006

Contributor: Kain, Alexander

Date (W3CDTF): 2006

Date Issued (W3CDTF): 2006-01-19

Description:

*Introduction*

CSLU: Voices was developed by Alexander Kain and consists of approximately two hours of read speech in English, with associated transcripts, laryngograph signals, pitch marks, and phonetic labels. The corpus was created for Kain's Ph.D. dissertation work on high-resolution voice transformation (VT) and contains 12 speakers reading 50 phonetically rich sentences. VT is a technology that modifies a source speaker's utterance to sound as if a target speaker had spoken it.

The purpose of this corpus is to aid VT research and development by providing naturally time-aligned sentences. Because the sentences are already time-aligned, removing individual prosodic characteristics, such as fundamental pitch and durations, requires very little processing and results in high-quality speech samples that differ only in their segmental properties, which are the focus of transformation. These "prosody-normalized" speech samples are used for training VT systems, as well as for evaluating their transformation performance objectively and subjectively.

*Data*

The recording procedure involved a "mimicking" approach, which resulted in a high degree of natural time-alignment between different speakers. The acoustic wave and the concurrent laryngograph signal were recorded for one "free" and two "mimicked" renditions of each sentence. Laryngograph signals, pitch marks calculated from the laryngograph, and time marks from a forced-alignment algorithm have been added to the corpus. The corpus includes seven male speakers and five female speakers.

*Samples*

For an example of the data contained in this publication, please review the following samples.

* Concurrent laryngograph (LAR)
* Pitch marks derived from laryngograph signal (PMV)
* Transcription (TXT)
* Wave file of speech (WAV)

*Updates*

None at this time.

Extent: Corpus size: 715776 KB

Format: Sampling Rate: 22050 Hz

Format: Sampling Format: pcm

Identifier: LDC2006S01

Identifier: https://catalog.ldc.upenn.edu/LDC2006S01

Identifier: ISBN: 1-58563-363-1

Identifier: ISLRN: 960-768-408-027-3

Identifier: DOI: 10.35111/7vr2-b249

Language: English

Language (ISO639): eng

License: CSLU Agreement: https://catalog.ldc.upenn.edu/license/cslu-corpora-non-commercial-research-only.pdf

Medium: Distribution: Web Download

Publisher: Linguistic Data Consortium

Publisher (URI): https://www.ldc.upenn.edu

Relation (URI): https://catalog.ldc.upenn.edu/docs/LDC2006S01

Rights Holder: Portions © 2002 Center for Spoken Language Understanding, Oregon Health & Science University, © 2006 Trustees of the University of Pennsylvania

Type (DCMI): Sound

Type (DCMI): Text

Type (OLAC): primary_text

OLAC Info

Archive: The LDC Corpus Catalog

Description: http://www.language-archives.org/archive/www.ldc.upenn.edu

GetRecord: OAI-PMH request for OLAC format

GetRecord: Pre-generated XML file

OAI Info

OaiIdentifier: oai:www.ldc.upenn.edu:LDC2006S01

DateStamp: 2021-07-16

GetRecord: OAI-PMH request for simple DC format
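
The GetRecord entries in this record refer to OAI-PMH requests for the OLAC and simple DC formats. As a minimal sketch, such a request URL could be assembled as follows. The `verb`, `identifier`, and `metadataPrefix` parameters and the `oai_dc` prefix come from the OAI-PMH protocol; the `olac` prefix is the one conventionally used for OLAC metadata. The endpoint `https://example.org/oai` and the helper name `build_get_record_url` are hypothetical, since this record does not state the base URL of the repository.

```python
from urllib.parse import urlencode

# Metadata prefixes for the two formats named in this record.
# "oai_dc" is the prefix OAI-PMH defines for simple Dublin Core;
# "olac" is assumed here as the prefix for OLAC metadata.
METADATA_PREFIXES = {"olac": "olac", "simple_dc": "oai_dc"}


def build_get_record_url(base_url: str, identifier: str, fmt: str = "olac") -> str:
    """Return an OAI-PMH GetRecord request URL for a single record."""
    query = urlencode({
        "verb": "GetRecord",
        "identifier": identifier,
        "metadataPrefix": METADATA_PREFIXES[fmt],
    })
    return f"{base_url}?{query}"


# Hypothetical endpoint, for illustration only; substitute the real
# OAI-PMH base URL of the repository being queried.
BASE_URL = "https://example.org/oai"

url = build_get_record_url(BASE_URL, "oai:www.ldc.upenn.edu:LDC2006S01", "simple_dc")
print(url)
```

The function only builds the URL and performs no network access. Note that `urlencode` percent-encodes the colons in the OAI identifier, which is the expected form for a query-string value.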

Search Info
Citation: Kain, Alexander. 2006. Linguistic Data Consortium.
Terms: area_Europe country_GB dcmi_Sound dcmi_Text iso639_eng olac_primary_text

http://www.language-archives.org/item.php/oai:www.ldc.upenn.edu:LDC2006S01
Up-to-date as of: Wed Oct 29 7:00:54 EDT 2025
