OLAC Record: Digital Archive of Southern Speech

OLAC Record
oai:www.ldc.upenn.edu:LDC2012S03

Metadata

Title: Digital Archive of Southern Speech

Access Rights: Licensing Instructions for Subscription & Standard Members, and Non-Members: http://www.ldc.upenn.edu/language-resources/data/obtaining

Bibliographic Citation: Kretzschmar Jr., William A., et al. Digital Archive of Southern Speech LDC2012S03. Web Download. Philadelphia: Linguistic Data Consortium, 2012

Contributor: Kretzschmar Jr., William A.

Bounds, Paulina

Hettel, Jacqueline

Coats, Steven

Pederson, Lee

Lisa Lena Opas-Hänninen

Juuso, Ilkka

Seppänen, Tapio

Date (W3CDTF): 2012

Date Issued (W3CDTF): 2012-02-15

Description: *Introduction* Digital Archive of Southern Speech (DASS) was developed by the University of Georgia. It is a subset of the Linguistic Atlas of the Gulf States (LAGS), which is in turn part of the Linguistic Atlas Project (LAP). DASS contains approximately 370 hours of English speech data from 30 female speakers and 34 male speakers in .wav and .mp3 formats, along with associated metadata about the speakers and the recordings, and maps in .jpeg format relating to the recording locations. LAP consists of a set of survey research projects about the words and pronunciation of everyday American English, the largest project of its kind in the United States. Interviews with thousands of native speakers across the country have been carried out since 1929. LAGS surveyed the everyday speech of Georgia, Tennessee, Florida, Alabama, Mississippi, Arkansas, Louisiana, and Texas in a series of 914 audio-taped interviews conducted from 1968 to 1983. Interviews average approximately six hours in length; the systematic LAGS tape archive amounts to 5500 hours of sound recordings. DASS is a collection of 64 interviews from LAGS selected to cover a range of speech across the region and to represent multiple education levels and ethnic backgrounds. This release is distributed on an external hard drive and contains instructions for using the media and navigating to the LICHEN program. Digital Archive of Southern Speech - NLP Version (LDC2016S05), an alternate version suitable for natural language processing and human language technology applications, is also available.

*Data* The average age of the DASS speakers is 61 years; 30 women and 34 men from the Gulf States region are represented in this release. The interviews cover common topics such as family, the weather, household articles and activities, agriculture and social connections. The interviews were originally recorded in the field on reel-to-reel audio tape. A digital version of every reel of tape was then made, one .wav file per reel, usually about one hour of sound. Each interview thus consists of a set of 3 to 13 reels, or roughly 3 to 13 interview hours. Personally identifying or sensitive information in the files was replaced with a tone to protect the privacy of speakers and to assure their ethical treatment. Each .wav file is split into multiple .mp3 files based on the topic of conversation and labeled accordingly. Included spreadsheets provide information about the speakers, the labels used for topics and the sound files. Also included in this release is a version of the LICHEN software developed at the University of Oulu, Finland. LICHEN allows users to browse and search through the audio data in a more advanced fashion using a graphical interface. Further information and instructions for LICHEN can be found within the docs folder of this release.

*Updates* None at this time.

*Samples* For an example of the data contained in this corpus, review this audio sample.

*Authorship* The following people were involved with the DASS project: William A. Kretzschmar, Jr., Paulina Bounds, Jacqueline Hettel and Steven Coats (University of Georgia); Lee Pederson (Emory University); Lisa Lena Opas-Hänninen, Ilkka Juuso and Tapio Seppänen (University of Oulu, Finland).

*Sponsorship* The Atlas Data contained herein comprises information collected in the period spanning from the 1930s to 2010 and has been compiled from diverse sources, by, and under the direction of, Dr. William A. Kretzschmar, Harry and Jane Wilson Professor in Humanities at the Department of English of The University of Georgia. Compilation and digitalization of this work was funded, in part, by the US National Science Foundation and by the US National Endowment for the Humanities. Additional information about the Atlas Project can be obtained at http://www.lap.uga.edu/Home.html.

Extent: Corpus size: 206750107 KB

Format: Sampling Rate: 48000 Hz, 16000 Hz

Sampling Format: pcm, mpeg

Identifier: LDC2012S03

https://catalog.ldc.upenn.edu/LDC2012S03

ISBN: 1-58563-605-3

ISLRN: 167-450-243-260-2

DOI: 10.35111/5bnt-r659

Language: English

Language (ISO639): eng

License: Digital Archive of Southern Speech For-Profit Member Agreement: https://catalog.ldc.upenn.edu/license/dass-fp-agreement.pdf

LDC User Agreement for Non-Members: https://catalog.ldc.upenn.edu/license/ldc-non-members-agreement.pdf

Medium: Distribution: Web Download

Publisher: Linguistic Data Consortium

Publisher (URI): https://www.ldc.upenn.edu

Relation (URI): https://catalog.ldc.upenn.edu/docs/LDC2012S03

Rights Holder: Portions © 1982-2010 American Dialect Society, © 1986-2010 University of Georgia Research Foundation, © 2012 Trustees of the University of Pennsylvania

Type (DCMI): Sound

Type (OLAC): primary_text

OLAC Info

Archive: The LDC Corpus Catalog

Description: http://www.language-archives.org/archive/www.ldc.upenn.edu

GetRecord: OAI-PMH request for OLAC format

GetRecord: Pre-generated XML file

OAI Info

OaiIdentifier: oai:www.ldc.upenn.edu:LDC2012S03

DateStamp: 2020-11-30

GetRecord: OAI-PMH request for simple DC format

Search Info
Citation: Kretzschmar Jr., William A.; Bounds, Paulina; Hettel, Jacqueline; Coats, Steven; Pederson, Lee; Lisa Lena Opas-Hänninen; Juuso, Ilkka; Seppänen, Tapio. 2012. Linguistic Data Consortium.
Terms: area_Europe country_GB dcmi_Sound iso639_eng olac_primary_text

http://www.language-archives.org/item.php/oai:www.ldc.upenn.edu:LDC2012S03
Up-to-date as of: Wed Oct 29 7:01:18 EDT 2025
