OLAC Record
oai:www.ldc.upenn.edu:LDC98S74

Metadata
Title:1997 Spanish Broadcast News Speech (HUB4-NE)
Access Rights:Licensing Instructions for Subscription & Standard Members, and Non-Members: http://www.ldc.upenn.edu/language-resources/data/obtaining
Bibliographic Citation:Linguistic Data Consortium. 1997 Spanish Broadcast News Speech (HUB4-NE) LDC98S74. Web Download. Philadelphia: Linguistic Data Consortium, 1998
Contributor:Linguistic Data Consortium
Date (W3CDTF):1998
Description:LDC98S74 - Speech data
LDC98T29 - Transcripts
*Introduction* This corpus contains a portion of the acoustic data designated as the training set for the 1997 DARPA HUB4 Spanish Benchmark. It contains speech and transcripts of 30 hours of broadcast news from the following sources: Televisa, Univision and VOA.
*Data* All acoustic files are in NIST SPHERE format, without compression. The sample data are 16-bit linear PCM, 16 kHz sample frequency, single channel. Most files contain 30 minutes of recorded material, and some contain approximately 60 or 120 minutes. The sampling format requires roughly two megabytes (MB) per minute of recording, so file sizes are typically around 60 MB, with some files ranging up to 120 or 240 MB. The transcripts are in SGML format, using the same markup conventions that have been applied to the other 1997 Broadcast News speech corpora (in English and Mandarin). They are distributed by FTP rather than on the CD-ROMs with the speech data.
*Updates* There are no updates at this time.
*Additional Licensing Instructions* This 'members-only' corpus is available to current members, who can request the data at the listed reduced-license fee. Contact ldc@ldc.upenn.edu for information about becoming a member.
Format:Sampling Rate: 16000
Sampling Format: 1-channel pcm
Identifier:LDC98S74
https://catalog.ldc.upenn.edu/LDC98S74
ISBN: 1-58563-127-2
ISLRN: 684-931-706-325-2
DOI: 10.35111/mw6a-ab44
Language:Spanish
Language (ISO639):spa
Medium:Distribution: Web Download
Publisher:Linguistic Data Consortium
Publisher (URI):https://www.ldc.upenn.edu
Relation (URI):https://catalog.ldc.upenn.edu/docs/LDC98S74
Rights Holder:Portions © 1997 Televisa S.A. de C.V., © 1997 Univision Network Limited Partnership, © 1997, 1998 Trustees of the University of Pennsylvania
Type (DCMI):Sound
Type (OLAC):primary_text
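
The "roughly two megabytes per minute" figure in the Description follows directly from the stated sampling format (16-bit linear PCM, 16000 samples per second, one channel). The sketch below reproduces that arithmetic. The function names are illustrative, not part of any LDC tooling, and the calculation assumes decimal megabytes (1 MB = 1,000,000 bytes) and ignores the SPHERE header.

```python
# Sampling parameters as stated in the record's Description and Format fields.
SAMPLE_RATE_HZ = 16_000   # samples per second
BYTES_PER_SAMPLE = 2      # 16-bit linear PCM
CHANNELS = 1              # single channel


def pcm_bytes(minutes: float) -> int:
    """Raw PCM payload size in bytes for a recording of the given length.

    Assumption: header overhead is ignored, so real files are slightly larger.
    """
    seconds = minutes * 60
    return int(seconds * SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * CHANNELS)


def pcm_megabytes(minutes: float) -> float:
    """Same size in decimal megabytes (assumed: 1 MB = 1,000,000 bytes)."""
    return pcm_bytes(minutes) / 1_000_000


if __name__ == "__main__":
    # File lengths mentioned in the Description: 30, 60, and 120 minutes.
    for length in (1, 30, 60, 120):
        print(f"{length:>3} min: {pcm_megabytes(length):.2f} MB")
```

One minute comes to 1.92 MB, so 30-, 60-, and 120-minute files come to about 57.6, 115.2, and 230.4 MB, consistent with the approximate 60, 120, and 240 MB sizes quoted in the Description.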

OLAC Info

Archive:  The LDC Corpus Catalog
Description:  http://www.language-archives.org/archive/www.ldc.upenn.edu
GetRecord:  OAI-PMH request for OLAC format
GetRecord:  Pre-generated XML file

OAI Info

OaiIdentifier:  oai:www.ldc.upenn.edu:LDC98S74
DateStamp:  2020-11-30
GetRecord:  OAI-PMH request for simple DC format
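
The GetRecord entries above refer to OAI-PMH requests for this identifier in two metadata formats. As a rough sketch of how such a request URL is formed under the OAI-PMH protocol: the `verb`, `identifier`, and `metadataPrefix` parameters come from the protocol itself, and `oai_dc` is its standard prefix for simple Dublin Core. The base URL below is a placeholder (the record does not state the repository endpoint), and the `olac` prefix is assumed from the format name.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: the record does not give the repository's base URL.
BASE_URL = "https://example.org/oai"

# Identifier taken from the OaiIdentifier field of this record.
OAI_IDENTIFIER = "oai:www.ldc.upenn.edu:LDC98S74"


def get_record_url(identifier: str, metadata_prefix: str,
                   base_url: str = BASE_URL) -> str:
    """Build an OAI-PMH GetRecord request URL for one record and format."""
    query = urlencode({
        "verb": "GetRecord",
        "identifier": identifier,
        "metadataPrefix": metadata_prefix,
    })
    return f"{base_url}?{query}"


if __name__ == "__main__":
    # "oai_dc" is the standard simple Dublin Core prefix;
    # "olac" is an assumed prefix for the OLAC format.
    for prefix in ("oai_dc", "olac"):
        print(get_record_url(OAI_IDENTIFIER, prefix))
```

Substituting the archive's actual base URL for the placeholder would give the two requests listed in this record.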

Search Info

Citation: Linguistic Data Consortium. 1998. Linguistic Data Consortium.
Terms: area_Europe country_ES dcmi_Sound iso639_spa olac_primary_text


http://www.language-archives.org/item.php/oai:www.ldc.upenn.edu:LDC98S74
Up-to-date as of: Mon Mar 25 7:20:03 EDT 2024