OLAC Record
oai:www.ldc.upenn.edu:LDC99T41

Metadata
Title:Spanish Newswire Text, Volume 2
Access Rights:Licensing Instructions for Subscription & Standard Members, and Non-Members: http://www.ldc.upenn.edu/language-resources/data/obtaining
Bibliographic Citation:Graff, David, and Gustavo Gallegos. Spanish Newswire Text, Volume 2 LDC99T41. Web Download. Philadelphia: Linguistic Data Consortium, 1999
Contributor:Graff, David
Gallegos, Gustavo
Date (W3CDTF):1999
Description:*Introduction*
This release of Spanish newswire contains data from the following sources:
* Agence France Presse (January 13, 1996--December 13, 1998)
* Associated Press Worldstream (December 1, 1995--August 31, 1998)
* El Norte (January 1, 1997--December 31, 1998)
*Data*
The release uses a consistent format: SGML tagging and the ISO-8859-1 (Latin1) 8-bit character set. Our general strategy for SGML tagging is as follows: All document units (articles) are bounded by the tags DOC and /DOC, and within these units, the text content of each article is bounded by TEXT and /TEXT. Following each DOC tag is a DOCID tag that provides a unique identifying string for that article. Other tags within the DOC unit (but external to TEXT) provide additional information that was received with the article (e.g. headline, dateline, byline, keywords, etc.). The inventory and nature of this additional information varies from one source to the next (and in some cases, from one article to the next), and the SGML tags used to preserve it reflect that variability. Within the TEXT units, tagging is kept to a minimum, typically consisting only of paragraph tags.
*Updates*
There are no updates at this time.
Identifier:LDC99T41
https://catalog.ldc.upenn.edu/LDC99T41
ISBN: 1-58563-162-0
ISLRN: 581-480-117-182-1
DOI: 10.35111/q9gq-nf98
Language:Spanish
Language (ISO639):spa
License:LDC User Agreement for Non-Members: https://catalog.ldc.upenn.edu/license/ldc-non-members-agreement.pdf
Medium:Distribution: Web Download
Publisher:Linguistic Data Consortium
Publisher (URI):https://www.ldc.upenn.edu
Relation (URI):https://catalog.ldc.upenn.edu/docs/LDC99T41
Type (DCMI):Text
Type (OLAC):primary_text
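
The tagging scheme in the Description (DOC units, each with a DOCID and a TEXT body) can be read with a few regular expressions. The following is a minimal sketch, not the corpus's official tooling. The sample document, its DOCID values, the HEADLINE tag, the `<P>` paragraph tag, and the closing `</DOCID>` form are illustrative assumptions; the actual tag inventory varies by source, as the Description notes.

```python
import re

# Hypothetical sample in the style the Description outlines. The DOCID values,
# the HEADLINE tag, and the <P> paragraph tag are assumptions for illustration.
SAMPLE = """<DOC>
<DOCID> SAMPLE_19970101.0001 </DOCID>
<HEADLINE> Titular de ejemplo </HEADLINE>
<TEXT>
<P>
Primer párrafo del artículo.
</P>
<P>
Segundo párrafo.
</P>
</TEXT>
</DOC>
<DOC>
<DOCID> SAMPLE_19970101.0002 </DOCID>
<TEXT>
<P>
Otro artículo.
</P>
</TEXT>
</DOC>
"""

DOC_RE = re.compile(r"<DOC>(.*?)</DOC>", re.DOTALL)
DOCID_RE = re.compile(r"<DOCID>(.*?)</DOCID>", re.DOTALL)
TEXT_RE = re.compile(r"<TEXT>(.*?)</TEXT>", re.DOTALL)
TAG_RE = re.compile(r"<[^>]+>")


def parse_documents(raw):
    """Return one dict per DOC unit with its DOCID and paragraph strings."""
    docs = []
    for match in DOC_RE.finditer(raw):
        body = match.group(1)
        docid = DOCID_RE.search(body)
        text = TEXT_RE.search(body)
        if docid is None or text is None:
            continue  # skip units that lack the expected tags
        # Split the TEXT body on any tag and keep the non-empty pieces.
        paragraphs = [p.strip() for p in TAG_RE.split(text.group(1)) if p.strip()]
        docs.append({"docid": docid.group(1).strip(), "paragraphs": paragraphs})
    return docs


def read_corpus_file(path):
    """Read a corpus file using the ISO-8859-1 encoding named in the Description."""
    with open(path, encoding="latin-1") as handle:
        return parse_documents(handle.read())
```

Regular expressions are used here instead of an XML parser because SGML of this kind is not guaranteed to be well-formed XML. Metadata tags outside TEXT are ignored in this sketch.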

OLAC Info

Archive:  The LDC Corpus Catalog
Description:  http://www.language-archives.org/archive/www.ldc.upenn.edu
GetRecord:  OAI-PMH request for OLAC format
GetRecord:  Pre-generated XML file

OAI Info

OaiIdentifier:  oai:www.ldc.upenn.edu:LDC99T41
DateStamp:  2020-11-30
GetRecord:  OAI-PMH request for simple DC format
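
The GetRecord entries above correspond to OAI-PMH requests, which carry a `verb`, an `identifier`, and a `metadataPrefix` as query parameters. The sketch below only builds such a request URL. The base URL is a placeholder (this record does not state the repository endpoint), and the `olac` prefix for the OLAC format is an assumption; `oai_dc` is the standard OAI-PMH prefix for simple Dublin Core.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: replace with the repository's actual OAI-PMH base URL.
BASE_URL = "https://example.org/oai"


def get_record_url(identifier, metadata_prefix, base_url=BASE_URL):
    """Build an OAI-PMH GetRecord request URL for one record."""
    query = urlencode(
        {
            "verb": "GetRecord",
            "identifier": identifier,
            "metadataPrefix": metadata_prefix,
        }
    )
    return f"{base_url}?{query}"


# Simple DC request for this record; pass "olac" (assumed prefix) for OLAC format.
url = get_record_url("oai:www.ldc.upenn.edu:LDC99T41", "oai_dc")
```

`urlencode` percent-encodes the colons in the OAI identifier, so the resulting URL can be passed to any HTTP client as-is.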

Search Info

Citation: Graff, David; Gallegos, Gustavo. 1999. Linguistic Data Consortium.
Terms: area_Europe country_ES dcmi_Text iso639_spa olac_primary_text


http://www.language-archives.org/item.php/oai:www.ldc.upenn.edu:LDC99T41
Up-to-date as of: Mon Mar 25 7:20:07 EDT 2024