OLAC Record
oai:sldr.org:sldr000770

Metadata
Title:The Open ANC (OANC)
Abstract:The following corpora are included:

Spoken
- Charlotte
- Switchboard

Written
- Eggan (fiction)
- Slate
- Verbatim
- ICIC
- OUP
- 911 Report
- Biomed
- Government
- PLOS
- Berlitz

The following annotations are also included:
- Structural markup (divisions, paragraphs, etc.) down to the paragraph level.
- Sentence boundaries.
- Tokens with Hepple (Penn) part of speech annotations.
- Noun chunks
- Verb chunks
Access Rights:Free access
The ANC has so far released 22 million words of American English, which are available from the Linguistic Data Consortium.
© Ide, Nancy, and Suderman, Keith (2007). The Open American National Corpus (OANC). http://www.AmericanNationalCorpus.org/OANC
Documents librement communicables. (Code du Patrimoine, art. L. 211-1, L. 211-4, L. 213-1)
Freely communicable documents. (Code du Patrimoine, art. L. 211-1, L. 211-4, L. 213-1)
自由地被传达的文件 (Code du Patrimoine, art. L. 211-1, L. 211-4, L. 213-1)
Documentos libremente comunicables. (Code du Patrimoine, art. L. 211-1, L. 211-4, L. 213-1)
Alternative Title:Corpus abierto del lenguaje americano
Corpus ouvert de l'américain
打开语料库的美国语言
Bibliographic Citation:The Open ANC (OANC) (Nancy Ide, Randi Reppen, Keith Suderman). Primary data (corpus). Department of Computer Science, Vassar College (New York US). Created 2011-05-29. Speech and Language Data Repository (SLDR/ORTOLANG). Identifier hdl:11041/sldr000770 - Archived ark:/87895/1.4-183691
The Open ANC (OANC) - Corpus abierto del lenguaje americano (Nancy Ide, Randi Reppen, Keith Suderman). Datos primarios (corpus). Department of Computer Science, Vassar College (New York US). Creación 2011-05-29. Banco de datos de habla y lenguaje (SLDR/ORTOLANG). Identificador hdl:11041/sldr000770 - Archived ark:/87895/1.4-183691
The Open ANC (OANC) - Corpus ouvert de l'américain (Nancy Ide, Randi Reppen, Keith Suderman). Données primaires (corpus). Department of Computer Science, Vassar College (New York US). Création 2011-05-29. Banque de données parole et langage (SLDR/ORTOLANG). Identifiant hdl:11041/sldr000770 - Archived ark:/87895/1.4-183691
The Open ANC (OANC) - 打开语料库的美国语言 (Nancy Ide, Randi Reppen, Keith Suderman). 语音库. Department of Computer Science, Vassar College (New York US). 创建 2011-05-29. Speech and Language Data Repository (SLDR/ORTOLANG). 标识符 hdl:11041/sldr000770 - Archived ark:/87895/1.4-183691
Contributor (URI):http://www.cs.vassar.edu
Contributor (author):Ide, Nancy
Reppen, Randi
Suderman, Keith
Contributor (depositor):Department of Computer Science, Vassar College (New York US)
Contributor (sponsor):National Science Foundation (BCS-98009, KDI, SBE)
TalkBank project
Creator:Ide, Nancy
Reppen, Randi
Suderman, Keith
Date (W3CDTF):2011-05-31
Date Created (W3CDTF):2011-05-29
Date Modified (W3CDTF):2011-05-31
Description:The American National Corpus (ANC) project is creating a massive electronic collection of American English, including texts of all genres and transcripts of spoken data produced from 1990 onward. The ANC will provide the most comprehensive picture of American English ever created, and will serve as a resource for education, linguistic and lexicographic research, and technology development. This open portion of the American National Corpus (OANC) contains approximately 15 million words from the full corpus.
Le projet American National Corpus (ANC) est en train de rassembler une collection volumineuse sur l'anglais américain qui comprend des textes de tous genres et des transcriptions de paroles à partir de 1990. L'ANC fournira l'image la plus complète de l'anglais américain construite à ce jour, servant de ressource pour l'enseignement, la recherche linguistique et lexicographique, ainsi que les technologies de la langue. Ce fragment en libre accès de l'American National Corpus (OANC) contient environ 15 millions de mots du corpus d'origine.
Extent:7419711917
Format (IMT):application/xml
application/zip
Has Format (URI):http://www.cs.vassar.edu/~ide/papers/LAF.pdf
Has Version (URI):http://www.anc.org/OANC/OANC_GrAF.zip
http://www.anc.org/OANC/OANC_GrAF.tgz
Identifier (URI):http://hdl.handle.net/11041/sldr000770
http://hdl.handle.net/11041/sldr000770?urlappend=/preview/picto.jpg
http://hdl.handle.net/11041/sldr000770?urlappend=/toc
http://sldr.org/ark:/87895/1.4-183691
http://sldr.org/ark:/87895/1.4-183706
http://sldr.org/ark:/87895/1.4-183705
http://sldr.org/ark:/87895/1.4-183707
http://sldr.org/ark:/87895/1.4-183709
http://sldr.org/ark:/87895/1.4-183708
http://sldr.org/ark:/87895/1.4-183710
http://hdl.handle.net/11041/sldr000770/rdf.html
http://hdl.handle.net/11041/sldr000770/OANC_GrAF.zip
http://hdl.handle.net/11041/sldr000770/olac.xml
http://hdl.handle.net/11041/sldr000770/oai_dc.xml
Is Part Of (URI):http://www.AmericanNationalCorpus.org
Is Referenced By (URI):http://www.cs.vassar.edu/~ide/pubs.html
Language:English, American
English; Inglés americano
English; anglais américain
English; 英语, American
Language (ISO639):eng
Provenance:Department of Computer Science, Vassar College (New York US)
long-term preservation
conservación a largo plazo
archive pérenne
Provenance (URI):http://www.cs.vassar.edu
Publisher:Department of Computer Science, Vassar College (New York US)
Publisher (URI):http://www.cs.vassar.edu
Rights:info:eu-repo/date/submitted/2011-05-29
info:eu-repo/semantics/openAccess
Spatial Coverage (ISO3166):US
Subject:Information Retrieval
Parsing
Sense Disambiguation
Discourse Modeling
Language Teaching
Text Databases
Human Machine Communication
Recherche d'information
désambiguation du sens
modélisation du discours
enseignement des langues
bases de données textuelles
communication humain-machine
English language
English, American
Inglés americano
anglais américain
英语, American
Subject (ISO639):eng
Subject (OLAC):text_and_corpus_linguistics
discourse_analysis
language_documentation
Table Of Contents:VERSION HISTORY:
Version 1 of this archive was published on 2010-10-01
Type:info:eu-repo/semantics/dataset
Type (DCMI):Sound
Type (Discourse):narrative
Type (OLAC):primary_text

OLAC Info

Archive:  Speech and Language Data Repository (SLDR/ORTOLANG)
Description:  http://www.language-archives.org/archive/sldr.org
GetRecord:  OAI-PMH request for OLAC format
GetRecord:  Pre-generated XML file

OAI Info

OaiIdentifier:  oai:sldr.org:sldr000770
DateStamp:  2020-12-13
GetRecord:  OAI-PMH request for simple DC format
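
The GetRecord entries above refer to standard OAI-PMH requests. As a rough sketch only, the request URLs for this record could be assembled as shown below. The base URL is a hypothetical placeholder (the record does not state the repository's OAI-PMH endpoint), and the metadata prefixes "olac" and "oai_dc" are assumed from the olac.xml and oai_dc.xml files listed under Identifier (URI).

```python
from urllib.parse import urlencode


def get_record_url(base_url: str, identifier: str, metadata_prefix: str) -> str:
    """Build an OAI-PMH GetRecord request URL from its three required arguments."""
    query = urlencode({
        "verb": "GetRecord",
        "identifier": identifier,
        "metadataPrefix": metadata_prefix,
    })
    return f"{base_url}?{query}"


# Hypothetical endpoint: replace with the repository's real OAI-PMH base URL.
BASE_URL = "https://example.org/oai"
# Identifier taken from the OaiIdentifier field of this record.
OAI_ID = "oai:sldr.org:sldr000770"

olac_url = get_record_url(BASE_URL, OAI_ID, "olac")   # assumed prefix for OLAC format
dc_url = get_record_url(BASE_URL, OAI_ID, "oai_dc")   # simple Dublin Core format

print(olac_url)
print(dc_url)
```

The colons in the identifier are percent-encoded by urlencode, which keeps the query string safe to pass to any HTTP client.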

Search Info

Citation: Ide, Nancy; Reppen, Randi; Suderman, Keith. 2011. Department of Computer Science, Vassar College (New York US).
Terms: area_Europe country_GB dcmi_Sound iso639_eng olac_discourse_analysis olac_language_documentation olac_narrative olac_primary_text olac_text_and_corpus_linguistics

Inferred Metadata

Country: United Kingdom
Area: Europe


http://www.language-archives.org/item.php/oai:sldr.org:sldr000770
Up-to-date as of: Mon Dec 14 8:00:25 EST 2020