OLAC Record
oai:lindat.mff.cuni.cz:11372/LRT-343

Metadata
Title: NoWaC (Norwegian Web as Corpus)
Bibliographic Citation: http://hdl.handle.net/11372/LRT-343
Contributor: Guevara, Emiliano; Johannessen, Janne Bondi
Date (W3CDTF): 2014-07-30T21:18:10Z
Date Available: 2014-07-30T21:18:10Z
Description: Large web-based corpus of Bokmål Norwegian currently containing about 700 million tokens. The corpus has been built by crawling, downloading and processing web documents in the .no top-level internet domain between November 2009 and January 2010.
Identifier (URI): http://hdl.handle.net/11372/LRT-343
Language: Norwegian
Language (ISO639): nor
Publisher: Department of Linguistics and Nordic Studies, University of Oslo
Rights: Creative Commons Attribution-NonCommercial-ShareAlike 2.0 Generic license.
Subject: web corpus
Type: corpus
Type (DCMI): Text
Type (OLAC): primary_text

OLAC Info

Archive:  LINDAT/CLARIN digital library at the Institute of Formal and Applied Linguistics, Charles University
Description:  http://www.language-archives.org/archive/lindat.mff.cuni.cz
GetRecord:  OAI-PMH request for OLAC format
GetRecord:  Pre-generated XML file

OAI Info

OaiIdentifier:  oai:lindat.mff.cuni.cz:11372/LRT-343
DateStamp:  2016-04-06
GetRecord:  OAI-PMH request for simple DC format
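
The GetRecord entries above refer to OAI-PMH requests for this record in two metadata formats. As a rough sketch, such a request URL might be assembled as shown below. The query parameters (verb, identifier, metadataPrefix) and the prefixes "olac" and "oai_dc" come from the OAI-PMH and OLAC specifications; the base URL is a hypothetical placeholder, because this record does not state the repository's actual endpoint, and the helper name get_record_url is likewise only illustrative.

```python
from urllib.parse import urlencode

# Hypothetical placeholder: the record does not give the repository's
# OAI-PMH base URL, so substitute the real endpoint before use.
BASE_URL = "https://example.org/oai"

# Identifier taken from the OaiIdentifier field of this record.
OAI_ID = "oai:lindat.mff.cuni.cz:11372/LRT-343"


def get_record_url(identifier, metadata_prefix, base_url=BASE_URL):
    """Build an OAI-PMH GetRecord request URL (illustrative helper)."""
    query = urlencode({
        "verb": "GetRecord",
        "identifier": identifier,
        "metadataPrefix": metadata_prefix,
    })
    return f"{base_url}?{query}"


# OLAC format and simple Dublin Core format, respectively.
olac_url = get_record_url(OAI_ID, "olac")
dc_url = get_record_url(OAI_ID, "oai_dc")
print(olac_url)
print(dc_url)
```

The identifier is percent-encoded by urlencode, so the colons and slash in it are safe to pass as a query value.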

Search Info

Citation: Guevara, Emiliano; Johannessen, Janne Bondi. 2014. Department of Linguistics and Nordic Studies, University of Oslo.
Terms: area_Europe country_NO dcmi_Text iso639_nor olac_primary_text


http://www.language-archives.org/item.php/oai:lindat.mff.cuni.cz:11372/LRT-343
Up-to-date as of: Sat Apr 29 1:30:10 EDT 2017