OLAC Record
oai:lindat.mff.cuni.cz:11234/1-2587

Metadata
Title: Amharic Web Corpus
Bibliographic Citation: http://hdl.handle.net/11234/1-2587
Creator: Suchomel, Vít; Rychlý, Pavel
Date (W3CDTF): 2018-01-11T15:29:19Z
Date Available: 2018-01-11T15:29:19Z
Description: Amharic web corpus. Crawled by SpiderLing in August 2013, October 2015, and January 2016. Encoded in UTF-8, cleaned, and deduplicated. Tagged by TreeTagger trained on the Amharic WIC corpus.
Identifier (URI): http://hdl.handle.net/11234/1-2587
Language: Amharic
Language (ISO639): amh
Publisher: Masaryk University, NLP Centre
Rights: NLP Centre Web Corpus License; https://lindat.mff.cuni.cz/repository/xmlui/page/license-NLPC-WeC
Subject: Amharic; text corpus; web corpus; under-resourced language; corpus annotation; morphological tagger
Type: corpus
Type (DCMI): Text
Type (OLAC): primary_text

OLAC Info

Archive: LINDAT/CLARIN digital library at the Institute of Formal and Applied Linguistics (ÚFAL), Faculty of Mathematics and Physics, Charles University
Description: http://www.language-archives.org/archive/lindat.mff.cuni.cz
GetRecord: OAI-PMH request for OLAC format
GetRecord: Pre-generated XML file

OAI Info

OaiIdentifier: oai:lindat.mff.cuni.cz:11234/1-2587
DateStamp: 2018-07-02
GetRecord: OAI-PMH request for simple DC format

Search Info

Citation: Suchomel, Vít; Rychlý, Pavel. 2018. Masaryk University, NLP Centre.
Terms: area_Africa country_ET dcmi_Text iso639_amh olac_primary_text


http://www.language-archives.org/item.php/oai:lindat.mff.cuni.cz:11234/1-2587
Up-to-date as of: Wed Oct 9 8:30:46 EDT 2019