OLAC Record
oai:lindat.mff.cuni.cz:11234/1-2859

Metadata
Title:CoNLL 2018 Shared Task - UDPipe Baseline Models and Supplementary Materials
Bibliographic Citation:http://hdl.handle.net/11234/1-2859
Creator:Straka, Milan
Date (W3CDTF):2018-09-05T11:07:23Z
Date Available:2018-09-05T11:07:23Z
Description:Baseline UDPipe models for the CoNLL 2018 Shared Task in UD Parsing, and supplementary material. The models require UDPipe version 1.2 or later and are evaluated using the official evaluation script. For treebanks where no development data is provided, the models were trained using a custom data split. We also trained an additional "Mixed" model, which uses 200 sentences from each treebank's training data. All information needed to replicate the model training (hyperparameters, modified train-dev split, and pre-computed word embeddings for the parser) is included in the archive. Additionally, we provide UD 2.2 CoNLL 2018 training data with automatically predicted morphology. We apply the baseline models to the development data and perform 10-fold jack-knifing (each fold is predicted with a model trained on the remaining folds) on the training data.
Identifier (URI):http://hdl.handle.net/11234/1-2859
Language:Multiple languages
Language (ISO639):mul
Publisher:Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Rights:Licence Universal Dependencies v2.2
https://lindat.mff.cuni.cz/repository/xmlui/page/licence-UD-2.2
Subject:CoNLL 2018
tokenizer
POS tagger
lemmatization
tagger
parser
dependency parser
morphology
treebank
Multiple languages
Subject (ISO639):mul
Type:languageDescription
Type (DCMI):Text
Type (OLAC):language_description
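
Note on the Description: the 10-fold jack-knifing it mentions (each fold is predicted with a model trained on the rest of the folds) can be illustrated with a minimal sketch. This is not the code shipped in the archive. The function name `jackknife`, the `train` and `predict` callables, and the round-robin assignment of sentences to folds are all assumptions made for illustration; the record does not say how the folds were actually formed.

```python
from typing import Callable, List, Optional, Sequence, TypeVar

S = TypeVar("S")  # a sentence (any representation)
M = TypeVar("M")  # a trained model
P = TypeVar("P")  # a prediction for one sentence


def jackknife(
    sentences: Sequence[S],
    train: Callable[[List[S]], M],      # hypothetical trainer
    predict: Callable[[M, S], P],       # hypothetical predictor
    n_folds: int = 10,
) -> List[Optional[P]]:
    """Predict every sentence with a model that never saw it in training."""
    # Assumption: round-robin fold assignment (sentence i goes to fold i % n_folds).
    folds = [list(range(i, len(sentences), n_folds)) for i in range(n_folds)]
    predictions: List[Optional[P]] = [None] * len(sentences)

    for held_out in folds:
        if not held_out:
            continue  # fewer sentences than folds: nothing to predict here
        held = set(held_out)
        # Train on all sentences outside the held-out fold.
        training = [s for i, s in enumerate(sentences) if i not in held]
        model = train(training)
        # Predict only the held-out fold with that model.
        for i in held_out:
            predictions[i] = predict(model, sentences[i])

    return predictions
```

The returned list is aligned with the input, so every training sentence carries a prediction from a model trained without it, which is the property the Description relies on when it speaks of automatically predicted morphology for the training data.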

OLAC Info

Archive:  LINDAT/CLARIAH-CZ digital library at the Institute of Formal and Applied Linguistics (ÚFAL), Faculty of Mathematics and Physics, Charles University
Description:  http://www.language-archives.org/archive/lindat.mff.cuni.cz
GetRecord:  OAI-PMH request for OLAC format
GetRecord:  Pre-generated XML file

OAI Info

OaiIdentifier:  oai:lindat.mff.cuni.cz:11234/1-2859
DateStamp:  2020-02-19
GetRecord:  OAI-PMH request for simple DC format

Search Info

Citation: Straka, Milan. 2018. Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL).
Terms: dcmi_Text iso639_mul olac_language_description

Inferred Metadata

Country: 
Area: 


http://www.language-archives.org/item.php/oai:lindat.mff.cuni.cz:11234/1-2859
Up-to-date as of: Thu Mar 19 14:32:04 EDT 2020