OLAC Record
oai:lindat.mff.cuni.cz:11858/00-097C-0000-0023-10B2-F

Metadata
Title:Many Czech References for 50 Sentences Selected from WMT11 Data
Bibliographic Citation:http://hdl.handle.net/11858/00-097C-0000-0023-10B2-F
Creator:Bojar, Ondřej
Macháček, Matouš
Tamchyna, Aleš
Zeman, Daniel
Date (W3CDTF):2013-12-10T13:41:44Z
Date Available:2013-12-10T13:41:44Z
Description:This dataset contains the whole set of very many Czech translations for 50 English source sentences coming from WMT11 test set (http://www.statmt.org/wmt11). In total, there are 15431447 Czech sentences, i.e. 300k reference translations per source English sentence on average, but the exact number greatly varies across sentences. You can find more details in included README file. If you use this dataset, please cite the following paper which describes the technique used to construct the Czech translations: Bojar Ondřej, Macháček Matouš, Tamchyna Aleš, Zeman Daniel: Scratching the Surface of Possible Translations. Lecture Notes in Computer Science, Vol. 8082, Text, Speech and Dialogue: 16th International Conference, TSD 2013. Proceedings, Copyright © Springer Verlag, Berlin / Heidelberg, ISBN 978-3-642-40584-6, ISSN 0302-9743, pp. 465-474, 2013, DOI: 10.1007/978-3-642-40585-3_59
P406/11/1499 of the Grant Agency of the Czech Republic, FP7-ICT-2011-7-288487 (MosesCore) of the European Union and 1356213 of the Grant Agency of the Charles University
Identifier (URI):http://hdl.handle.net/11858/00-097C-0000-0023-10B2-F
Language:Czech
Language (ISO639):ces
Publisher:Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Rights:Attribution-ShareAlike 3.0 Unported (CC BY-SA 3.0)
http://creativecommons.org/licenses/by-sa/3.0/
Subject:machine translation
automatic machine translation evaluation
reference translation
Type:corpus
Type (DCMI):Text
Type (OLAC):primary_text

OLAC Info

Archive:  LINDAT/CLARIAH-CZ digital library at the Institute of Formal and Applied Linguistics (ÚFAL), Faculty of Mathematics and Physics, Charles University
Description:  http://www.language-archives.org/archive/lindat.mff.cuni.cz
GetRecord:  OAI-PMH request for OLAC format
GetRecord:  Pre-generated XML file

OAI Info

OaiIdentifier:  oai:lindat.mff.cuni.cz:11858/00-097C-0000-0023-10B2-F
DateStamp:  2021-06-29
GetRecord:  OAI-PMH request for simple DC format

Search Info

Citation: Bojar, Ondřej; Macháček, Matouš; Tamchyna, Aleš; Zeman, Daniel. 2013. Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL).
Terms: area_Europe country_CZ dcmi_Text iso639_ces olac_primary_text


http://www.language-archives.org/item.php/oai:lindat.mff.cuni.cz:11858/00-097C-0000-0023-10B2-F
Up-to-date as of: Thu Oct 5 0:38:55 EDT 2023