OLAC Record
oai:lindat.mff.cuni.cz:11234/1-3126

Metadata
Title:Large Corpus of Czech Parliament Plenary Hearings
Bibliographic Citation:http://hdl.handle.net/11234/1-3126
Creator:Kratochvíl, Jonáš
Polák, Peter
Bojar, Ondřej
Date (W3CDTF):2019-12-11T15:25:08Z
Date Available:2019-12-11T15:25:08Z
Description:We present a large corpus of Czech parliament plenary sessions. The corpus consists of approximately 444 hours of speech data and corresponding text transcriptions. The whole corpus has been segmented into short audio snippets, making it suitable for both training and evaluation of automatic speech recognition (ASR) systems. The source language of the corpus is Czech, which makes it a valuable resource for future research, as only a few public datasets are available for the Czech language.
Identifier (URI):http://hdl.handle.net/11234/1-3126
Language:Czech
Language (ISO639):ces
Publisher:Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Rights:Creative Commons - Attribution 4.0 International (CC BY 4.0)
http://creativecommons.org/licenses/by/4.0/
Subject:ASR
Czech
Type:corpus
Type (DCMI):Text
Type (OLAC):primary_text

OLAC Info

Archive:  LINDAT/CLARIAH-CZ digital library at the Institute of Formal and Applied Linguistics (ÚFAL), Faculty of Mathematics and Physics, Charles University
Description:  http://www.language-archives.org/archive/lindat.mff.cuni.cz
GetRecord:  OAI-PMH request for OLAC format
GetRecord:  Pre-generated XML file

OAI Info

OaiIdentifier:  oai:lindat.mff.cuni.cz:11234/1-3126
DateStamp:  2021-03-22
GetRecord:  OAI-PMH request for simple DC format
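
The GetRecord entries above refer to OAI-PMH requests for this record in OLAC and simple DC formats. As a minimal sketch of how such a request URL could be assembled: the `verb`, `identifier`, and `metadataPrefix` parameters come from the OAI-PMH protocol, and `olac` and `oai_dc` are the usual prefixes for the two formats. The base URL below is a hypothetical placeholder, since this record does not state the repository's actual endpoint, and `get_record_url` is an illustrative helper name.

```python
from urllib.parse import urlencode

# Hypothetical placeholder: the real OAI-PMH endpoint is not given in this record.
BASE_URL = "https://example.org/oai/request"

# Identifier copied from the OaiIdentifier field of this record.
OAI_ID = "oai:lindat.mff.cuni.cz:11234/1-3126"


def get_record_url(identifier, metadata_prefix, base_url=BASE_URL):
    """Build an OAI-PMH GetRecord request URL for a single record."""
    query = urlencode({
        "verb": "GetRecord",
        "identifier": identifier,
        "metadataPrefix": metadata_prefix,
    })
    return f"{base_url}?{query}"


olac_url = get_record_url(OAI_ID, "olac")    # OLAC format
dc_url = get_record_url(OAI_ID, "oai_dc")    # simple Dublin Core format
print(olac_url)
print(dc_url)
```

Replacing `BASE_URL` with the archive's real endpoint is left to the reader; the query string itself is the same for any OAI-PMH repository.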

Search Info

Citation: Kratochvíl, Jonáš; Polák, Peter; Bojar, Ondřej. 2019. Large Corpus of Czech Parliament Plenary Hearings. Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL).
Terms: area_Europe country_CZ dcmi_Text iso639_ces olac_primary_text
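
The Terms line packs several search facets into one space-separated string. A small sketch of splitting it into facet/value pairs follows. It assumes that the facet name is everything before the first underscore, which is inferred from the values shown here rather than documented in this record, and `parse_terms` is an illustrative helper name.

```python
def parse_terms(line):
    """Group space-separated 'facet_value' terms by facet name."""
    facets = {}
    for term in line.split():
        # Assumption: the facet is the part before the first underscore.
        facet, _, value = term.partition("_")
        facets.setdefault(facet, []).append(value)
    return facets


terms = parse_terms(
    "area_Europe country_CZ dcmi_Text iso639_ces olac_primary_text"
)
for facet, values in terms.items():
    print(facet, values)
```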


http://www.language-archives.org/item.php/oai:lindat.mff.cuni.cz:11234/1-3126
Up-to-date as of: Tue Mar 23 7:07:35 EDT 2021