Title:Vystadial 2013 – Czech data
Creator:Korvas, Matěj
Plátek, Ondřej
Dušek, Ondřej
Žilka, Lukáš
Jurčíček, Filip
Description:Vystadial 2013 is a dataset of telephone conversations in English and Czech, developed for training acoustic models for automatic speech recognition in spoken dialogue systems. It ships in three parts: Czech data, English data, and scripts. The data comprise over 41 hours of speech in English and over 15 hours in Czech, plus orthographic transcriptions. The scripts implement data pre-processing and acoustic model building using the HTK and Kaldi toolkits. This is the Czech data part of the dataset.
This research was funded by the Ministry of Education, Youth and Sports of the Czech Republic under grant agreement LK11221.
Language (ISO639):ces
Publisher:Charles University, Faculty of Mathematics and Physics
Rights:Attribution-ShareAlike 3.0 Unported (CC BY-SA 3.0)
Subject:acoustic data
speech corpus
spoken corpus
orthographic transcriptions
telephone speech
dialogue system
