Title:Prague DaTabase of Spoken Czech 1.0
Bibliographic Citation:http://hdl.handle.net/11234/1-2375
Creator:Hajič, Jan
Pajas, Petr
Ircing, Pavel
Romportl, Jan
Peterek, Nino
Spousta, Miroslav
Mikulová, Marie
Grůber, Martin
Legát, Milan
Date (W3CDTF):2017-11-02T11:31:47Z
Date Available:2017-11-02T11:31:47Z
Description:PDTSC 1.0 is a multi-purpose corpus of spoken language. 768,888 tokens, 73,374 sentences and 7,324 minutes of spontaneous dialog speech have been recorded, transcribed and edited in several interlinked layers: audio recordings, automatic and manual transcription, and manually reconstructed text. PDTSC 1.0 is a delayed release of data annotated in 2012. It is an update of the Prague Dependency Treebank of Spoken Language (PDTSL) 0.5, published in 2009. In 2017, the Prague Dependency Treebank of Spoken Czech (PDTSC) 2.0 was published as an update of PDTSC 1.0.
Identifier (URI):http://hdl.handle.net/11234/1-2375
Is Replaced By (URI):http://hdl.handle.net/11234/1-3189
Language (ISO639):ces
Publisher:Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
University of West Bohemia
Rights:Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0)
Subject:spoken corpus
speech recognition
speech reconstruction
Type (DCMI):Text
Type (OLAC):primary_text


