OLAC Record
oai:lindat.mff.cuni.cz:11234/1-3189

Metadata
Title:Prague Dependency Treebank of Spoken Czech 2.0 (PDTSC 2.0)
Bibliographic Citation:http://hdl.handle.net/11234/1-3189
Creator:Mikulová, Marie
Bémová, Alevtina
Hajič, Jan
Hajičová, Eva
Ircing, Pavel
Kolářová, Veronika
Lopatková, Markéta
Mareček, David
Mírovský, Jiří
Nedoluzhko, Anna
Pajas, Petr
Panevová, Jarmila
Peterek, Nino
Romportl, Jan
Sgall, Petr
Ševčíková, Magda
Štěpánek, Jan
Urešová, Zdeňka
Žabokrtský, Zdeněk
Date (W3CDTF):2021-01-31T22:04:56Z
Date Available:2021-01-31T22:04:56Z
Description:The Prague Dependency Treebank of Spoken Czech 2.0 (PDTSC 2.0) is a corpus of spoken language consisting of 742,316 tokens and 73,835 sentences, representing 7,324 minutes (over 120 hours) of spontaneous dialogs. The dialogs have been recorded, transcribed and edited in several interlinked layers: audio recordings, automatic and manual transcripts, and manually reconstructed text. These layers were part of the first version of the corpus (PDTSC 1.0). Version 2.0 adds automatic dependency parsing at the analytical layer and manual annotation of “deep” syntax at the tectogrammatical layer, which contains semantic roles and relations as well as coreference annotation.
Identifier (URI):http://hdl.handle.net/11234/1-3189
Language:Czech
Language (ISO639):ces
Publisher:Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Replaces (URI):http://hdl.handle.net/11234/1-2375
Rights:Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0)
http://creativecommons.org/licenses/by-nc-sa/4.0/
Subject:spoken corpus
speech reconstruction
speech recognition
syntax
semantics
coreference
audio
Type:corpus
Type (DCMI):Text
Type (OLAC):primary_text

OLAC Info

Archive:  LINDAT/CLARIAH-CZ digital library at the Institute of Formal and Applied Linguistics (ÚFAL), Faculty of Mathematics and Physics, Charles University
Description:  http://www.language-archives.org/archive/lindat.mff.cuni.cz
GetRecord:  OAI-PMH request for OLAC format
GetRecord:  Pre-generated XML file

OAI Info

OaiIdentifier:  oai:lindat.mff.cuni.cz:11234/1-3189
DateStamp:  2021-03-22
GetRecord:  OAI-PMH request for simple DC format
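
The GetRecord entries above refer to OAI-PMH requests for this record in two metadata formats. As a rough sketch of how such a request URL is assembled, the Python below combines the standard OAI-PMH query parameters (verb, identifier, metadataPrefix) with the OaiIdentifier shown in this record. The base URL is a hypothetical placeholder, since the record does not state the repository's endpoint, and the prefixes "olac" and "oai_dc" are the conventional names for the OLAC and simple DC formats, assumed here rather than confirmed for this repository.

```python
from urllib.parse import urlencode

# Hypothetical placeholder: this record does not state the repository's
# OAI-PMH endpoint, so substitute the real base URL before use.
BASE_URL = "https://example.org/oai/request"


def get_record_url(identifier, metadata_prefix, base_url=BASE_URL):
    """Build an OAI-PMH GetRecord request URL for a single record."""
    query = urlencode({
        "verb": "GetRecord",
        "identifier": identifier,
        "metadataPrefix": metadata_prefix,
    })
    return f"{base_url}?{query}"


# OaiIdentifier taken from this record.
OAI_ID = "oai:lindat.mff.cuni.cz:11234/1-3189"

# Assumed conventional prefixes: "olac" for OLAC format, "oai_dc" for simple DC.
olac_url = get_record_url(OAI_ID, "olac")
dc_url = get_record_url(OAI_ID, "oai_dc")

print(olac_url)
print(dc_url)
```

The identifier is percent-encoded because it contains ":" and "/", which are reserved characters in a URL query string.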

Search Info

Citation: Mikulová, Marie; Bémová, Alevtina; Hajič, Jan; Hajičová, Eva; Ircing, Pavel; Kolářová, Veronika; Lopatková, Markéta; Mareček, David; Mírovský, Jiří; Nedoluzhko, Anna; Pajas, Petr; Panevová, Jarmila; Peterek, Nino; Romportl, Jan; Sgall, Petr; Ševčíková, Magda; Štěpánek, Jan; Urešová, Zdeňka; Žabokrtský, Zdeněk. 2021. Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL).
Terms: area_Europe country_CZ dcmi_Text iso639_ces olac_primary_text


http://www.language-archives.org/item.php/oai:lindat.mff.cuni.cz:11234/1-3189
Up-to-date as of: Tue Mar 23 7:07:37 EDT 2021