OLAC Record

Title:Czech Relationship Extraction Dataset
Bibliographic Citation:http://hdl.handle.net/11234/1-3265
Creator:Šimečková, Zuzana; Straka, Milan
Date (W3CDTF):2020-07-31T13:31:42Z
Date Available:2020-07-31T13:31:42Z
Description:CERED (Czech Relationship Dataset) is a family of datasets created via distant supervision on Czech Wikipedia and Wikidata. It was created as part of a thesis on relationship extraction (2020). CERED0 is the largest dataset; it lacks a negative relation and its relation inventory is very large. Each CERED*n* is a subset of CERED*n-1* that satisfies additional conditions. The methodology for curating the datasets is detailed in the thesis. The data are in JSON Lines (JSONL) format, and the tools used to generate the datasets are written in Python.
Identifier (URI):http://hdl.handle.net/11234/1-3265
Language (ISO639):ces
Publisher:Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Rights:Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0)
Subject:entity relationship; relationship extraction
Type (DCMI):Text
Type (OLAC):primary_text
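
Since the description states that the data are distributed as JSONL, the following is a minimal sketch of how such a file could be read line by line in Python. The helper name iter_jsonl and the sample field names ("text", "label") are hypothetical and chosen only for illustration; this record does not specify the actual CERED schema.

```python
import json
from typing import Any, Dict, Iterable, Iterator


def iter_jsonl(lines: Iterable[str]) -> Iterator[Dict[str, Any]]:
    """Yield one parsed JSON object per non-empty line of JSON Lines input."""
    for line in lines:
        line = line.strip()
        if line:  # skip blank lines
            yield json.loads(line)


# Hypothetical sample lines; the real CERED field names are NOT given in this
# record, so "text" and "label" are assumptions made for illustration only.
sample = [
    '{"text": "Praha je hlavní město Česka.", "label": "capital_of"}',
    "",
    '{"text": "Brno leží na Moravě.", "label": "located_in"}',
]

records = list(iter_jsonl(sample))
print(len(records), records[0]["label"])
```

An open file object can be passed in place of the sample list, because iterating over a text file also yields one line at a time.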


Archive:  LINDAT/CLARIAH-CZ digital library at the Institute of Formal and Applied Linguistics (ÚFAL), Faculty of Mathematics and Physics, Charles University
Description:  http://www.language-archives.org/archive/lindat.mff.cuni.cz
GetRecord:  OAI-PMH request for OLAC format
GetRecord:  Pre-generated XML file

OAI Info

OaiIdentifier:  oai:lindat.mff.cuni.cz:11234/1-3265
DateStamp:  2021-03-22
GetRecord:  OAI-PMH request for simple DC format
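
The GetRecord entries above refer to OAI-PMH requests. As a rough sketch, such a request URL can be assembled from the standard OAI-PMH query parameters (verb, identifier, metadataPrefix). The base URL below is a placeholder, since the archive's actual endpoint is not given in this record, and the function name get_record_url is hypothetical. The prefix "oai_dc" is the one OAI-PMH defines for simple Dublin Core; "olac" is assumed here as the prefix for the OLAC format.

```python
from urllib.parse import urlencode


def get_record_url(base_url: str, identifier: str, metadata_prefix: str) -> str:
    """Build an OAI-PMH GetRecord request URL from its three required arguments."""
    query = urlencode(
        {
            "verb": "GetRecord",
            "identifier": identifier,
            "metadataPrefix": metadata_prefix,
        }
    )
    return f"{base_url}?{query}"


# Placeholder endpoint: the real base URL of the archive is NOT stated in this record.
BASE_URL = "https://example.org/oai"
OAI_ID = "oai:lindat.mff.cuni.cz:11234/1-3265"  # OaiIdentifier from this record

olac_url = get_record_url(BASE_URL, OAI_ID, "olac")   # "olac" prefix is an assumption
dc_url = get_record_url(BASE_URL, OAI_ID, "oai_dc")   # simple Dublin Core
print(olac_url)
print(dc_url)
```

urlencode percent-encodes the colons and the slash in the identifier, which keeps the query string valid.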

Search Info

Citation: Šimečková, Zuzana; Straka, Milan. 2020. Czech Relationship Extraction Dataset. Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL).
Terms: area_Europe country_CZ dcmi_Text iso639_ces olac_primary_text

Up-to-date as of: Tue Mar 23 7:07:39 EDT 2021