OLAC Record
oai:lindat.mff.cuni.cz:11234/1-4629

Metadata
Title:Universal Segmentations 1.0 (UniSegments 1.0)
Bibliographic Citation:http://hdl.handle.net/11234/1-4629
Creator:Žabokrtský, Zdeněk
Bafna, Nyati
Bodnár, Jan
Kyjánek, Lukáš
Svoboda, Emil
Ševčíková, Magda
Vidra, Jonáš
Angle, Sachi
Ansari, Ebrahim
Arkhangelskiy, Timofey
Batsuren, Khuyagbaatar
Bella, Gábor
Bertinetto, Pier Marco
Bonami, Olivier
Celata, Chiara
Daniel, Michael
Fedorenko, Alexei
Filko, Matea
Giunchiglia, Fausto
Haghdoost, Hamid
Hathout, Nabil
Khomchenkova, Irina
Khurshudyan, Victoria
Levonian, Dmitri
Litta, Eleonora
Medvedeva, Maria
Muralikrishna, S. N.
Namer, Fiammetta
Nikravesh, Mahshid
Padó, Sebastian
Passarotti, Marco
Plungian, Vladimir
Polyakov, Alexey
Potapov, Mihail
Pruthwik, Mishra
Rao B, Ashwath
Rubakov, Sergei
Samar, Husain
Sharma, Dipti Misra
Šnajder, Jan
Šojat, Krešimir
Štefanec, Vanja
Talamo, Luigi
Tribout, Delphine
Vodolazsky, Daniil
Vydrin, Arseniy
Zakirova, Aigul
Zeller, Britta
Date (W3CDTF):2022-01-24T15:25:57Z
Date Available:2022-01-24T15:25:57Z
Description:Universal Segmentations (UniSegments) is a collection of lexical resources capturing morphological segmentations harmonised into a cross-linguistically consistent annotation scheme for many languages. The annotation scheme consists of simple tab-separated columns that store a word and its morphological segmentation, together with information about the word and the segmented units, e.g. part-of-speech categories and types of morphs/morphemes. The current public version of the collection contains 38 harmonised segmentation datasets covering 30 different languages.
Identifier (URI):http://hdl.handle.net/11234/1-4629
Language:Czech
Catalan
German
English
Persian
Finnish
French
Serbo-Croatian
Croatian
Hungarian
Italian
Komi-Zyrian
Latin
Moksha
Mari (Russia)
Mongolian
Erzya
Polish
Portuguese
Russian
Spanish
Swedish
Tajik
Udmurt
Armenian
Bengali
Hindi
Malayalam
Marathi
Kannada
Language (ISO639):ces
cat
deu
eng
fas
fin
fra
hbs
hrv
hun
ita
kpv
lat
mdf
chm
mon
myv
pol
por
rus
spa
swe
tgk
udm
hye
ben
hin
mal
mar
kan
Publisher:Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Rights:Universal Segmentations 1.0 License Terms
https://lindat.mff.cuni.cz/repository/xmlui/page/licence-unisegs-1.0
Subject:universal segmentations
morphological segmentation
word segmentation
segmentation
morphology
morphemes
morphological dictionary
unisegments
morph
multilingual
Czech language
Catalan language
German language
English language
Persian language
Finnish language
French language
Serbo-Croatian language
Croatian language
Hungarian language
Italian language
Komi-Zyrian language
Latin language
Moksha language
Mari (Russia) language
Mongolian language
Erzya language
Polish language
Portuguese language
Russian language
Spanish language
Swedish language
Tajik language
Udmurt language
Armenian language
Bengali language
Hindi language
Malayalam language
Marathi language
Kannada language
Subject (ISO639):ces
cat
deu
eng
fas
fin
fra
hbs
hrv
hun
ita
kpv
lat
mdf
chm
mon
myv
pol
por
rus
spa
swe
tgk
udm
hye
ben
hin
mal
mar
kan
Type:lexicalConceptualResource
Type (DCMI):Text
Type (OLAC):lexicon
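
Note on the tab-separated scheme mentioned in the Description: the record does not spell out the exact column layout, so the following is only a minimal sketch of how one such line might be read. The three-column order (word, part of speech, segmentation), the " + " morph separator, and the names Entry and parse_line are assumptions made for illustration, not the documented UniSegments format.

```python
from typing import NamedTuple


class Entry(NamedTuple):
    """One lexicon line (hypothetical layout, not the official schema)."""
    word: str
    pos: str
    morphs: list[str]


def parse_line(line: str) -> Entry:
    # Assumed column order: word <TAB> part of speech <TAB> segmentation.
    # Any further columns are ignored; fewer than three raises ValueError.
    word, pos, segmentation = line.rstrip("\n").split("\t")[:3]
    # Assumed morph separator inside the segmentation column: " + ".
    return Entry(word, pos, segmentation.split(" + "))


# Invented sample line, used only to exercise the parser.
sample = "unhappiness\tNOUN\tun + happi + ness"
entry = parse_line(sample)
print(entry.word, entry.pos, entry.morphs)
```

The actual column order and separators should be checked against the documentation shipped with the dataset before relying on a parser like this.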

OLAC Info

Archive:  LINDAT/CLARIAH-CZ digital library at the Institute of Formal and Applied Linguistics (ÚFAL), Faculty of Mathematics and Physics, Charles University
Description:  http://www.language-archives.org/archive/lindat.mff.cuni.cz
GetRecord:  OAI-PMH request for OLAC format
GetRecord:  Pre-generated XML file

OAI Info

OaiIdentifier:  oai:lindat.mff.cuni.cz:11234/1-4629
DateStamp:  2022-01-24
GetRecord:  OAI-PMH request for simple DC format
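
Note on the GetRecord entries: the request links themselves are not preserved in this record, so the sketch below only shows how an OAI-PMH GetRecord URL is generally assembled from the identifier listed above. The base URL is a placeholder and the helper name get_record_url is hypothetical; the real endpoint of the archive has to be taken from the archive description page. The prefix "oai_dc" is the simple DC format every OAI-PMH repository must support, while "olac" is assumed here to be the prefix used for the OLAC format.

```python
from urllib.parse import urlencode

# Placeholder endpoint (assumption): replace with the archive's real base URL.
BASE_URL = "https://example.org/oai"

# Identifier as given in the OaiIdentifier field of this record.
IDENTIFIER = "oai:lindat.mff.cuni.cz:11234/1-4629"


def get_record_url(base_url: str, identifier: str, metadata_prefix: str) -> str:
    """Build an OAI-PMH GetRecord request URL with percent-encoded arguments."""
    query = urlencode({
        "verb": "GetRecord",
        "identifier": identifier,
        "metadataPrefix": metadata_prefix,
    })
    return f"{base_url}?{query}"


dc_url = get_record_url(BASE_URL, IDENTIFIER, "oai_dc")
olac_url = get_record_url(BASE_URL, IDENTIFIER, "olac")  # prefix name assumed
print(dc_url)
print(olac_url)
```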

Search Info

Citation: Žabokrtský, Zdeněk; Bafna, Nyati; Bodnár, Jan; Kyjánek, Lukáš; Svoboda, Emil; Ševčíková, Magda; Vidra, Jonáš; Angle, Sachi; Ansari, Ebrahim; Arkhangelskiy, Timofey; Batsuren, Khuyagbaatar; Bella, Gábor; Bertinetto, Pier Marco; Bonami, Olivier; Celata, Chiara; Daniel, Michael; Fedorenko, Alexei; Filko, Matea; Giunchiglia, Fausto; Haghdoost, Hamid; Hathout, Nabil; Khomchenkova, Irina; Khurshudyan, Victoria; Levonian, Dmitri; Litta, Eleonora; Medvedeva, Maria; Muralikrishna, S. N.; Namer, Fiammetta; Nikravesh, Mahshid; Padó, Sebastian; Passarotti, Marco; Plungian, Vladimir; Polyakov, Alexey; Potapov, Mihail; Pruthwik, Mishra; Rao B, Ashwath; Rubakov, Sergei; Samar, Husain; Sharma, Dipti Misra; Šnajder, Jan; Šojat, Krešimir; Štefanec, Vanja; Talamo, Luigi; Tribout, Delphine; Vodolazsky, Daniil; Vydrin, Arseniy; Zakirova, Aigul; Zeller, Britta. 2022. Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL).
Terms: area_Asia area_Europe country_AM country_BD country_CZ country_DE country_ES country_FI country_FR country_GB country_HR country_HU country_IN country_IT country_PL country_PT country_RU country_SE country_TJ country_VA dcmi_Text iso639_ben iso639_cat iso639_ces iso639_chm iso639_deu iso639_eng iso639_fas iso639_fin iso639_fra iso639_hbs iso639_hin iso639_hrv iso639_hun iso639_hye iso639_ita iso639_kan iso639_kpv iso639_lat iso639_mal iso639_mar iso639_mdf iso639_mon iso639_myv iso639_pol iso639_por iso639_rus iso639_spa iso639_swe iso639_tgk iso639_udm olac_lexicon

Inferred Metadata

Country: Armenia, Bangladesh, Czech Republic, Germany, Spain, Finland, France, United Kingdom, Croatia, Hungary, India, Italy, Poland, Portugal, Russian Federation, Sweden, Tajikistan, Vatican State
Area: Asia, Europe


http://www.language-archives.org/item.php/oai:lindat.mff.cuni.cz:11234/1-4629
Up-to-date as of: Thu Oct 5 0:43:08 EDT 2023