OLAC Record oai:lindat.mff.cuni.cz:11234/1-3195 |
| Metadata | ||
| Title: | Large-Scale Colloquial Persian 0.5 | |
| Bibliographic Citation: | http://hdl.handle.net/11234/1-3195 | |
| Creator: | Abdi Khojasteh, Hadi | |
| Ansari, Ebrahim | ||
| Bohlouli, Mahdi | ||
| Date (W3CDTF): | 2020-03-18T10:44:43Z | |
| Date Available: | 2020-03-18T10:44:43Z | |
| Description: | "Large Scale Colloquial Persian Dataset" (LSCP) is hierarchically organized in a semantic taxonomy that focuses on multi-task informal Persian language understanding as a comprehensive problem. LSCP includes 120M sentences from 27M casual Persian tweets, with their dependency relations in syntactic annotation, part-of-speech tags, sentiment polarity, and automatic translations of the original Persian sentences into five languages (EN, CS, DE, IT, HI). | |
| Identifier (URI): | http://hdl.handle.net/11234/1-3195 | |
| Language: | Persian | |
| English | ||
| German | ||
| Czech | ||
| Italian | ||
| Hindi | ||
| Language (ISO639): | fas | |
| eng | ||
| deu | ||
| ces | ||
| ita | ||
| hin | ||
| Publisher: | Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL) | |
| Institute for Advanced Studies in Basic Sciences (IASBS) | ||
| Rights: | Creative Commons - Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0) | |
| http://creativecommons.org/licenses/by-nc-nd/4.0/ | ||
| Subject: | PoS tagging | |
| corpus | ||
| annotated corpus | ||
| multilingual | ||
| derivation | ||
| dependency parser | ||
| machine translation | ||
| informal language | ||
| spoken language | ||
| monolingual corpus | ||
| bilingual corpus annotation | ||
| Type: | corpus | |
| Type (DCMI): | Text | |
| Type (OLAC): | primary_text | |
OLAC Info |
||
| Archive: | LINDAT/CLARIAH-CZ digital library at the Institute of Formal and Applied Linguistics (ÚFAL), Faculty of Mathematics and Physics, Charles University | |
| Description: | http://www.language-archives.org/archive/lindat.mff.cuni.cz | |
| GetRecord: | OAI-PMH request for OLAC format | |
| GetRecord: | Pre-generated XML file | |
OAI Info |
||
| OaiIdentifier: | oai:lindat.mff.cuni.cz:11234/1-3195 | |
| DateStamp: | 2021-06-29 | |
| GetRecord: | OAI-PMH request for simple DC format | |
Search Info | ||
| Citation: | Abdi Khojasteh, Hadi; Ansari, Ebrahim; Bohlouli, Mahdi. 2020. Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL). | |
| Terms: | area_Asia area_Europe country_CZ country_DE country_GB country_IN country_IT dcmi_Text iso639_ces iso639_deu iso639_eng iso639_fas iso639_hin iso639_ita olac_primary_text | |
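
The GetRecord entries above refer to OAI-PMH requests for this record in the OLAC and simple Dublin Core formats. As a minimal sketch of how such a request URL is formed, the snippet below combines the standard OAI-PMH parameters (`verb`, `identifier`, `metadataPrefix`) with the OaiIdentifier listed in this record. The base URL is a placeholder assumption, since the record does not state the repository's OAI-PMH endpoint, and the helper name `get_record_url` is hypothetical.

```python
from urllib.parse import urlencode

# Assumption: placeholder endpoint. The record does not give the
# repository's actual OAI-PMH base URL, so substitute the real one.
BASE_URL = "https://example.org/oai/request"

# OaiIdentifier as listed in the OAI Info section of this record.
OAI_IDENTIFIER = "oai:lindat.mff.cuni.cz:11234/1-3195"


def get_record_url(identifier, metadata_prefix, base_url=BASE_URL):
    """Build an OAI-PMH GetRecord request URL (hypothetical helper)."""
    query = urlencode({
        "verb": "GetRecord",
        "identifier": identifier,
        "metadataPrefix": metadata_prefix,
    })
    return f"{base_url}?{query}"


# "olac" requests the OLAC format; "oai_dc" requests simple Dublin Core.
olac_url = get_record_url(OAI_IDENTIFIER, "olac")
dc_url = get_record_url(OAI_IDENTIFIER, "oai_dc")
```

Only the `metadataPrefix` value differs between the two requests; the identifier is percent-encoded because it contains `:` and `/`.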