OLAC Record
oai:clarin.eurac.edu:20.500.12124/26

Metadata
Title:Kolipsi-1 Corpus v1.0
Bibliographic Citation:http://hdl.handle.net/20.500.12124/26
Creator:Glaznieks, Aivars
Frey, Jennifer-Carmen
Abel, Andrea
Vettori, Chiara
Nicolas, Lionel
Date (W3CDTF):2021-05-05T15:52:10Z
Date Available:2021-05-05T15:52:10Z
Description:Kolipsi-1 L2 is a written learner corpus of German and Italian L2 speakers from South Tyrol (Italy). It was developed as a by-product of the KOLIPSI project “South-Tyrolean pupils and the second language: a linguistic and socio-psychological investigation”. In addition, data from L1 pupils were collected exclusively for the creation of a native speaker reference corpus. The data collection took place in autumn 2007 and is based on two standardized tests for written productions. The two tasks consisted of (1) writing an e-mail to a friend retelling a given event at the supermarket based on a picture story (narrative text genre) and (2) writing a letter to a friend discussing holiday plans (argumentative text genre). For both tasks, a time limit of 30 minutes was set and no additional reference material was allowed. CEFR levels have been assigned to all L2 learner texts, providing a holistic score as well as evaluations of coherence, lexis, grammar and sociolinguistic appropriateness.
Person-related metadata provides information about:
- the writer's language background, including L1(s), the L1(s) of mother and father, and a self-declared language group affiliation
- the writer's age, gender and socio-economic status
- the writer's district of residence and whether the writer lives in an urban or rural environment
- the language, location and type of school the writer attended
- whether the writer passed the local bilinguality exam or not
- an anonymous identifier for the writer's school class and L2 teacher to account for class effects
All texts have been transcribed manually, adding transcription annotations that reflect surface features of the text, such as the graphical arrangement, and that include error annotation on the orthographic level. In addition, all texts were automatically annotated with tokenisation, sentence splitting, POS-tagging and lemmatization, using an orthographically corrected target version of the corpus.
Kolipsi-1 L2 belongs to the Kolipsi Corpus Family, a series of related learner corpora collected in South Tyrolean upper secondary schools. The corpora of the Kolipsi Corpus Family contain Italian and German learner texts that were collected in the course of the KOLIPSI project in 2007/2008 (Kolipsi-1) and a follow-up study in 2014/2015 (Kolipsi-2). The aim of both corpus studies was to analyse the second language competences of South-Tyrolean pupils from upper secondary schools (16 to 18 years old), and to contextualize the results of this investigation by commenting on crucial sociolinguistic and psychosocial aspects that influence them. The results of the follow-up study are meant to be compared with the results of the original KOLIPSI project.
Identifier (URI):http://hdl.handle.net/20.500.12124/26
Is Replaced By (URI):http://hdl.handle.net/20.500.12124/64
Language:German
Italian
Language (ISO639):deu
ita
Publisher:Institute for Applied Linguistics, Eurac Research
Rights:CLARIN ACADEMIC END-USER LICENCE (ACA-BY-NC-NORED 1.0)
https://gitlab.inf.unibz.it/commul/var/eurac-licenses/-/raw/v1.0/EULA-CLARIN-ACA-BY-NC-NORED.md
Subject:L2
Learner corpora
South Tyrol
argumentative essay
students
high school
upper secondary school
picture story
opinion text
Type:corpus
Type (DCMI):Text
Type (OLAC):primary_text

OLAC Info

Archive:  Eurac Research CLARIN Centre
Description:  http://www.language-archives.org/archive/clarin.eurac.edu
GetRecord:  OAI-PMH request for OLAC format
GetRecord:  Pre-generated XML file

OAI Info

OaiIdentifier:  oai:clarin.eurac.edu:20.500.12124/26
DateStamp:  2023-03-17
GetRecord:  OAI-PMH request for simple DC format
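
The GetRecord entries above refer to OAI-PMH requests for this record. As a rough sketch, such a request URL can be assembled from the standard OAI-PMH 2.0 query parameters (verb, identifier, metadataPrefix). The base URL below is a placeholder, since this record does not state the repository's endpoint, and the prefixes "olac" and "oai_dc" are assumed to be the ones the repository exposes for the OLAC and simple DC formats.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: the record does not give the repository's
# OAI-PMH base URL, so this placeholder must be replaced before use.
BASE_URL = "https://example.org/oai/request"

# Identifier taken from the OaiIdentifier field of this record.
OAI_ID = "oai:clarin.eurac.edu:20.500.12124/26"


def get_record_url(identifier, metadata_prefix, base_url=BASE_URL):
    """Build an OAI-PMH 2.0 GetRecord request URL.

    The three query parameters are the ones the protocol requires
    for the GetRecord verb; values are percent-encoded by urlencode.
    """
    query = urlencode({
        "verb": "GetRecord",
        "identifier": identifier,
        "metadataPrefix": metadata_prefix,
    })
    return f"{base_url}?{query}"


# Assumed metadata prefixes: "olac" for OLAC, "oai_dc" for simple DC.
olac_url = get_record_url(OAI_ID, "olac")
dc_url = get_record_url(OAI_ID, "oai_dc")

print(olac_url)
print(dc_url)
```

Fetching either URL from a real endpoint would be expected to return an XML response wrapping the record in the requested metadata format.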

Search Info

Citation: Glaznieks, Aivars; Frey, Jennifer-Carmen; Abel, Andrea; Vettori, Chiara; Nicolas, Lionel. 2021. Institute for Applied Linguistics, Eurac Research.
Terms: area_Europe country_DE country_IT dcmi_Text iso639_deu iso639_ita olac_primary_text


http://www.language-archives.org/item.php/oai:clarin.eurac.edu:20.500.12124/26
Up-to-date as of: Mon Sep 18 0:46:45 EDT 2023