OLAC Record
oai:mulce.org:mce-infral-tagged_blogs

Metadata
Title:Blog's data from Infral structured, tokenized and tagged into a XML file.
Access Rights:open access after registration
Audience:Researchers or teachers in educational sciences or linguistics
Conforms To:IMS-CP for packaging
Contributor (compiler):Chanier Thierry
Contributor (data_inputter):Laurent Mario
Contributor (depositor):Chanier Thierry
Contributor (editor):Chanier Thierry
Contributor (researcher):Laurent Mario ; Chanier Thierry
Creator: BEGIN:VCARD FN:Mario Laurent N:Laurent;Mario ORG: Universite Blaise Pascal ADR: Clermont-Ferrand;France END:VCARD
Creator (URI):Mario Laurent
Creator (compiler):Laurent Mario
Date Created (W3CDTF):2011-12-02
Description:This corpus is based on data extracted from the global Learning & Teaching Corpus (LETEC) Infral (on LETEC, see http://lrl-diffusion.univ-bpclermont.fr/mulce/metadata/mce_LETECorpus-en.pdf). Analyzing the data from the Infral project requires structuring the language interactions into exploitable corpora. To understand the development of intercultural competences, we have to quantify the production of the different participants, for example their language use and lexical diversity. To achieve this, we used the Python programming language and the NLTK library. During the Infral course, participants from a French and a German university communicated in both languages via blogs. We developed a program that converts the plain text of Infral's blogs into a structured XML file in which each message is tokenized into words. Each word is tagged according to its form and its original language.
Extent:28 500 KB
Format (IMT):text/xml
application/pdf
Identifier:mce-infral-tagged_blogs
Identifier (URI):http://mulce.univ-bpclermont.fr:8080/PlateFormeMulce/VIEW/PUBLIC/03/VMeta.do?adr=Infral%2FCorpus_objets%2Fmce-infral-tagged_blogs
Language:French
German
Language (ISO639):fra
deu
Publisher: BEGIN:VCARD FN:Mulce (MULtimodal Corpus Exchange) ORG: Universite Blaise Pascal ADR: Clermont-Ferrand;France URL:http://mulce.org END:VCARD
References:Abendroth-Timmer, D., Bechtel, M., Chanier, T. & Ciekanski, M. (2009) "From developing to investigating intercultural competence in practice through oral and written interactions in online exchanges", Kongress für Fremdsprachendidaktik der Deutschen Gesellschaft für Fremdsprachenforschung (DGFF-Tagung), Universität Leipzig, October 2009. [http://edutice.archives-ouvertes.fr/edutice-00548891/]
Abendroth-Timmer, D., Chanier, T., Ciekanski, M., Bechtel, M. & Henning, E.-V. (2010) "Du développement à l’investigation de la compétence interculturelle en pratique à partir des interactions à l’oral et à l’écrit dans des échanges en ligne à distance." Colloque "Plurilingualism and Pluriculturalism in a Globalised World: which Pedagogy?" (PLIDAM), 17-19 June, Paris.
Laurent, M. (2011). Structuration des données des blogues de la formation Infral à l’aide des outils de programmation Python et NLTK. Master 2 report, Sciences du Langage, Université Blaise Pascal.
Chanier, T. & Ciekanski, M. (2010). Utilité du partage des corpus pour l'analyse des interactions en ligne en situation d'apprentissage : un exemple d'approche méthodologique autour d'une base de corpus d'apprentissage. ALSIC - Apprentissage des Langues et Systèmes d'Information et de Communication 13 [http://edutice.archives-ouvertes.fr/edutice-00486676/]
Reffay, C., Chanier, T., Noras, M. & Betbeder, M.-L. (2008). Contribution à la structuration de corpus d'apprentissage pour un meilleur partage en recherche. In Basque, J. & Reffay, C. (eds.), special issue EPAL (Échanger pour apprendre en ligne), Sciences et Technologies de l'Information et de la Communication pour l'Education et la Formation (STICEF), 15. [http://sticef.univ-lemans.fr/num/vol2008/01-reffay/sticef_2008_reffay_01p.pdf , http://edutice.archives-ouvertes.fr/edutice-00159733]
References (URI):http://edutice.archives-ouvertes.fr/edutice-00548891/
http://edutice.archives-ouvertes.fr/edutice-00486676/
http://sticef.univ-lemans.fr/num/vol2008/01-reffay/sticef_2008_reffay_01p.pdf
Requires:mce-infral-letec-all
Rights: Rights holders of this corpus are: Thierry Chanier; Dagmar Abendroth-Timmer; Maud Ciekanski; Mark Bechtel; Laurent Mario; licence = http://creativecommons.org/licenses/by-nc-sa/2.0/
Rights (URI):http://lrl-diffusion.univ-bpclermont.fr/mulce/metadata/vdex/mce_licence.xml
Subject:NLP; XML; telecollaboration ; intercultural; online teaching
French language
Subject (ISO639):fra
Subject (LCSH):Education - Data processing
Computer-assisted instruction
Language and languages - Study and teaching
Subject (OLAC):applied_linguistics
discourse_analysis
text_and_corpus_linguistics
Temporal Coverage:name=Infral course ; start=2008-09-29; end=2009-01-09
name=Master Project ; start=2011-03-01; end=2011-06-30
Type (DCMI):Dataset
Collection
Type (Discourse):dialogue
narrative
Type (OLAC):primary_text

OLAC Info

Archive:  Multimodal Learning and teaching Corpora Exchange
Description:  http://www.language-archives.org/archive/mulce.org
GetRecord:  OAI-PMH request for OLAC format
GetRecord:  Pre-generated XML file

OAI Info

OaiIdentifier:  oai:mulce.org:mce-infral-tagged_blogs
DateStamp:  2012-01-09
GetRecord:  OAI-PMH request for simple DC format

Search Info

Citation: Mario Laurent. 2011. Mulce (MULtimodal Corpus Exchange).
Terms: area_Europe country_DE country_FR dcmi_Collection dcmi_Dataset iso639_deu iso639_fra olac_applied_linguistics olac_dialogue olac_discourse_analysis olac_narrative olac_primary_text olac_text_and_corpus_linguistics

Inferred Metadata

Country: France
Area: Europe


http://www.language-archives.org/item.php/oai:mulce.org:mce-infral-tagged_blogs
Up-to-date as of: Sun May 13 5:56:06 EDT 2012