OLAC Record oai:mulce.org:mce-infral-tagged_blogs |
| Metadata | ||
| Title: | Blog's data from Infral structured, tokenized and tagged into a XML file. | |
| Access Rights: | open access after registration | |
| Audience: | Researchers or teachers in educational sciences or linguistics | |
| Conforms To: | IMS-CP for packaging | |
| Contributor (compiler): | Chanier Thierry | |
| Contributor (data_inputter): | Laurent Mario | |
| Contributor (depositor): | Chanier Thierry | |
| Contributor (editor): | Chanier Thierry | |
| Contributor (researcher): | Laurent Mario ; Chanier Thierry | |
| Creator: | BEGIN:VCARD FN:Mario Laurent N:Laurent;Mario ORG: Universite Blaise Pascal ADR: Clermont-Ferrand;France END:VCARD | |
| Creator (URI): | Mario Laurent | |
| Creator (compiler): | Laurent Mario | |
| Date Created (W3CDTF): | 2011-12-02 | |
| Description: | This corpus is based on data extracted from the global Learning & Teaching Corpus Infral (on LETEC, see http://lrl-diffusion.univ-bpclermont.fr/mulce/metadata/mce_LETECorpus-en.pdf ). Structuring language interactions into exploitable corpora is necessary to analyze the data from the Infral project. To understand the development of intercultural competences, we have to quantify what the different participants produced, for example their language use or lexical diversity. To achieve this, we used the Python programming language and the NLTK library. During the Infral course, participants from a French and a German university communicated in both languages via blogs. We developed a program that converts plain text from Infral's blogs into a structured XML file in which each message is tokenized into words. Each word is tagged according to its form and its original language. | |
| Extent: | 28,500 KB | |
| Format (IMT): | text/xml | |
| application/pdf | ||
| Identifier: | mce-infral-tagged_blogs | |
| Identifier (URI): | http://mulce.univ-bpclermont.fr:8080/PlateFormeMulce/VIEW/PUBLIC/03/VMeta.do?adr=Infral%2FCorpus_objets%2Fmce-infral-tagged_blogs | |
| Language: | French | |
| German | ||
| Language (ISO639): | fra | |
| deu | ||
| Publisher: | BEGIN:VCARD FN:Mulce (MULtimodal Corpus Exchange) ORG: Universite Blaise Pascal ADR: Clermont-Ferrand;France URL:http://mulce.org END:VCARD | |
| References: | Abendroth-Timmer, D., Bechtel, M., Chanier, T. & Ciekanski, M. (2009) "From developing to investigating intercultural competence in practice through oral and written interactions in online exchanges", Kongress für Fremdsprachendidaktik der Deutschen Gesellschaft für Fremdsprachenforschung (DGFF-Tagung), Universität Leipzig, October 2009. [http://edutice.archives-ouvertes.fr/edutice-00548891/] | |
| Abendroth-Timmer, D., Chanier, T., Ciekanski, M., Bechtel, M. & Henning, E.-V. (2010) "Du développement à l’investigation de la compétence interculturelle en pratique à partir des interactions à l’oral et à l’écrit dans des échanges en ligne à distance." Colloque "Plurilingualism and Pluriculturalism in a Globalised World: which Pedagogy?" (PLIDAM), 17-19 June, Paris. | ||
| Laurent, M. (2011). Structuration des données des blogues de la formation Infral à l’aide des outils de programmation Python et NLTK. Report of Master 2 Sciences du Langage, Université Blaise Pascal. | ||
| Chanier, T. & Ciekanski, M. (2010). Utilité du partage des corpus pour l'analyse des interactions en ligne en situation d'apprentissage : un exemple d'approche méthodologique autour d'une base de corpus d'apprentissage. ALSIC - Apprentissage des Langues et Systèmes d'Information et de Communication 13 [http://edutice.archives-ouvertes.fr/edutice-00486676/] | ||
| Reffay, C., Chanier, T., Noras, M. & Betbeder, M.-L. (2008). Contribution à la structuration de corpus d'apprentissage pour un meilleur partage en recherche. In Basque, J. & Reffay, C. (dir.), numéro spécial EPAL (échanger pour apprendre en ligne), Sciences et Technologies de l'Information et de la Communication pour l'Education et la Formation (STICEF), 15. [http://sticef.univ-lemans.fr/num/vol2008/01-reffay/sticef_2008_reffay_01p.pdf , http://edutice.archives-ouvertes.fr/edutice-00159733] | ||
| References (URI): | http://edutice.archives-ouvertes.fr/edutice-00548891/ | |
| http://edutice.archives-ouvertes.fr/edutice-00486676/ | ||
| http://sticef.univ-lemans.fr/num/vol2008/01-reffay/sticef_2008_reffay_01p.pdf | ||
| Requires: | mce-infral-letec-all | |
| Rights: | Rights holders of this corpus are: Thierry Chanier ; Dagmar Abendroth-Timmer; Maud Ciekanski ; Mark Bechtel ; Laurent Mario ; licence = http://creativecommons.org/licenses/by-nc-sa/2.0/ | |
| Rights (URI): | http://lrl-diffusion.univ-bpclermont.fr/mulce/metadata/vdex/mce_licence.xml | |
| Subject: | NLP; XML; telecollaboration ; intercultural; online teaching | |
| French language | ||
| Subject (ISO639): | fra | |
| Subject (LCSH): | Education - Data processing | |
| Computer-assisted instruction | ||
| Language and languages - Study and teaching | ||
| Subject (OLAC): | applied_linguistics | |
| discourse_analysis | ||
| text_and_corpus_linguistics | ||
| Temporal Coverage: | name=Infral course ; start=2008-09-29; end=2009-01-09 | |
| name=Master Project ; start=2011-03-01; end=2011-06-30 | ||
| Type (DCMI): | Dataset | |
| Collection | ||
| Type (Discourse): | dialogue | |
| narrative | ||
| Type (OLAC): | primary_text | |
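
The Description field above outlines a pipeline that turns plain blog text into XML, with each message tokenized into words and each word tagged by form and language. The record does not publish the program itself, so the following is only a minimal sketch of that idea under stated assumptions: it uses a regular-expression tokenizer from the Python standard library in place of NLTK, the element and attribute names (`message`, `w`, `form`, `lang`) are hypothetical, and the two tiny word lists are illustrative stand-ins for whatever language identification the authors actually used.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical mini-lexicons, for illustration only; a real pipeline would
# rely on much larger resources or a proper language identifier.
FRENCH_WORDS = {"je", "le", "la", "les", "et", "est", "un", "une", "bonjour"}
GERMAN_WORDS = {"ich", "der", "die", "das", "und", "ist", "ein", "eine", "hallo"}

# A token is either a run of word characters or one punctuation character.
TOKEN_RE = re.compile(r"\w+|[^\w\s]")


def guess_language(token):
    """Return an ISO 639 code based on the toy lexicons above."""
    lowered = token.lower()
    if lowered in FRENCH_WORDS:
        return "fra"
    if lowered in GERMAN_WORDS:
        return "deu"
    return "und"  # ISO 639 code for "undetermined"


def token_form(token):
    """Classify the surface form of a token."""
    if token.isalpha():
        return "word"
    if token.isdigit():
        return "number"
    if re.fullmatch(r"[^\w\s]", token):
        return "punctuation"
    return "other"


def message_to_xml(message_id, author, text):
    """Build a <message> element with one tagged <w> child per token."""
    message = ET.Element("message", id=message_id, author=author)
    for token in TOKEN_RE.findall(text):
        word = ET.SubElement(
            message, "w", form=token_form(token), lang=guess_language(token)
        )
        word.text = token
    return message


if __name__ == "__main__":
    element = message_to_xml("m1", "student_a", "Bonjour, ich bin Anna.")
    print(ET.tostring(element, encoding="unicode"))
```

Per-participant counts such as language use or lexical diversity, which the Description names as the goal, could then be computed by iterating over the `w` elements and grouping on the `author` and `lang` attributes.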
OLAC Info |
||
| Archive: | Multimodal Learning and teaching Corpora Exchange | |
| Description: | http://www.language-archives.org/archive/mulce.org | |
| GetRecord: | OAI-PMH request for OLAC format | |
| GetRecord: | Pre-generated XML file | |
OAI Info |
||
| OaiIdentifier: | oai:mulce.org:mce-infral-tagged_blogs | |
| DateStamp: | 2012-01-09 | |
| GetRecord: | OAI-PMH request for simple DC format | |
Search Info | ||
| Citation: | Mario Laurent; Laurent Mario; BEGIN:VCARD FN:Mario Laurent N:Laurent;Mario ORG: Universite Blaise Pascal ADR: Clermont-Ferrand;France END:VCARD. 2011. BEGIN:VCARD FN:Mulce (MULtimodal Corpus Exchange) ORG: Universite Blaise Pascal ADR: Clermont-Ferrand;France URL:http://mulce.org END:VCARD. | |
| Terms: | area_Europe country_DE country_FR dcmi_Collection dcmi_Dataset iso639_deu iso639_fra olac_applied_linguistics olac_dialogue olac_discourse_analysis olac_narrative olac_primary_text olac_text_and_corpus_linguistics | |
Inferred Metadata | ||
| Country: | France | |
| Area: | Europe | |