Conference paper. Year: 2013

A methodological framework to reveal and account for heterogeneity in multilingual and multi-stylistic annotated corpora

Abstract

Although we know that multilingualism, linguistic profusion and even linguistic bricolage are far from uncommon, linguistic diversity is not gaining visibility in either corpus linguistics or contact linguistics. Heterogeneous and hard to classify, linguistic practices both result from and reveal what Vertovec (2007) and Blommaert and Rampton (2011) call superdiversity. Research on language contact tends to examine isolated, elicited sample sentences and to describe contact-induced language changes (Thomason 2001; Heine & Kuteva 2005). Research on codeswitching, which focuses either on its linguistic structure (Myers-Scotton 2002) or on its social meaning (Auer 1998), deals with synchronic recordings of spontaneous speech but mostly assumes bounded languages or repertoires (matrix language, code-alternation, etc.). One of the goals of the CLAPOTY project is to reveal and explain linguistic heterogeneity in spontaneous speech recordings. Because we bring together different research traditions (contact linguistics, anthropological linguistics, pragmatics, sociolinguistics and corpus linguistics), the project aims to integrate different perspectives when explaining the specific forms that appear. Heterogeneity becomes obvious as soon as data are annotated. To date, we have collected and annotated corpora involving linguistic forms attributable to 40 typologically diverse languages/varieties, produced by 290 ordinary speakers (meaning plurilingual and plurilectal speakers using a wide range of linguistic resources) in various everyday situations around the world (Mexico, Taiwan, Senegal, the Balkans, French Guiana and the Antilles, among others). Our recordings involve multilingual turns and multi-participant interactions in which different styles may occur.
In this talk, I will discuss the methodology we have set up to analyze the various phenomena found in such heterogeneous corpora (contact-induced language variation and change, codeswitching, code-mixing, nonstandard practices, styles of doing being multilingual, interlanguage, crossing, etc.) and discuss their categorization. I will explain how we treat a) the linguistic resources and b) the specific communicative context of each interaction.
No file deposited

Dates and versions

halshs-00924925, version 1 (07-01-2014)

Identifiers

  • HAL Id: halshs-00924925, version 1

Cite

Isabelle Léglise. A methodological framework to reveal and account for heterogeneity in multilingual and multi-stylistic annotated corpora. International conference on Superdiversity, Jun 2013, Jyväskylä, Finland. ⟨halshs-00924925⟩