
À la croisée des langues. Annotation et fouille de corpus plurilingues

Abstract : As part of a research programme on language contact phenomena and their role in linguistic change, an effort is currently under way to collect plurilingual corpora exhibiting a great variety of contact phenomena across a sample of languages with diverse genetic and typological backgrounds. This has required developing specific document-processing software for digital corpora with internal plurilingualism, in order to represent, store, annotate, and visualize their linguistic data, and to build data mining tools. Existing encoding standards have been extended to cope with phenomena such as speech segments "floating" between languages, which occur in plurilingual talk. In this article, we describe the structure that has been defined for the plurilingual corpora, and the underlying definition of plurilingual linguistic units used for statistical analysis of the corpora.
Document type :
Journal articles

Cited literature : 33 references
Contributor : Isabelle Léglise
Submitted on : Thursday, September 11, 2014 - 11:56:34 AM
Last modification on : Tuesday, November 16, 2021 - 4:53:17 AM
Long-term archiving on : Friday, December 12, 2014 - 10:24:31 AM


Files produced by the author(s)


  • HAL Id : halshs-01063067, version 1


Pascal Vaillant, Isabelle Léglise. À la croisée des langues. Annotation et fouille de corpus plurilingues. Revue des Nouvelles Technologies de l'Information, Editions RNTI, 2014, RNTI-SHS-2, pp.81-100. ⟨halshs-01063067⟩


