Journal articles

À la croisée des langues. Annotation et fouille de corpus plurilingues

Abstract: As part of a research programme on language contact phenomena and their role in linguistic change, an effort is under way to collect plurilingual corpora exhibiting a wide variety of contact phenomena across a sample of languages with diverse genetic and typological backgrounds. This has required developing dedicated document processing software for digital corpora with internal plurilingualism, in order to represent, store, annotate, and visualize their linguistic data, and to build data mining tools. Existing encoding standards have been extended to handle phenomena such as speech segments "floating" between languages, which occur in plurilingual talk. In this article, we describe the structure defined for the plurilingual corpora and the underlying definition of plurilingual linguistic units used for statistical analysis of the corpora.
Document type: Journal articles

Cited literature: 33 references
Contributor: Isabelle Léglise
Submitted on: Thursday, September 11, 2014 - 11:56:34 AM
Last modification on: Saturday, February 15, 2020 - 1:44:56 AM
Long-term archiving on: Friday, December 12, 2014 - 10:24:31 AM


Files produced by the author(s)


  • HAL Id : halshs-01063067, version 1


Pascal Vaillant, Isabelle Léglise. À la croisée des langues. Annotation et fouille de corpus plurilingues. Revue des Nouvelles Technologies de l'Information, Hermann, 2014, RNTI-SHS-2, pp.81-100. ⟨halshs-01063067⟩


