Conference papers

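Les modèles pré-entraînés à l'épreuve des langues rares : expériences de reconnaissance de mots sur la langue japhug (sino-tibétain) [Pre-trained models put to the test of rare languages: word recognition experiments on Japhug (Sino-Tibetan)]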

Abstract: This paper reports the latest results of an interdisciplinary effort to support "fundamental language documentation" with speech recognition tools. The focus is the development of a speech recognition system for Japhug, an endangered minority language of China, with the practical goal of reducing the transcription workload of field linguists. We show that a new deep learning approach, in which XLS-R, a generic pre-trained representation model built on a Transformer architecture, is fine-tuned for the target language, significantly improves the quality of phonemic transcription in a setting where only a few hours of annotated data are available. Most significantly, this method makes it possible to reach the stage of automatic word recognition. We nevertheless note implementation difficulties relating to training stability. The evaluation of the tool by field linguists is also discussed.
Document type: Conference papers

https://halshs.archives-ouvertes.fr/halshs-03625580
Contributor: Alexis Michaud
Submitted on: Thursday, March 31, 2022 - 5:42:54 AM
Last modification on: Thursday, May 5, 2022 - 11:57:57 AM

File

JEP2022_Transformers_Japhug.pd...
Files produced by the author(s)

Licence


Distributed under a Creative Commons Attribution - NonCommercial - ShareAlike 4.0 International License

Identifiers

  • HAL Id : halshs-03625580, version 1

Citation

Séverine Guillaume, Guillaume Wisniewski, Cécile Macaire, Guillaume Jacques, Alexis Michaud, et al. Les modèles pré-entraînés à l'épreuve des langues rares : expériences de reconnaissance de mots sur la langue japhug (sino-tibétain). 34e Journées d’Études sur la Parole (JEP2022), Jun 2022, Noirmoutier, France. ⟨halshs-03625580⟩
