Towards the automatic processing of Yongning Na (Sino-Tibetan): developing a 'light' acoustic model of the target language and testing 'heavyweight' models from five national languages - HAL Accéder directement au contenu
Communication dans un congrès Année : 2014

Towards the automatic processing of Yongning Na (Sino-Tibetan): developing a 'light' acoustic model of the target language and testing 'heavyweight' models from five national languages

Résumé

Automatic speech processing technologies hold great potential to facilitate the urgent task of documenting the world's languages. The present research aims to explore the application of speech recognition tools to a little-documented language, with a view to facilitating processes of annotation, transcription and linguistic analysis. The target language is Yongning Na (a.k.a. Mosuo), an unwritten Sino-Tibetan language with less than 50,000 speakers. An acoustic model of Na was built using CMU Sphinx. In addition to this 'light' model, trained on a small data set (only 4 hours of speech from 1 speaker), 'heavyweight' models from five national languages (English, French, Chinese, Vietnamese and Khmer) were also applied to the same data. Preliminary results are reported, and perspectives for the long road ahead are outlined.
Fichier principal
Vignette du fichier
SLTU2014_Do_Michaud_Castelli_FINAL.pdf ( 325.17 Ko ) Télécharger
Origine : Fichiers produits par l'(les) auteur(s)
Loading...

Dates et versions

halshs-00980431, version 1 (18-04-2014)
halshs-00980431, version 2 (25-05-2014)

Identifiants

  • HAL Id : halshs-00980431 , version 2

Citer

Thi-Ngoc-Diep Do, Alexis Michaud, Eric Castelli. Towards the automatic processing of Yongning Na (Sino-Tibetan): developing a 'light' acoustic model of the target language and testing 'heavyweight' models from five national languages. 4th International Workshop on Spoken Language Technologies for Under-resourced Languages (SLTU 2014), May 2014, St Petersburg, Russia. pp.153-160. ⟨halshs-00980431v2⟩
3130 Consultations
352 Téléchargements
Dernière date de mise à jour le 20/04/2024
comment ces indicateurs sont-ils produits

Partager

Gmail Facebook Twitter LinkedIn Plus