Towards the automatic processing of Yongning Na (Sino-Tibetan): developing a 'light' acoustic model of the target language and testing 'heavyweight' models from five national languages

Thi-Ngoc-Diep Do (1, *), Alexis Michaud (2), Eric Castelli (1)
* Corresponding author
1 Speech Communication
MICA - International Research Institute MICA
2 Speech Communication
MICA - International Research Institute MICA
Abstract: Automatic speech processing technologies hold great potential to facilitate the urgent task of documenting the world's languages. The present research explores the application of speech recognition tools to a little-documented language, with a view to facilitating annotation, transcription and linguistic analysis. The target language is Yongning Na (a.k.a. Mosuo), an unwritten Sino-Tibetan language with fewer than 50,000 speakers. An acoustic model of Na was built using CMU Sphinx. In addition to this 'light' model, trained on a small data set (only 4 hours of speech from 1 speaker), 'heavyweight' models from five national languages (English, French, Chinese, Vietnamese and Khmer) were also applied to the same data. Preliminary results are reported, and perspectives for the long road ahead are outlined.
Document type:
Conference paper
4th International Workshop on Spoken Language Technologies for Under-resourced Languages (SLTU 2014), May 2014, St Petersburg, Russia. pp.153-160, 2014

https://halshs.archives-ouvertes.fr/halshs-00980431
Contributor: Alexis Michaud
Submitted on: Sunday, 25 May 2014 - 16:04:32
Last modified on: Tuesday, 27 March 2018 - 14:16:03
Document(s) archived on: Tuesday, 11 April 2017 - 01:38:18

File

SLTU2014_Do_Michaud_Castelli_F...
Files produced by the author(s)

Identifiers

  • HAL Id : halshs-00980431, version 2

Citation

Thi-Ngoc-Diep Do, Alexis Michaud, Eric Castelli. Towards the automatic processing of Yongning Na (Sino-Tibetan): developing a 'light' acoustic model of the target language and testing 'heavyweight' models from five national languages. 4th International Workshop on Spoken Language Technologies for Under-resourced Languages (SLTU 2014), May 2014, St Petersburg, Russia. pp.153-160, 2014. 〈halshs-00980431v2〉
