Towards the automatic processing of Yongning Na (Sino-Tibetan): developing a 'light' acoustic model of the target language and testing 'heavyweight' models from five national languages

Thi-Ngoc-Diep Do; Alexis Michaud; Eric Castelli

Conference paper. Year: 2014

Thi-Ngoc-Diep Do

Role: Corresponding author
PersonId: 955335

Speech Communication

Alexis Michaud

Role: Author
PersonId: 419
IdHAL: alexis-michaud
ORCID: 0000-0003-1165-2680
IdRef: 095131507

Speech Communication

Eric Castelli

Role: Author
PersonId: 750232
IdHAL: eric-castelli
ORCID: 0000-0003-2978-2619
IdRef: 068256256

Speech Communication

Abstract

Automatic speech processing technologies hold great potential to facilitate the urgent task of documenting the world's languages. The present research aims to explore the application of speech recognition tools to a little-documented language, with a view to facilitating processes of annotation, transcription and linguistic analysis. The target language is Yongning Na (a.k.a. Mosuo), an unwritten Sino-Tibetan language with fewer than 50,000 speakers. An acoustic model of Na was built using CMU Sphinx. In addition to this 'light' model, trained on a small data set (only 4 hours of speech from 1 speaker), 'heavyweight' models from five national languages (English, French, Chinese, Vietnamese and Khmer) were also applied to the same data. Preliminary results are reported, and perspectives for the long road ahead are outlined.

Keywords

Acoustic models; automatic speech recognition (ASR); multilingual modelling; under-resourced languages; endangered languages; Yongning Na; Naish languages; language portability; statistical language modeling; crosslingual acoustic modelling and adaptation

Domains

Linguistics; Signal and image processing [eess.SP]

Main file

SLTU2014_Do_Michaud_Castelli_FINAL.pdf (325.18 KB)

Origin: Files produced by the author(s)

Contributor: Alexis Michaud

https://shs.hal.science/halshs-00980431

Submitted on: Sunday, 25 May 2014, 16:04:32

Last modified on: Thursday, 4 April 2024, 21:13:44

Long-term archiving on: Tuesday, 11 April 2017, 01:38:18

Dates and versions

halshs-00980431, version 1 (18-04-2014)

halshs-00980431, version 2 (25-05-2014)

Identifiers

HAL Id: halshs-00980431, version 2

Cite

Thi-Ngoc-Diep Do, Alexis Michaud, Eric Castelli. Towards the automatic processing of Yongning Na (Sino-Tibetan): developing a 'light' acoustic model of the target language and testing 'heavyweight' models from five national languages. 4th International Workshop on Spoken Language Technologies for Under-resourced Languages (SLTU 2014), May 2014, St Petersburg, Russia. pp. 153-160. ⟨halshs-00980431v2⟩

Collections

UGA CNRS GENCI ASIES_ET_PACIFIQUE ANR

3130 views

352 downloads
