HAL-SHS - Sciences de l'Homme et de la Société
Conference paper, Year: 2018

Evaluating phonemic transcription of low-resource tonal languages for language documentation

Abstract

Transcribing speech is an important part of language documentation, yet speech recognition technology has not been widely harnessed to aid linguists. We explore the use of a neural network architecture with the connectionist temporal classification loss function for phonemic and tonal transcription in a language documentation setting. In this framework, we explore jointly modelling phonemes and tones versus modelling them separately, and assess the importance of pitch information versus phonemic context for tonal prediction. Experiments on two tonal languages, Yongning Na and Eastern Chatino, show the changes in recognition performance as training data is scaled from 10 minutes up to 50 minutes for Chatino, and up to 224 minutes for Na. We discuss the findings from incorporating this technology into the linguistic workflow for documenting Yongning Na, which show the method's promise in improving efficiency, minimizing typographical errors, and maintaining the transcription's faithfulness to the acoustic signal, while highlighting phonetic and phonemic facts for linguistic consideration.
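As a rough illustration of the joint phoneme-and-tone setup mentioned in the abstract, the sketch below shows how a per-frame label path from a CTC-trained model is typically collapsed into a transcription (merge consecutive repeats, then drop blanks), and how a joint output can be split into separate phoneme and tone streams. This is a minimal sketch, not the authors' implementation: the blank symbol `_`, the tone labels `L`, `M`, `H`, the function names, and the example path are all hypothetical.

```python
from itertools import groupby

BLANK = "_"              # hypothetical CTC blank symbol
TONES = {"L", "M", "H"}  # hypothetical tone labels, for illustration only


def ctc_collapse(frame_labels, blank=BLANK):
    """Collapse a per-frame label path: merge consecutive repeats, then drop blanks."""
    return [label for label, _ in groupby(frame_labels) if label != blank]


def split_streams(symbols, tones=TONES):
    """Split a joint phoneme+tone sequence into a phoneme stream and a tone stream."""
    phonemes = [s for s in symbols if s not in tones]
    tone_seq = [s for s in symbols if s in tones]
    return phonemes, tone_seq


# Made-up per-frame best labels; a real path would come from a trained model.
path = ["_", "t", "t", "_", "o", "o", "M", "M", "_", "_", "m", "i", "i", "H", "_"]

joint = ctc_collapse(path)
phonemes, tones = split_streams(joint)
print(joint, phonemes, tones)
```

Because a blank between two identical labels keeps them distinct, a path such as `a _ a` collapses to two `a` symbols rather than one, which is what lets CTC represent repeated phonemes or tones.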
Main file
Adams_et_al2018_LREC.pdf (796.04 KB)
Origin: Files produced by the author(s)
License: CC BY NC SA - Attribution - NonCommercial - ShareAlike

Dates and versions

halshs-01709648 , version 1 (15-02-2018)
halshs-01709648 , version 2 (20-02-2018)
halshs-01709648 , version 3 (25-02-2018)
halshs-01709648 , version 4 (05-03-2018)

License

Attribution - NonCommercial - ShareAlike

Identifiers

  • HAL Id: halshs-01709648, version 4

Cite

Oliver Adams, Trevor Cohn, Graham Neubig, Hilaria Cruz, Steven Bird, et al. Evaluating phonemic transcription of low-resource tonal languages for language documentation. Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018), May 2018, Miyazaki, Japan. pp. 3356-3365. ⟨halshs-01709648v4⟩