Conference papers

Automatic Extraction of Verb Paradigms in Regional Languages: the case of the Linguistic Crescent varieties

Abstract : An important and costly step in language documentation is the transcription (total or partial) of speech data collected in the field. Several projects adopt a methodology based on speech transcription systems (Adda et al. 2016; Michaud et al. 2018); in such an approach, the systems must be adapted so that they can transcribe (at least phonetically) speech collected during fieldwork. However, some of the data gathered come with either an approximate transcription (e.g. in the case of read speech) or more or less precise information about their content. Verb conjugations are an example of the latter: the linguist proposes a verb, and the informant must give all the possible inflections, most often in a fixed order of tenses and persons. This paper explores whether a transcription system developed for a given language (here French) can be used, without precise adaptation of its acoustic models, to produce both the segmentation and the transcription of verbal paradigms in a closely related language (here several Romance varieties spoken in central France), and the conditions under which its output will or will not require post-processing.
Document type :
Conference papers

Contributor : Nicolas Quint
Submitted on : Monday, May 18, 2020 - 11:28:43 AM
Last modification on : Wednesday, March 16, 2022 - 3:50:39 AM




  • HAL Id : halshs-02508210, version 1


Elena Knyazeva, Gilles Adda, Philippe Boula de Mareüil, Maximilien Guérin, Nicolas Quint. Automatic Extraction of Verb Paradigms in Regional Languages: the case of the Linguistic Crescent varieties. SLTU (Spoken Language Technologies for Under-resourced languages), European Language Resources Association (ELRA), Jan 2020, Marseille, France. pp.245-249. ⟨halshs-02508210⟩


