?. and H. ??, ?????"le+V+se"?????????-???????, 2016.

O. Adams, Automatic understanding of unwritten languages. Melbourne: The University of Melbourne, 2017.

O. Adams, T. Cohn, G. Neubig, and H. Cruz, Evaluating phonemic transcription of low-resource tonal languages for language documentation, Proceedings of the 11th Language Resources and Evaluation Conference (LREC 2018), pp.3356-3365, 2018.
URL : https://hal.archives-ouvertes.fr/halshs-01709648

O. Adams, A. Makarucha, G. Neubig, S. Bird, and T. Cohn, Cross-lingual word embeddings for low-resource language modeling, Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics, vol.1, pp.937-947, 2017.

O. Adams, G. Neubig, T. Cohn, and S. Bird, Inducing bilingual lexicons from small quantities of sentence-aligned phonemic transcriptions, Proceedings of the International Workshop on Spoken Language Translation (IWSLT 2015). Da Nang, 2015.

G. Adda, S. Stüker, M. Adda-Decker, O. Ambouroue, L. Besacier et al., Breaking the unwritten language barrier: The BULB Project, Procedia Computer Science, vol.81, pp.8-14, 2016.
URL : https://hal.archives-ouvertes.fr/halshs-01428027

M. Adda-Decker, De la reconnaissance automatique de la parole à l'analyse linguistique de corpus oraux, Actes des XXVIe Journées d'Etude de la Parole, pp.389-400, 2006.

M. Bacchiani and B. Roark, Unsupervised language model adaptation, Proceedings of the 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, pp.224-227, 2003.

L. Besacier and E. Barnard, Automatic speech recognition for under-resourced languages: A survey, Speech Communication, vol.56, pp.85-100, 2014.
URL : https://hal.archives-ouvertes.fr/hal-00953644

S. Bird and D. Chiang, Machine translation for language preservation, Proceedings of COLING 2012: Posters. Mumbai: The COLING 2012 Organizing Committee, pp.125-134, 2012.

S. Bird, F. R. Hanke, O. Adams, and H. Lee, Aikuma: A mobile app for collaborative language documentation, Proceedings of the 2014 Workshop on the Use of Computational Methods in the Study of Endangered Languages, pp.1-5, 2014.

D. Blachon, E. Gauthier, L. Besacier, G. Kouarata, M. Adda-Decker, and A. Rialland, Parallel speech collection for under-resourced language studies using the LIG-AIKUMA mobile device app, Procedia Computer Science, vol.81, pp.61-66, 2016.
URL : https://hal.archives-ouvertes.fr/hal-01350065

J. Blevins, Endangered sound patterns: Three perspectives on theory and description, Language Documentation & Conservation, vol.1, issue.1, pp.1-16, 2007.

R. Blokland, M. Fedina, C. Gerstenberger, N. Partanen, and M. Wilbur, Language documentation meets language technology, Proceedings of the First International Workshop on Computational Linguistics for Uralic Languages, pp.8-18, 2015.

R. Bonnet, C. Buret, A. François, B. Galliot, S. Guillaume et al., Vers des ressources électroniques interconnectées: Lexica, les dictionnaires de la collection Pangloss, pp.48-51, 2017.

L. Bouquiaux and J. M. C. Thomas, Enquête et description des langues à tradition orale. Volume I: l'enquête de terrain et l'analyse grammaticale, Société d'Études Linguistiques et Anthropologiques de France, 1971.

D. Bourcier and M. Dulong de Rosnay, 2004.

M. Brunelle, D. Chow, and T. N. U. Nguyễn, Effects of lexical frequency and lexical category on the duration of Vietnamese syllables, Proceedings of ICPhS XVIII, 2015.

M. Ćavar, D. Ćavar, and H. Cruz, Endangered language documentation: Bootstrapping a Chatino speech corpus, forced aligner, ASR, Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016), pp.4004-4011, 2016.

S. Collins, N. Harrower, D. Haug, B. Immenhauser, G. Lauer et al., Going digital: Creating change in the Humanities, 2015.

E. Cruz, Phonology, tone and the functions of tone in San Juan Quiahije Chatino. Austin: University of Texas at Austin. Doctoral dissertation, 2011.

E. Cruz and . Woodbury, Using automatic alignment to analyze endangered language data: Testing the viability of untrained alignment, Las Memorias del Congreso de Idiomas Indígenas de Latinoamérica-II, vol.134, pp.2235-2246, 2006.

R. M. Dixon, Field linguistics: A minor manual, Sprachtypologie und Universalienforschung, vol.60, issue.1, pp.12-31, 2007.

T. N. D. Do, E. Castelli, and L. Besacier, Mining parallel data from comparable corpora via triangulation, Proceedings of the International Conference on Asian Language Processing, pp.185-188, 2011.
URL : https://hal.archives-ouvertes.fr/hal-00959145

T. N. D. Do, A. Michaud, and E. Castelli, Towards the automatic processing of Yongning Na (Sino-Tibetan): Developing a "light" acoustic model of the target language and testing "heavyweight" models from five national languages, Proceedings of the 4th International Workshop on Spoken Language Technologies for Under-resourced Languages, pp.153-160, 2014.
URL : https://hal.archives-ouvertes.fr/halshs-00980431

R. Dobbs and M. La, The two-level tonal system of Lataddi Narua, Linguistics of the Tibeto-Burman Area, vol.39, issue.1, pp.67-104, 2016.

P. Fabre, Un procédé électrique percutané d'inscription de l'accolement glottique au cours de la phonation: Glottographie de haute fréquence, pp.66-69, 1957.

G. Fant, Acoustic theory of speech production, with calculations based on X-ray studies of Russian articulations, 1960.

M. Ferlus, Formation des registres et mutations consonantiques dans les langues mon-khmer, Mon-Khmer Studies, vol.8, pp.1-76, 1979.

C. J. Fillmore, Corpus linguistics or computer-aided armchair linguistics, Directions in corpus linguistics: Proceedings of Nobel Symposium, vol.82, pp.35-60, 1992.

A. Fourcin, First applications of a new laryngograph, Medical and Biological Illustration, vol.21, pp.172-182, 1971.

J. Gao, Interdependence between tones, segments and phonation types in Shanghai Chinese, 2015.

D. Garcia-Romero, D. Snyder, G. Sell et al., Speaker diarization using deep neural network embeddings, Proceedings of the 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp.4930-4934, 2017.

L. Georgeton and C. Fougeron, Domain-initial strengthening on French vowels and phonological contrasts: Evidence from lip articulation and spectral variation, Journal of Phonetics, vol.46, pp.128-146, 2014.
URL : https://hal.archives-ouvertes.fr/halshs-01402718

J. Goldman, EasyAlign: An automatic phonetic alignment tool under Praat, Proceedings of the 12th Annual Conference of the International Speech Communication Association, pp.3233-3236, 2011.

A. Graves, A. Mohamed, and G. Hinton, Speech recognition with deep recurrent neural networks, Proceedings of the 2013 IEEE International Conference on Acoustics, Speech and Signal Processing, pp.6645-6649, 2013.
DOI : 10.1109/icassp.2013.6638947
URL : http://learning.cs.toronto.edu/~hinton/absps/RNN13.pdf

S. Green, J. Heer, and C. D. Manning, The efficacy of human post-editing for language translation, Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pp.439-448, 2013.

R. Grusin, The dark side of Digital Humanities: Dispatches from two recent MLA conventions, Differences, vol.25, issue.1, pp.79-92, 2014.

D. Helbing, B. S. Frey, G. Gigerenzer, E. Hafen, M. Hagner et al., Will democracy survive Big Data and Artificial Intelligence?, Scientific American, 2017.

T. Hirata-Edds and D. Herrick, Building tone resources for second language learners from phonetic documentation: Cherokee examples, Language Documentation & Conservation, vol.11, pp.289-304, 2017.

B. Huáng, Journal of Sino-Tibetan Linguistics, vol.3, pp.30-55, 2009.

K. Johnson, Decisions and mechanisms in exemplar-based phonology, Experimental approaches to phonology, pp.25-40, 2007.

L. M. Johnson, M. Di Paolo, and A. Bell, Forced alignment for understudied language varieties: Testing Prosodylab-Aligner with Tongan data, Language Documentation & Conservation, vol.12, pp.80-123, 2018.

A. Katsamanis, M. Black, P. Georgiou, L. Goldstein, and S. Narayanan, SailAlign: Robust long speech-text alignment, Proceedings of the Workshop on New Tools and Methods for Very-Large Scale Phonetics Research, 2011.

K. Kinoshita, M. Delcroix, S. Gannot, E. A. P. Habets, R. Haeb-Umbach et al., A summary of the REVERB challenge: State-of-the-art and remaining challenges in reverberant speech processing research, EURASIP Journal on Advances in Signal Processing, vol.2016, issue.1, 2016.

M. Kurimo, S. Enarvi, O. Tilk, and M. Varjokallio, Modeling under-resourced languages for speech recognition, Language Resources and Evaluation, vol.51, issue.4, pp.961-987, 2017.
DOI : 10.1007/s10579-016-9336-9

P. Lamere, P. Kwok, W. Walker, E. B. Gouvêa, and R. Singh, Design of the CMU sphinx-4 decoder, Proceedings of the 8th European Conference on Speech Communication and Technology, 2003.

M. P. Lewis, G. F. Simons, and C. D. Fennig, Languages of China: An Ethnologue country report, 19th edition. Dallas: SIL International, 2016.

M. Liberman, The first DIHARD speech diarization challenge, 2018.

L. Lidz, A descriptive grammar of Yongning Na (Mosuo), 2010.

L. Lidz and A. Michaud, Yongning Na (Mosuo): Language documentation in the Sino-Tibetan borderland, Presented at the International Conference on Sino-Tibetan Languages and Linguistics, pp.403-439, 1990.

Y. Ma and C. Bao, Overlapping speech detection using high-level information features, Journal of Tsinghua University (Science and Technology), vol.57, issue.1, pp.79-83, 2017.

B. Michailovsky, M. Mazaudon, A. Michaud, and S. Guillaume, Documenting and researching endangered languages: The Pangloss Collection, Language Documentation & Conservation, vol.8, pp.119-135, 2014.
URL : https://hal.archives-ouvertes.fr/halshs-01003734

A. Michaud, , 2015.

A. Michaud, Tone in Yongning Na: Lexical tones and morphotonology (Studies in Diversity Linguistics 13). Berlin: Language Science Press, 2017.
URL : https://hal.archives-ouvertes.fr/halshs-01094049

A. Michaud, Speech recognition for newly documented languages: Highly encouraging tests using automatically generated phonemic transcription of Yongning Na audio recordings. HimalCo-Himalayan Corpora, 2017.

A. Michaud, A. Hardie, S. Guillaume, and M. Toda, Combining documentation and research: Ongoing work on an endangered language, Proceedings of IALP 2012 (2012 International Conference on Asian Language Processing), pp.169-172, 2012.
DOI : 10.1109/ialp.2012.32
URL : https://hal.archives-ouvertes.fr/halshs-00731261

A. Michaud and G. Jacques, The phonology of Laze: Phonemic analysis, syllabic inventory, and a short word list, Yuyanxue Luncong 语言学论丛, vol.45, pp.196-230, 2012.
URL : https://hal.archives-ouvertes.fr/halshs-00582639

A. Michaud and J. Vaissière, Tone and intonation: Introductory notes and practical recommendations, KALIPHO - Kieler Arbeiten zur Linguistik und Phonetik, vol.3, pp.43-80, 2015.
URL : https://hal.archives-ouvertes.fr/halshs-01091477

A. Miller and M. Elsner, Click reduction in fluent speech: A semiautomated analysis of Mangetti Dune !Xung, Proceedings of the 2nd Workshop on the Use of Computational Methods in the Study of Endangered Languages, pp.107-115, 2017.

G. Montavon, W. Samek, and K. Müller, Methods for interpreting and understanding deep neural networks, Digital Signal Processing, vol.73, pp.1-15, 2017.
DOI : 10.1016/j.dsp.2017.10.011
URL : https://doi.org/10.1016/j.dsp.2017.10.011

G. Neubig, M. Mimura, S. Mori, and T. Kawahara, Learning a language model from continuous speech, Proceedings of the Eleventh Annual Conference of the International Speech Communication Association (Interspeech 2010), pp.1053-1056, 2010.

P. Newman and M. Ratliff, Linguistic fieldwork, 2001.

O. Niebuhr and K. J. Kohler, Perception of phonetic detail in the identification of highly reduced words, Journal of Phonetics, vol.39, issue.3, pp.319-329, 2011.

O. Niebuhr and A. Michaud, Speech data acquisition: The underestimated challenge, KALIPHO - Kieler Arbeiten zur Linguistik und Phonetik, vol.3, pp.1-42, 2015.
URL : https://hal.archives-ouvertes.fr/halshs-01026295

B. Remijsen, O. G. Ayoker, and T. Mills, Shilluk, Journal of the International Phonetic Association, vol.41, issue.1, pp.111-125, 2011.

M. U. Scherer, Regulating artificial intelligence systems: Risks, challenges, competencies, and strategies, Harvard Journal of Law & Technology, vol.29, issue.2, 2016.
DOI : 10.2139/ssrn.2609777

T. Schultz and A. Waibel, Language-independent and language-adaptive acoustic modeling for speech recognition, Speech Communication, vol.35, pp.31-51, 2001.
DOI : 10.1016/s0167-6393(00)00094-7
URL : http://www.ri.cmu.edu/pub_files/pub3/schultz_tanja_2001_3/schultz_tanja_2001_3.pdf

M. L. Seltzer, D. Yu, and Y. Wang, An investigation of deep neural networks for noise robust speech recognition, Proceedings of the 2013 IEEE International Conference on Acoustics, Speech and Signal Processing, pp.7398-7402, 2013.
DOI : 10.1109/icassp.2013.6639100

R. Smith and S. Hawkins, Production and perception of speaker-specific phonetic detail at word boundaries, Journal of Phonetics, vol.40, issue.2, pp.213-233, 2012.
DOI : 10.1016/j.wocn.2011.11.003
URL : http://eprints.gla.ac.uk/45862/1/45862.pdf

L. Souag, Linguistic fieldwork: A practical guide, Language Documentation & Conservation, vol.5, pp.66-68, 2008.

M. Sperber, G. Neubig, and C. Fügen, Efficient speech transcription through respeaking, Proceedings of Interspeech 2013, pp.1087-1091, 2013.

M. Sperber, G. Neubig, S. Nakamura, and A. Waibel, Speech Communication, 2017.

F. Stahlberg, T. Schlippe, S. Vogel, and T. Schultz, Towards automatic speech recognition without pronunciation dictionary, transcribed speech and text resources in the target language using cross-lingual word-to-phoneme alignment, Proceedings of the International Workshop on Spoken Language Technologies for Under-Resourced Languages, 2014.
DOI : 10.1016/j.csl.2014.10.001

K. Stevens, Acoustic phonetics, 1998.
DOI : 10.1121/1.1327577

J. Strunk, F. Schiel, and F. Seifart, Untrained forced alignment of transcriptions and audio for language documentation corpora using WebMAUS, Proceedings of the Ninth International Conference on Language Resources and Evaluation, pp.3940-3947, 2014.

N. Thieberger, Language Documentation & Conservation, vol.11, 2017.

N. Thieberger, A. Margetts, S. Morey, and S. Musgrave, Assessing annotated corpora as research output, Australian Journal of Linguistics, vol.36, issue.1, pp.1-21, 2016.
DOI : 10.1080/07268602.2016.1109428

N. Thieberger and R. Nordlinger, Doing great things with small languages (Australian Research Council grant DP0984419), 2006.

J. Vaissière, On the acoustic and perceptual characterization of reference vowels in a cross-language perspective, Proceedings of ICPhS XVII, 2011.

J. Vaissière, Proposals for a representation of sounds based on their main acoustico-perceptual properties, Tones and features, pp.306-330, 2011.

Z. Wang, T. Schultz, and A. Waibel, Comparison of acoustic model adaptation techniques on non-native speech, Proceedings of the 2003 IEEE International Conference on Acoustics, Speech and Signal Processing, pp.540-543, 2003.

C. Wasson, G. Holton, and H. S. Roth, Bringing user-centered design to the field of language archives, Language Documentation & Conservation, vol.10, pp.641-681, 2016.

M. D. Wilkinson, M. Dumontier, I. J. Aalbersberg, G. Appleton, M. Axton et al., The FAIR Guiding Principles for scientific data management and stewardship, Scientific Data, vol.3, 2016.
DOI : 10.1038/sdata.2016.18
URL : http://www.nature.com/articles/sdata201618.pdf

B. Winter, The other N: The role of repetitions and items in the design of phonetic experiments, Proceedings of the 18th International Congress of Phonetic Sciences, 2015.

T. Woodbury, Defining documentary linguistics, Language documentation and description, vol.1, pp.35-51, 2003.

Alexis Michaud, Oliver Adams, Trevor Anthony Cohn, Graham Neubig (gneubig@cs.cmu.edu), Séverine Guillaume