E. M. Bender and B. Friedman, Data statements for natural language processing: Toward mitigating system bias and enabling better science, Transactions of the Association for Computational Linguistics, vol.6, pp.587-604, 2018.

T. Bolukbasi, K. Chang, J. Y. Zou, V. Saligrama, and A. T. Kalai, Man is to computer programmer as woman is to homemaker? Debiasing word embeddings, Proceedings of the 30th Conference on Neural Information Processing Systems, NIPS 2016, pp.4349-4357, 2016.

J. Buolamwini and T. Gebru, Gender shades: Intersectional accuracy disparities in commercial gender classification, Proceedings of the Conference on Fairness, Accountability, and Transparency, pp.77-91, 2018.

A. Caliskan, J. J. Bryson, and A. Narayanan, Semantics derived automatically from language corpora contain human-like biases, Science, vol.356, issue.6334, pp.183-186, 2017.

A. Couillault, K. Fort, G. Adda, and H. de Mazancourt, Evaluating corpora documentation with regards to the ethics and big data charter, Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14), 2014.
URL : https://hal.archives-ouvertes.fr/hal-00969180

K. Crawford, The trouble with bias, Keynote, 2017. Available on YouTube, 2020.

CSA, La représentation des femmes à la télévision et à la radio, 2017.

D. Doukhan, J. Carrive, F. Vallet, A. Larcher, and S. Meignier, An open-source speaker gender detection framework for monitoring gender equality, Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, pp.5214-5218, 2018.
URL : https://hal.archives-ouvertes.fr/hal-01927560

M. Garnerin, S. Rossato, and L. Besacier, Gender representation in French broadcast corpora and its impact on ASR performance, Proceedings of the 1st International Workshop on AI for Smart TV Content Production, Access and Delivery, AI4TV '19, pp.3-9, 2019.
URL : https://hal.archives-ouvertes.fr/halshs-02899392

F. Hernandez, V. Nguyen, S. Ghannay, N. Tomashenko, and Y. Estève, TED-LIUM 3: twice as much data and corpus repartition for experiments on speaker adaptation, International Conference on Speech and Computer, pp.198-208, 2018.

D. Hovy and S. L. Spruit, The social impact of Natural Language Processing, Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, vol.2, pp.591-598, 2016.

S. S. Juan, L. Besacier, B. Lecouteux, and M. Dyab, Using resources from a closely-related language to develop ASR for a very under-resourced language: a case study for Iban, Proceedings of the 16th Annual Conference of the International Speech Communication Association (INTERSPEECH15), 2015.
URL : https://hal.archives-ouvertes.fr/hal-01170493

S. Macharia, L. Ndangam, M. Saboor, E. Franke, S. Parr et al., Who makes the news, Global Media Monitoring Project (GMMP), 2015.

M. Mitchell, S. Wu, A. Zaldivar, P. Barnes, L. Vasserman et al., Model cards for model reporting, Proceedings of the Conference on Fairness, Accountability, and Transparency, pp.220-229, 2019.

C. Nass and S. Brave, Wired for Speech: How Voice Activates and Advances the Human-computer Relationship, 2005.

V. Panayotov, G. Chen, D. Povey, and S. Khudanpur, Librispeech: an ASR corpus based on public domain audio books, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp.5206-5210, 2015.

E. Vanmassenhove, C. Hardmeier, and A. Way, Getting gender right in neural machine translation, Proceedings of the Conference on Empirical Methods in Natural Language Processing, pp.3003-3008, 2018.

M. West, R. Kraut, and H. E. Chew, I'd blush if I could: closing gender divides in digital skills through education, 2019.

M. D. Wilkinson, M. Dumontier, I. J. Aalbersberg, G. Appleton, M. Axton et al., The FAIR guiding principles for scientific data management and stewardship, Scientific Data, vol.3, 2016.

H. Zen, V. Dang, R. Clark, Y. Zhang, R. J. Weiss et al., LibriTTS: A corpus derived from LibriSpeech for text-to-speech, 2019.

Language Resource References

Google, Crowdsourced high-quality UK and Ireland English Dialect speech data set. Google, distributed via OpenSLR, 2019.

C. D. Hernandez-Mena, TEDx Spanish Corpus. Audio and transcripts in Spanish taken from the TEDx Talks; shared under the CC BY-NC-ND 4.0 license, 2019.

M. Korvas, O. Plátek, O. Dušek, L. Žilka et al., Free English and Czech telephone speech corpus shared under the CC-BY-SA 3.0 license. Distributed via OpenSLR, 2014.

SurfingTech, Free ST Chinese Mandarin Corpus. SurfingTech, distributed via OpenSLR, n.d.