Extraire et encoder l'information lexicale de Wiktionary : quel boulot pour étrangler le goulot ! [Extracting and encoding lexical information from Wiktionary: what a job to choke the bottleneck!]

Abstract: This article presents a decade-long effort to use the content of the collaborative dictionary Wiktionary to build free lexical resources. Its main result is the design of machine-readable dictionaries and inflectional lexicons for three languages (French, Italian and English). We question the usefulness of such lexical resources at a time when mainstream NLP is based on machine learning and readily does without them. We compare different methods of producing resources and, more specifically, of extracting information from Wiktionary. We then discuss the suitability of standard formats for encoding idiosyncratic resources such as Wiktionary, and conclude that the production and sharing of resources should be prioritized above all.
Document type: Journal articles

https://halshs.archives-ouvertes.fr/halshs-03083521
Contributor: Franck Sajous
Submitted on: Saturday, December 19, 2020 - 10:57:11 AM
Last modification on: Thursday, January 7, 2021 - 3:37:17 AM

Licence


Distributed under a Creative Commons Attribution - NonCommercial - NoDerivatives 4.0 International License

Identifiers

  • HAL Id : halshs-03083521, version 1

Citation

Franck Sajous, Basilio Calderone, Nabil Hathout. Extraire et encoder l'information lexicale de Wiktionary : quel boulot pour étrangler le goulot !. Lexique, Presses Universitaires du Septentrion, 2020, Ressources Lexicales, 27. ⟨halshs-03083521⟩
