Can word vectors help corpus linguists? - HAL-SHS - Humanities and Social Sciences
Journal article, Studia Neophilologica, Year: 2019

Can word vectors help corpus linguists?

French title: Les vecteurs lexicaux peuvent-ils venir en aide aux linguistes de corpus ?

Abstract

Two recent methods based on distributional semantic models (DSMs) have proved very successful in learning high-quality vector representations of words from large corpora: word2vec (Mikolov, Chen, et al. 2013; Mikolov, Yih, et al. 2013) and GloVe (Pennington et al. 2014). Once trained on a very large corpus, these algorithms produce distributed representations of words in the form of vectors. DSMs based on deep learning and neural networks have proved efficient in representing the meaning of individual words. In this paper, I assess to what extent state-of-the-art word-vector semantics can help corpus linguists annotate large datasets for semantic classes. Although word vectors offer decisive opportunities for resolving semantic annotation issues, they have yet to improve in their representation of polysemy, homonymy, and multiword expressions.
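To make the annotation idea concrete, here is a minimal sketch of how word vectors could suggest a semantic class for an unannotated word by finding its nearest hand-labeled neighbor. It is an illustration only, not the procedure used in the paper: the three-dimensional vectors, the seed labels, and the function names `cosine` and `suggest_class` are all invented for this example, whereas vectors trained with word2vec or GloVe usually have tens to hundreds of dimensions.

```python
import math

# Toy 3-dimensional vectors, invented for illustration only.
# Vectors trained with word2vec or GloVe would be loaded from a model instead.
TOY_VECTORS = {
    "dog": [0.9, 0.1, 0.0],
    "cat": [0.8, 0.2, 0.1],
    "apple": [0.1, 0.9, 0.0],
    "pear": [0.0, 0.8, 0.2],
    "horse": [0.85, 0.15, 0.05],
}

# Hypothetical seed words that an annotator has already labeled by hand.
SEED_LABELS = {
    "dog": "animal",
    "cat": "animal",
    "apple": "fruit",
    "pear": "fruit",
}


def cosine(u, v):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def suggest_class(word, vectors=TOY_VECTORS, seeds=SEED_LABELS):
    """Return the label of the labeled seed word most similar to `word`."""
    best_seed = max(
        (seed for seed in seeds if seed != word),
        key=lambda seed: cosine(vectors[word], vectors[seed]),
    )
    return seeds[best_seed]


if __name__ == "__main__":
    print(suggest_class("horse"))
```

Because this sketch stores a single vector per word form, a polysemous or homonymous form can only ever receive one suggested class, and a multiword expression has no entry at all unless it is added as a unit. These are the limitations the abstract points to.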
Main file
wordvecs.pdf (622.45 KB)
Origin: Files produced by the author(s)

Dates and versions

halshs-01657591, version 1 (06-12-2017)
halshs-01657591, version 2 (03-10-2018)

Identifiers

Cite

Guillaume Desagulier. Can word vectors help corpus linguists? Studia Neophilologica, 2019, ⟨10.1080/00393274.2019.1616220⟩. ⟨halshs-01657591v2⟩
