Conference paper, Year: 2018

Improving Automatic Categorization of Technical vs. Laymen Medical Words using FastText Word Embeddings

Abstract

Detecting words that are difficult to understand is crucial for ensuring that medical texts such as diagnoses and drug instructions are properly understood. In this paper, we study the use of recently developed word embeddings, which contain context information for words, together with other linguistic and non-linguistic features, for improving the detection of difficult medical words. We propose new cross-validation scenarios to test the generalization ability of medical word difficulty detection from different perspectives, and we provide an experimental study of previously used feature extraction methods together with the recently proposed FastText embeddings. We found that, for known words and unknown users, FastText embeddings clearly improve the detection of word understandability, reaching an F-score of 85.9 (an improvement of up to 2.9 points).
Main file
pylieva-IDDM2018.pdf (383.01 KB)
Origin: Files produced by the author(s)

Dates and versions

halshs-01968357, version 1 (02-01-2019)

Identifiers

  • HAL Id: halshs-01968357, version 1

Cite

Hanna Pylieva, Artem Chernodub, Natalia Grabar, Thierry Hamon. Improving Automatic Categorization of Technical vs. Laymen Medical Words using FastText Word Embeddings. 1st International Workshop on Informatics & Data-Driven Medicine (IDDM 2018), Nov 2018, Lviv, Ukraine. ⟨halshs-01968357⟩