UMLF : a unified medical lexicon for French
Pierre Zweigenbaum (1), Robert Baud (2), Anita Burgun (3), Fiammetta Namer (4), Eric Jarrousse, Natalia Grabar (5), Patrick Ruch (2), Franck Le Duff (3), Jean-François Forget, Magali Douyère (6), Stéfan J. Darmoni (6)
1 LIMSI - Laboratoire d'Informatique pour la Mécanique et les Sciences de l'Ingénieur
2 SIM - Service d'informatique médicale
3 LIM - Laboratoire d'Informatique Médicale
4 ATILF - Analyse et Traitement Informatique de la Langue Française
5 CRC (UMR_S 872) - Centre de Recherche des Cordeliers
6 L@STICS
Pierre Zweigenbaum
- Role: Author
- PersonId: 14995
- IdHAL: pierre-zweigenbaum
- ORCID: 0000-0001-8410-4808
- IdRef: 06664268X
Fiammetta Namer
- Role: Author
- PersonId: 751632
- IdHAL: fiammetta-namer
- ORCID: 0000-0002-6144-3011
Eric Jarrousse
- Role: Author
Natalia Grabar
- Role: Author
- PersonId: 6735
- IdHAL: natalia-grabar
- ORCID: 0000-0002-0237-4554
- IdRef: 089015460
Jean-François Forget
- Role: Author
Stéfan J. Darmoni
- Role: Author
- PersonId: 180619
- IdHAL: stefan-darmoni
- ORCID: 0000-0002-7162-318X
- IdRef: 03514243X
Abstract
Medical informatics has a constant need for basic medical language processing tasks, e.g. for coding into controlled vocabularies, free text indexing and information retrieval. Most of these tasks involve term matching and rely on lexical resources: lists of words with attached information, such as inflected forms and derived words. Such resources are publicly available for the English language with the UMLS Specialist Lexicon, but not for other languages. For the French language, several teams have worked on the subject and built local lexical resources. The goal of the present work is to pool and unify these resources and to add extensively to them by exploiting medical terminologies and corpora, resulting in a unified medical lexicon for French (UMLF). This paper presents the issues raised by such an objective, describes the methods on which the project relies and illustrates them with experimental results.
Domains
Linguistics

Deposit format | Record |
---|---|
Deposit type | Journal article |
Authors | Pierre Zweigenbaum (1), Robert Baud (2), Anita Burgun (3), Fiammetta Namer (4), Eric Jarrousse, Natalia Grabar (5), Patrick Ruch (2), Franck Le Duff (3), Jean-François Forget, Magali Douyère (6), Stéfan J. Darmoni (6) |
Affiliation 1 | LIMSI - Laboratoire d'Informatique pour la Mécanique et les Sciences de l'Ingénieur (247329), Université Paris-Sud, Bât. 507, Rue du Belvédère, 91405 ORSAY CEDEX, France |
Affiliation 2 | SIM - Service d'informatique médicale (103001), France |
Affiliation 3 | LIM - Laboratoire d'Informatique Médicale (32594), CHU Pontchaillou, 2, rue Henri Le Guilloux, 35033 RENNES, France |
Affiliation 4 | ATILF - Analyse et Traitement Informatique de la Langue Française (190838), Université de Lorraine, 44 Av de la Libération, BP 30687, 54063 Nancy Cedex, France |
Affiliation 5 | CRC (UMR_S 872) - Centre de Recherche des Cordeliers (27744), CRBM des Cordeliers, 15, rue de l'École de Médecine, bâtiment E, 75270 Paris Cedex 06, France |
Affiliation 6 | L@STICS (103002), France |
Peer-reviewed | Yes |
Popular science | No |
Document language | English |
Journal name | |
Audience | International |
Publication date | 2005 |
Volume | 74 |
Issue | 2-4 |
Pages/Identifier | 119-124 |
Domain(s) | |
Keywords (en) | natural language processing, morphology, French, lexical database |
DOI | 10.1016/j.ijmedinf.2004.03.010 |