Conference Paper, Year: 2018

CLEAR-Simple Corpus for Medical French

Rémi Cardon
  • Role: Author
  • PersonId: 184596
  • IdHAL: remi-cardon

Abstract

The availability of corpora with technical and simplified content is crucial for developing and testing text simplification methods. We describe such a corpus for French medical language. The corpus contains texts from three sources: encyclopedias, drug leaflets and scientific summaries. Each source provides comparable information in specialized and plain language. A subset of the corpus has been processed manually to find and align parallel sentences. This subset currently contains 663 pairs of parallel sentences. Alignment was done by two annotators and shows an inter-annotator agreement of 0.76.
Main file
grabar-ATA2018c.pdf (103.4 KB)
Origin: Files produced by the author(s)

Dates and versions

halshs-01968355, version 1 (02-01-2019)

Identifiers

  • HAL Id: halshs-01968355, version 1

Cite

Natalia Grabar, Rémi Cardon. CLEAR-Simple Corpus for Medical French. ATA, Nov 2018, Tilburg, Netherlands. ⟨halshs-01968355⟩
286 Views
763 Downloads
