Les coréférences à l'oral : une expérience d'apprentissage automatique sur le corpus ANCOR (Coreference in spoken language: a machine learning experiment on the ANCOR corpus)

Abstract : We present CROC (Coreference Resolution for Oral Corpus), the first machine learning system for coreference resolution in French. A distinctive feature of the system is that it has been trained exclusively on spoken data, namely ANCOR (ANaphora and Coreference in ORal corpus), the first corpus of spoken French annotated with anaphoric relations. In its current state, the CROC system requires pre-annotated mentions. We detail the features we selected for the learning algorithms and present a set of experiments with these features. The scores we obtain are close to those of state-of-the-art systems for written English. Finally, we outline future work on the design of an end-to-end system for spoken and written French.

Cited literature [28 references]

https://halshs.archives-ouvertes.fr/halshs-01153297
Contributor : Frédéric Landragin
Submitted on : Tuesday, May 19, 2015 - 3:23:12 PM
Last modification on : Wednesday, May 22, 2019 - 3:46:02 PM
Long-term archiving on : Tuesday, September 15, 2015 - 6:16:37 AM

File

14_TAL.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : halshs-01153297, version 1

Citation

Adèle Désoyer, Frédéric Landragin, Isabelle Tellier, Anaïs Lefeuvre, Jean-Yves Antoine. Les coréférences à l'oral : une expérience d'apprentissage automatique sur le corpus ANCOR. Traitement Automatique des Langues, ATALA, 2015, Traitement automatique du langage parlé, 55 (2), pp.97-121. ⟨http://www.atala.org/-Volume-55-⟩. ⟨halshs-01153297⟩

Metrics

Record views : 731
File downloads : 276