Coreference Resolution for French Oral Data: Machine Learning Experiments with ANCOR

We present CROC (Coreference Resolution for Oral Corpus), the first machine learning system for coreference resolution in French. A distinctive feature of the system is that it is trained exclusively on transcribed speech, namely ANCOR (ANaphora and Coreference in ORal corpus), the first large-scale French corpus annotated with anaphoric relations. In its current state, CROC requires pre-annotated mentions as input. We detail the features used by the learning algorithms and present a set of experiments with these features. The scores we obtain are close to those of state-of-the-art systems for written English.
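As an illustration of the mention-pair setting listed in the keywords, the following minimal sketch shows how pre-annotated mentions might be turned into labelled pairs for a classifier. It is not the CROC implementation: the `Mention` schema, the function names, the two features, and the toy mentions are all assumptions made for this example, and the feature set actually used is described in the paper itself.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Mention:
    """A pre-annotated mention (hypothetical minimal schema)."""
    index: int      # position of the mention in document order
    text: str       # surface form
    entity_id: int  # gold coreference chain identifier


def mention_pairs(mentions):
    """Yield (antecedent, anaphor, label) for every ordered pair of mentions.

    The label is True when both mentions belong to the same gold chain.
    """
    ordered = sorted(mentions, key=lambda m: m.index)
    for antecedent, anaphor in combinations(ordered, 2):
        yield antecedent, anaphor, antecedent.entity_id == anaphor.entity_id


def pair_features(antecedent, anaphor):
    """Two illustrative features only; not the feature set of the paper."""
    return {
        "exact_match": antecedent.text.lower() == anaphor.text.lower(),
        "mention_distance": anaphor.index - antecedent.index,
    }


# Toy, invented mentions: two chains, one of them with two mentions.
mentions = [
    Mention(0, "la gare", 1),
    Mention(1, "le train", 2),
    Mention(2, "la gare", 1),
]

# Each instance is a (feature dict, coreferent?) pair for a binary classifier.
instances = [
    (pair_features(a, b), label) for a, b, label in mention_pairs(mentions)
]
```

Each resulting instance could then be passed to any binary classifier; how pairwise decisions are grouped back into coreference chains is a separate step not covered by this sketch.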

Keywords

mention-pair model; dialogue corpus; coreference resolution; machine learning

Domains

Computer Science: Computation and Language [cs.CL]; Linguistics; Information and Communication Sciences

Main file

Coreference_Resolution_for_French_Oral_Data_Machine_Learning_Experiments_with_ANCOR.pdf (236.16 KB)

Origin: Files produced by the author(s)

Contributor: Jean-Yves Antoine

https://hal.science/hal-01344977

Submitted on: Wednesday, 13 July 2016, 03:29:10

Last modified on: Friday, 16 February 2024, 18:16:04

Dates and versions

hal-01344977 , version 1 (13-07-2016)

Identifiers

HAL Id: hal-01344977, version 1

Cite

Adèle Désoyer, Frédéric Landragin, Isabelle Tellier, Anaïs Lefeuvre, Jean-Yves Antoine, et al. Coreference Resolution for French Oral Data: Machine Learning Experiments with ANCOR. 17th International Conference on Intelligent Text Processing and Computational Linguistics (CICLing'2016), Apr 2016, Konya, Turkey. ⟨hal-01344977⟩

Collections

ENS-PARIS UNIV-TOURS CNRS UNIV-PARIS3 LATTICE MODYCO LIBDTLN PSL USPC LIFAT INSA-GROUPE DEMOCRAT INSA-CVL UNIV-PARIS-LUMIERES ANR UNIV-PARIS-NANTERRE
