Journal article, Journal of the Text Encoding Initiative, Year: 2021

Building, Encoding, and Annotating a Corpus of Parliamentary Debates in XML-TEI: A Cross-Linguistic Account

Abstract

This data paper introduces an integrative and comprehensive method for the linguistic annotation of parliamentary discourse. Initially conceived as documentation for a specific, rather small-scale research project, the annotation scheme takes national specificities into account and aims to be both highly standardised and adaptable to other research contexts. The paper presents a specific application of the Text Encoding Initiative (TEI) framework to a subset of parliamentary debates. This strategy has two main aims: first, to develop a model for the encoding of parliamentary corpora by providing a systematic way of annotating both elements within the text (e.g. turns, incidents, interruptions) and the metadata associated with it (e.g. variables pertaining to the speaker or the speech event); second, to provide a cross-linguistic empirical basis for further annotation projects.
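
As a rough illustration of the kind of encoding the abstract describes, the sketch below parses a tiny TEI-like fragment in which turns are marked with `<u>`, incidents with `<incident>`, and speaker metadata with `<person>` entries in the header. These are standard TEI elements, but the sample text, the speaker identifiers (`SPK1`, `SPK2`), and the helper `summarise` are hypothetical and are not taken from the paper; the fragment is also deliberately incomplete (for instance, it has no `fileDesc`) and would not validate against a TEI schema.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
XML_NS = "http://www.w3.org/XML/1998/namespace"

# Hypothetical, simplified fragment: speaker metadata in the header,
# turns (<u>) and an incident (<incident>) in the body.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <profileDesc>
      <particDesc>
        <listPerson>
          <person xml:id="SPK1"><persName>Speaker One</persName></person>
          <person xml:id="SPK2"><persName>Speaker Two</persName></person>
        </listPerson>
      </particDesc>
    </profileDesc>
  </teiHeader>
  <text>
    <body>
      <div>
        <u who="#SPK1">First turn of the debate.</u>
        <incident><desc>Applause</desc></incident>
        <u who="#SPK2">An interjection from the floor.</u>
      </div>
    </body>
  </text>
</TEI>"""


def summarise(tei_xml):
    """Return (turns, incidents) from a TEI-like string.

    turns is a list of (speaker name, utterance text) pairs;
    incidents is a list of incident descriptions.
    """
    ns = {"tei": TEI_NS}
    root = ET.fromstring(tei_xml)

    # Map each xml:id in the header to the speaker's name.
    speakers = {
        person.get(f"{{{XML_NS}}}id"): person.findtext("tei:persName", namespaces=ns)
        for person in root.iterfind(".//tei:person", ns)
    }

    # Resolve each turn's @who pointer ("#SPK1") against the speaker map.
    turns = [
        (speakers.get(u.get("who", "").lstrip("#")), "".join(u.itertext()).strip())
        for u in root.iterfind(".//tei:u", ns)
    ]

    incidents = [
        incident.findtext("tei:desc", namespaces=ns)
        for incident in root.iterfind(".//tei:incident", ns)
    ]
    return turns, incidents
```

Calling `summarise(SAMPLE)` pairs each turn with the speaker declared in the header, which shows how in-text annotation and metadata can be linked through identifiers. The annotation scheme in the paper itself may differ in detail.
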
Main file
Truan, Romary 2021, Building, Encoding, and Annotating a Corpus of Parliamentary Debates in TEI XML.pdf (366.08 KB)
Origin: Publisher files allowed on an open archive

Dates and versions

halshs-03097333, version 1 (05-01-2021)
halshs-03097333, version 2 (09-02-2021)
halshs-03097333, version 3 (22-10-2021)
halshs-03097333, version 4 (24-06-2022)

Licence

Attribution - CC BY 4.0

Identifiers

  • HAL Id: halshs-03097333, version 4

Cite

Naomi Truan, Laurent Romary. Building, Encoding, and Annotating a Corpus of Parliamentary Debates in XML-TEI: A Cross-Linguistic Account. Journal of the Text Encoding Initiative, 2021, 14. ⟨halshs-03097333v4⟩
782 views
1079 downloads
Last updated on 20/04/2024
