Building, Encoding, and Annotating a Corpus of Parliamentary Debates in XML-TEI: A Cross-Linguistic Account

Abstract: This data paper introduces an integrative and comprehensive method for the linguistic annotation of parliamentary discourse. Initially conceived as documentation for a specific, rather small-scale research project, the annotation scheme takes national specificities into account and aims to be both highly standardised and adaptable to other research contexts. The paper can be read as an application of the Text Encoding Initiative (TEI) framework to a subset of parliamentary debates. This strategy serves two main purposes: first, to develop a model for the encoding of parliamentary corpora by providing a systematic way of annotating both elements within the text (e.g. turns, incidents, interruptions) and the metadata associated with it (e.g. variables pertaining to the speaker or the speech event); second, to provide a cross-linguistic empirical basis for further annotation projects.
Document type: Preprints, Working Papers, ...
https://halshs.archives-ouvertes.fr/halshs-03097333
Contributor: Laurent Romary
Submitted on: Tuesday, February 9, 2021 - 5:30:44 PM
Last modification on: Wednesday, February 17, 2021 - 10:58:27 AM

Licence


Distributed under a Creative Commons Attribution 4.0 International License

Identifiers

  • HAL Id: halshs-03097333, version 2

Citation

Naomi Truan, Laurent Romary. Building, Encoding, and Annotating a Corpus of Parliamentary Debates in XML-TEI: A Cross-Linguistic Account. 2020. ⟨halshs-03097333v2⟩
