Conference paper. Year: 2011

High-level discourse structures: topical chains and enumerative structures in a diversified annotated corpus

Abstract

One of the outcomes of the ANNODIS project (Ho-Dac et al. 2009, 2010) is a diversified corpus annotated with two frequent textual motifs: topical chains (TCs) and enumerative structures (ESs). The corpus has been manually annotated with both the motifs and the cues signalling them. These data can now be exploited comparatively to examine TCs and ESs in the three sub-corpora: 1) reports in the field of international relations; 2) scientific articles (proceedings of a linguistics conference); 3) encyclopaedia articles (from Wikipedia). The initial step is to take a quantitative look at each motif: composition, distribution, and match with document structure (Power et al. 2003). Though the motifs are common in all three sub-corpora, differences appear in their frequency, in their length and coverage (proportion of text involved), and in their composition (for ESs: number of items, presence of a trigger and/or closure). Another important aspect is their granularity: this notion is approximated via a typology in which types correspond to different forms of interaction between the motifs and the document's layout structure (sections and headings, formatted lists and paragraphs). We then examine the data from several qualitative angles in order to arrive at a functional characterisation of the motifs. Of special interest to us is the link between particular forms of signalling and specific functions: ESs with items introduced by sequencers, for instance, are functionally different from ESs whose items are introduced by circumstantial adverbials. A continuum is proposed from ESs signalled by purely textual cues (e.g. bullet points) to ESs whose cues carry ideational content (such as adverbials) (Halliday 1977). The sub-corpora are compared in terms of the functional classes and their linguistic correlates. Finally, the two motifs are observed in context and in their interaction. A special case of interaction concerns ESs interacting with themselves via recursivity, a remarkably frequent occurrence in our corpus. This analysis of how the motifs behave in text also leads to cross-corpus comparisons.

Domains

Linguistics
Main file
Abstract2_CL_2011_Birmingham.pdf (126.88 KB)
Origin: Files produced by the author(s)

Dates and versions

halshs-00953561, version 1 (28-02-2014)

Identifiers

  • HAL Id: halshs-00953561, version 1

Cite

Lydia-Mai Ho-Dac, Cécile Fabre, Marie-Paule Péry-Woodley, Josette Rebeyrolle, Ludovic Tanguy. High-level discourse structures: topical chains and enumerative structures in a diversified annotated corpus. Corpus Linguistics, 2011, Birmingham, United Kingdom. ⟨halshs-00953561⟩
