High-level discourse structures: topical chains and enumerative structures in a diversified annotated corpus

Abstract : One of the outcomes of the ANNODIS project (Ho-Dac et al. 2009, 2010) is a diversified corpus annotated with two frequent textual motifs: topical chains (TCs) and enumerative structures (ESs). The corpus has been manually annotated with both the motifs and the cues signalling them. These data can now be exploited comparatively in order to examine TCs and ESs in the three sub-corpora: 1) reports in the field of international relations; 2) scientific articles (proceedings of a linguistics conference); 3) encyclopaedia articles (from Wikipedia). The initial step is to take a quantitative look at each motif: composition, distribution, and match with document structure (Power et al. 2003). Though the motifs are common in all three sub-corpora, differences appear in their frequency, in their length and coverage (proportion of text involved), and in their composition (for ESs: number of items, presence of a trigger and/or closure). Another important aspect is their granularity: this notion is approximated via a typology in which types correspond to different forms of interaction between the motifs and the document's layout structure (sections and headings, formatted lists and paragraphs). We then examine the data from several qualitative angles in order to arrive at a functional characterisation of the motifs. Of special interest to us is the link between particular forms of signalling and specific functions: ESs with items introduced by sequencers, for instance, are functionally different from ESs whose items are introduced by circumstantial adverbials. A continuum is proposed from ESs signalled by purely textual cues (e.g. bullet points) to ESs whose cues carry ideational content (such as adverbials) (Halliday 1977). The sub-corpora are compared in terms of the functional classes and their linguistic correlates. Finally, the two motifs are observed in context and in their interaction.
A special case of interaction concerns ESs interacting with themselves via recursivity, a remarkably frequent occurrence in our corpus. This analysis of how the motifs behave in text also leads to cross-corpus comparisons.
Document type :
Conference papers

https://halshs.archives-ouvertes.fr/halshs-00953561
Contributor : Ludovic Tanguy
Submitted on : Friday, February 28, 2014 - 12:31:44 PM
Last modification on : Wednesday, July 10, 2019 - 1:33:49 AM
Long-term archiving on : Wednesday, May 28, 2014 - 1:25:20 PM

File

Abstract2_CL_2011_Birmingham.p...
Files produced by the author(s)

Identifiers

  • HAL Id : halshs-00953561, version 1

Citation

Lydia-Mai Ho-Dac, Cécile Fabre, Marie-Paule Péry-Woodley, Josette Rebeyrolle, Ludovic Tanguy. High-level discourse structures: topical chains and enumerative structures in a diversified annotated corpus. Corpus Linguistics, 2011, Birmingham, United Kingdom. ⟨halshs-00953561⟩
