Conference papers

The textometric concept of active corpus: Illustration by an analysis scenario based on annotation then projection

Abstract : The active corpus makes it possible to run searches and statistical computations as if the corpus were reduced to selected words, while the full text remains visible in context displays. It is mainly implemented in paradigmatic processing, yet it may also concern syntagmatic processing or text display. Here we experiment with the active corpus in syntagmatic processing. A projection generates a new corpus whose words are the semantic tags automatically assigned to the original data in a first step. This new corpus makes it easy to explore tag sequences with any generic textometric tool, however sparse the original annotation may be. This methodological path was applied to film grammar analysis on 10,000 archival descriptions of news reports. 19 camera shot and angle types were identified through queries and tagged. This annotation became the lexicon of the projected corpus, which was used to study shot sequences. The annotation and projection tools we used are available as utilities in the TXM open-source software and should prove useful to many research projects.
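The annotation-then-projection path described in the abstract can be illustrated with a minimal Python sketch. It does not use TXM or reproduce the authors' tools: the three tag names, the regular expressions standing in for the queries, and the two sample descriptions are all hypothetical, chosen only to show how a sparse annotation becomes the lexicon of a projected corpus whose tag sequences can then be counted.

```python
import re
from collections import Counter

# Hypothetical tag patterns standing in for the paper's queries.
# The 19 actual shot and angle types are not reproduced here.
TAG_PATTERNS = {
    "CLOSE_UP": re.compile(r"\bclose[- ]up\b", re.IGNORECASE),
    "WIDE_SHOT": re.compile(r"\bwide shot\b", re.IGNORECASE),
    "LOW_ANGLE": re.compile(r"\blow[- ]angle\b", re.IGNORECASE),
}


def annotate(text):
    """Step 1 (annotation): return (position, tag) pairs in text order."""
    hits = []
    for tag, pattern in TAG_PATTERNS.items():
        for match in pattern.finditer(text):
            hits.append((match.start(), tag))
    return sorted(hits)


def project(text):
    """Step 2 (projection): keep only the tags, in order of appearance."""
    return [tag for _, tag in annotate(text)]


def tag_bigrams(projected_corpus):
    """Count adjacent tag pairs within each projected text."""
    counts = Counter()
    for tags in projected_corpus:
        counts.update(zip(tags, tags[1:]))
    return counts


# Made-up sample descriptions, for illustration only.
descriptions = [
    "Wide shot of the harbour, then close-up on the captain. "
    "Low-angle view of the mast.",
    "Close up of the crowd; wide shot of the square; close-up of the speaker.",
]

projected = [project(d) for d in descriptions]
bigrams = tag_bigrams(projected)

for pair, count in bigrams.most_common():
    print(pair, count)
```

Because the projected texts contain nothing but tags, two tags count as adjacent even when many untagged words separate them in the original description, which is why the sequence analysis still works on sparse annotation.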

Contributor : Bénédicte Pincemin
Submitted on : Sunday, May 15, 2022 - 10:45:54 PM
Last modification on : Wednesday, May 18, 2022 - 10:45:34 AM


 Restricted access
To satisfy the distribution rights of the publisher, the document is embargoed until : 2022-07-06



Distributed under a Creative Commons Attribution 4.0 International License


  • HAL Id : halshs-03667319, version 1


Bénédicte Pincemin, Serge Heiden, Franck Mazuet. The textometric concept of active corpus: Illustration by an analysis scenario based on annotation then projection. 16th International Conference on Statistical Analysis of Textual Data JADT 2022, VADISTAT - Per Simona Balbi, Univ. of Naples Federico II, Jul 2022, Naples, Italy. ⟨halshs-03667319⟩


