Conference papers

Lexicométrie sur corpus étiquetés (Lexicometry on tagged corpora)

Abstract : Tagged corpora are now widely available and are of great interest for textual and linguistic studies. Some lexicometric software packages have released new versions to handle such corpora, but these are not yet fully satisfactory. However, a clear and powerful model of text for lexicometric procedures has been formalized: the text is a string of positions, and in each position one or several types are instantiated, drawn from one or several sets of types, such as a set of spellings, a set of lemmas, or a set of grammatical codes.
As regards the definition of types, the way these kinds of linguistic information are recorded (the record axes) should not be confused with the views one may want for a lexicometric analysis (the analysis axes). In fact, record axes are often irrelevant as analysis axes. As regards the string of positions, some positions may be removed for the purposes of the analysis, so as to define the appropriate background retained from the text. The analysis can then also be focused on a given pattern that stands out against this background. Finally, we propose means to complete the display of results. These are naturally expressed and organized according to the analysis axis, but introducing views from other axes may clarify, adjust or enrich their interpretation.
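The model described in the abstract can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration and does not come from the paper: the `Position` class, the helper functions, the sample sentence and the tag set are hypothetical. As a simplification, each position carries exactly one type per record axis, whereas the abstract allows one or several.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Position:
    """One position in the text, with one type on each record axis."""
    spelling: str
    lemma: str
    pos: str  # grammatical code (hypothetical tag set)


# Hypothetical tagged text: a string of positions.
text = [
    Position("The", "the", "DET"),
    Position("cats", "cat", "NOUN"),
    Position("sleep", "sleep", "VERB"),
    Position(",", ",", "PUNCT"),
    Position("the", "the", "DET"),
    Position("cat", "cat", "NOUN"),
    Position("sleeps", "sleep", "VERB"),
    Position(".", ".", "PUNCT"),
]


def background(positions, keep):
    """Remove positions that are not wanted for the analysis."""
    return [p for p in positions if keep(p)]


def count_on_axis(positions, axis):
    """Count types along an analysis axis, given as a function of a position."""
    return Counter(axis(p) for p in positions)


def mixed_axis(p):
    """An analysis axis that differs from any single record axis:
    lemma for nouns and verbs, grammatical code for everything else."""
    return p.lemma if p.pos in {"NOUN", "VERB"} else p.pos


def find_pattern(positions, tags):
    """Return every run of consecutive positions whose grammatical codes match tags."""
    n = len(tags)
    return [
        positions[i:i + n]
        for i in range(len(positions) - n + 1)
        if all(p.pos == t for p, t in zip(positions[i:i + n], tags))
    ]


# Background: drop punctuation before counting.
words = background(text, lambda p: p.pos != "PUNCT")
lemma_counts = count_on_axis(words, lambda p: p.lemma)
mixed_counts = count_on_axis(words, mixed_axis)
# Pattern standing out against the background: determiner followed by noun.
det_noun = find_pattern(words, ["DET", "NOUN"])
```

Here the record axes are the three fields of `Position`, while an analysis axis is any function of a position, which is one way to keep the two notions separate.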

https://halshs.archives-ouvertes.fr/halshs-00168988
Contributor : Bénédicte Pincemin
Submitted on : Tuesday, April 21, 2009 - 5:02:47 PM
Last modification on : Tuesday, January 25, 2022 - 3:50:48 AM
Long-term archiving on : Tuesday, September 18, 2012 - 11:55:59 AM

Files

pincemin_jadt04_texte.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : halshs-00168988, version 1

Citation

Bénédicte Pincemin. Lexicométrie sur corpus étiquetés. 7es Journées internationales d'analyse statistique des données textuelles (JADT 2004), Mar 2004, Louvain-la-Neuve, Belgique. pp.865-873. ⟨halshs-00168988⟩

Metrics

Record views : 196
File downloads : 349