Conference papers

The symogih.org project and TEI : encoding structured historical data in XML texts

Abstract : The Système modulaire de gestion de l’information historique (SyMoGIH) is a project developed since 2008 by the Pôle histoire numérique of the LARHRA (CNRS UMR 5190, Universités de Lyon et Grenoble) to store, analyse and publish structured historical data on a modular, collaborative platform. More than 50 scholars and students, as well as 10 research programs, have used or are currently using the platform for research purposes. Some of the data collected in the collaborative database are published on the main project website and on websites devoted to individual research programs, such as Siprojuris. These data will soon be available on the semantic web through a project-specific ontology. On the one hand, the ontology can be used to extract structured historical data from texts and store them in databases or triple stores. On the other hand, if digitized texts are available, the TEI markup language can be combined with the ontology to encode structured data directly in XML texts. Two research programs, along with several scholars and PhD students, are currently using this approach to produce historical data. We intend to promote this practice on our platform, and I have introduced a simplified version of it in my master’s-level teaching on digital tools for historians. In this paper, I will present the markup concepts we adopted to encode structured historical data in TEI texts. Ontology and texts are connected using an approach similar to « method A », presented by Øyvind Eide in a recent article in the TEI Journal. I will also present an example encoding of a corpus of biographical records, showing how historical data marked up directly in the text using the ontology can be visualised and analysed.
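As an illustration of what encoding structured data directly in a TEI text can look like, the following is a minimal sketch in Python and not the project's actual markup scheme. It assumes that named entities are tagged with the standard TEI elements persName and placeName and linked to external records through their ref attribute; the identifiers (#Actor-001, #Place-042), the sample sentence and the function name extract_references are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical TEI fragment: the sentence and the identifiers in @ref are
# invented for illustration and do not come from the project's data.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text><body>
    <p><persName ref="#Actor-001">Jean Dupont</persName> was appointed
       professor in <placeName ref="#Place-042">Lyon</placeName> in
       <date when="1850">1850</date>.</p>
  </body></text>
</TEI>"""


def extract_references(tei_xml):
    """Return (element name, ref value, text) for every element carrying @ref."""
    root = ET.fromstring(tei_xml)
    found = []
    for element in root.iter():  # walks the tree in document order
        ref = element.get("ref")
        if ref is not None:
            # Drop the "{namespace}" prefix that ElementTree adds to tag names.
            local_name = element.tag.split("}", 1)[-1]
            # Collapse internal whitespace in the element's text content.
            text = " ".join("".join(element.itertext()).split())
            found.append((local_name, ref, text))
    return found


if __name__ == "__main__":
    for name, ref, text in extract_references(SAMPLE):
        print(name, ref, text)
```

Tuples collected this way could then be loaded into a database or triple store, which is the kind of reuse the abstract describes; how the identifiers map onto the project ontology is left open here.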
Contributor : Francesco Beretta
Submitted on : Thursday, September 29, 2016 - 12:16:04 AM
Last modification on : Monday, July 20, 2020 - 3:38:02 PM


Distributed under a Creative Commons Attribution - ShareAlike 4.0 International License


  • HAL Id : halshs-01251915, version 1



Francesco Beretta. The symogih.org project and TEI : encoding structured historical data in XML texts. Text Encoding Initiative Conference and Members’ Meeting 2015. Connect, Animate, Innovate, Oct 2015, Lyon, France. ⟨halshs-01251915⟩


