Conference papers

Specifying a TEI-XML Based Format for Aligning Text to Image at Character Level

Abstract : This paper presents our experience of specifying and implementing an XML format for text-to-image alignment at the word and character level within the TEI framework. The format is a supplementary markup layer applied to heterogeneous transcriptions of medieval Latin and French manuscripts encoded using different "flavors" of the TEI (normalized for critical editions, diplomatic or palaeographic transcriptions). One of the problems that had to be solved was identifying "non-alignable" spans in the various kinds of transcriptions. Originally designed as part of a research project on the ontology of letter-forms in medieval Latin and vernacular (mostly French) manuscripts and inscriptions, this format can be of use to any project that involves fine-grained alignment of transcriptions with zones on digital images.

Contributor : Alexei Lavrentiev
Submitted on : Thursday, May 19, 2016 - 6:37:49 PM
Last modification on : Tuesday, May 12, 2020 - 3:56:12 PM





Alexei Lavrentiev, Dominique Stutzmann, Yann Leydier. Specifying a TEI-XML Based Format for Aligning Text to Image at Character Level. Symposium on Cultural Heritage Markup, Aug 2015, Washington, DC, United States. pp.BalisageVol16-Lavrentiev01, ⟨10.4242/BalisageVol16.Lavrentiev01⟩. ⟨halshs-01318701⟩


