Specifying a TEI-XML Based Format for Aligning Text to Image at Character Level

Abstract: This paper presents the experience of specifying and implementing an XML format for text-to-image alignment at word and character level within the TEI framework. The format is a supplementary markup layer applied to heterogeneous transcriptions of medieval Latin and French manuscripts encoded using different "flavors" of the TEI (normalized for critical editions, diplomatic, or palaeographic transcriptions). One of the problems that had to be solved was identifying "non-alignable" spans in the various kinds of transcriptions. Originally designed for a research project on the ontology of letter-forms in medieval Latin and vernacular (mostly French) manuscripts and inscriptions, this format can be of use to any project that involves fine-grained alignment of transcriptions with zones on digital images.
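
To give a rough idea of what alignment of transcribed words and characters with image zones can look like, here is a minimal sketch. It does not reproduce the format specified in the paper. It only assumes the standard TEI facsimile mechanism (`<zone>` elements with `ulx`/`uly`/`lrx`/`lry` coordinates, referenced from `<w>` and `<c>` through a single `@facs` pointer). The sample document, the zone identifiers, the file name `page1.jpg`, and the helper `align` are all hypothetical. Elements without a resolvable `@facs` are returned with no box, which is one simple way a non-alignable span could surface.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Hypothetical sample: one word ("rex") and its three characters,
# each pointing at a rectangular zone on a page image.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <facsimile>
    <surface>
      <graphic url="page1.jpg"/>
      <zone xml:id="w1" ulx="100" uly="50" lrx="160" lry="80"/>
      <zone xml:id="w1c1" ulx="100" uly="50" lrx="120" lry="80"/>
      <zone xml:id="w1c2" ulx="120" uly="50" lrx="140" lry="80"/>
      <zone xml:id="w1c3" ulx="140" uly="50" lrx="160" lry="80"/>
    </surface>
  </facsimile>
  <text><body><p>
    <w facs="#w1"><c facs="#w1c1">r</c><c facs="#w1c2">e</c><c facs="#w1c3">x</c></w>
  </p></body></text>
</TEI>"""


def align(tei_xml):
    """Return (element name, text, box) for every <w> and <c>, in document order.

    box is (ulx, uly, lrx, lry), or None when the element has no @facs
    or the pointer does not resolve to a zone (assumption: one pointer per @facs).
    """
    root = ET.fromstring(tei_xml)

    # Index every zone by its xml:id.
    zones = {}
    for zone in root.iter(f"{{{TEI_NS}}}zone"):
        coords = tuple(int(zone.get(k)) for k in ("ulx", "uly", "lrx", "lry"))
        zones[zone.get(XML_ID)] = coords

    wanted = {f"{{{TEI_NS}}}w", f"{{{TEI_NS}}}c"}
    result = []
    for el in root.iter():
        if el.tag not in wanted:
            continue
        facs = el.get("facs")
        # Strip the leading '#' of the local pointer before the lookup.
        box = zones.get(facs[1:]) if facs and facs.startswith("#") else None
        name = el.tag.split("}")[1]
        result.append((name, "".join(el.itertext()), box))
    return result


if __name__ == "__main__":
    for name, text, box in align(SAMPLE):
        print(name, text, box)
```

Keeping the zones in a separate `<facsimile>` block and linking by pointer leaves the transcription itself untouched, which fits the idea of alignment as a supplementary layer over existing encodings.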

https://halshs.archives-ouvertes.fr/halshs-01318701
Contributor: Alexei Lavrentiev
Submitted on: Thursday, May 19, 2016 - 6:37:49 PM
Last modification on: Saturday, November 3, 2018 - 4:46:10 PM

File

Bal2015lavr0825.pdf

Citation

Alexei Lavrentiev, Dominique Stutzmann, Yann Leydier. Specifying a TEI-XML Based Format for Aligning Text to Image at Character Level. Symposium on Cultural Heritage Markup, Aug 2015, Washington, DC, United States. pp. BalisageVol16-Lavrentiev01, ⟨10.4242/BalisageVol16.Lavrentiev01⟩. ⟨halshs-01318701⟩
