
End-of-project report - Project ANR-07-CORP-009 BOUVARD - Flaubert's Bouvard et Pécuchet Files (Les Dossiers de Bouvard et Pécuchet de Flaubert). Enrichment, promotion, and documentation of a multi-format corpus: "Corpus and Research Tools in the Humanities and Social Sciences" Programme, 2007

Abstract:
Project title
The possible second volumes for Flaubert's unfinished novel Bouvard et Pécuchet
Challenges and objectives
The aim was to use the online publication of the preparatory documents left by the author to explore possible endings for this unfinished novel.
The BOUVARD project made it possible to publish a fragile and complex heritage collection (combining manuscripts, printed and mixed materials) of high scientific and cultural significance on a dedicated website (http://dossiers-flaubert.ish-lyon.cnrs.fr/). The collection comprises the documentation files that Flaubert gathered to write his last novel. These 2400 sheets are now kept at the Rouen library in the form of eight collections of various documents and two additional collections dedicated to the Dictionnaire des idées reçues. The online availability (through images and transcriptions) of a corpus that used to be very hard to access is accompanied by noteworthy scientific enrichments (search engine, metadata, annotations, and libraries). Its main asset is an original software tool that produces configurable arrangements of quotations extracted from the published documents. Among other arrangements, web users can produce hypothetical reconstructions of the second volume of Bouvard et Pécuchet. Flaubert had partially planned this volume and started to gather material, but death prevented him from completing the work.
Methods or technologies
Thanks to the full XML-TEI encoding of the corpus, the site offers both a multi-format online edition and a tool producing arrangements of quotations.
The corpus was transcribed by the scientific project team and then encoded in XML-TEI, a free and open markup language that describes the logical structure of documents and identifies their different components. Thanks to this choice, the edition site offers several ways of visualizing each document from a single digital file. The encoding also offers a technical solution for the other part of the project, the hypothetical reconstructions of the second volume of Bouvard et Pécuchet: it made it possible to divide the corpus virtually into autonomous text fragments (according to the logic that Flaubert had already partially implemented when he began to write his "critical and farcical encyclopedia"), while ensuring that each fragment remains connected both to the image area that it transcribes and to the different textual units to which it belongs, such as the page where the fragment appears. Stored in a relational database, the fragments can be gathered and organized according to various research hypotheses. The resulting arrangements of quotations (possible second volumes) can be exported in XML or PDF.
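The fragment model described above can be illustrated with a minimal sketch. It is not the project's actual schema or software: the table names, column names, the `category` label, the placeholder quotations, and the XML element names in the export are all assumptions made for illustration (the export is deliberately not TEI). It only shows how fragments that keep a link to their page and image area can be stored in a relational database, gathered according to a hypothesis, and exported as XML.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical schema: every table and column name here is illustrative only.
SCHEMA = """
CREATE TABLE page (id INTEGER PRIMARY KEY, folio TEXT NOT NULL);
CREATE TABLE fragment (
    id INTEGER PRIMARY KEY,
    page_id INTEGER NOT NULL REFERENCES page(id),
    zone TEXT NOT NULL,      -- image area, written as x,y,width,height
    category TEXT NOT NULL,  -- invented thematic label used for arranging
    text TEXT NOT NULL
);
"""


def build_demo_db():
    """Create an in-memory database holding a few placeholder fragments."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO page (id, folio) VALUES (?, ?)",
        [(1, "f. 1r"), (2, "f. 2r")],
    )
    conn.executemany(
        "INSERT INTO fragment (page_id, zone, category, text) VALUES (?, ?, ?, ?)",
        [
            (1, "10,20,300,40", "style", "Placeholder quotation A"),
            (2, "15,80,280,35", "history", "Placeholder quotation B"),
            (2, "15,130,280,35", "style", "Placeholder quotation C"),
        ],
    )
    return conn


def arrange(conn, categories):
    """Gather fragments category by category, in page order within each category."""
    rows = []
    for category in categories:
        rows.extend(
            conn.execute(
                "SELECT f.text, f.zone, p.folio, f.category "
                "FROM fragment f JOIN page p ON p.id = f.page_id "
                "WHERE f.category = ? ORDER BY p.id, f.id",
                (category,),
            ).fetchall()
        )
    return rows


def to_xml(rows):
    """Serialize an arrangement; each fragment keeps its page and image zone."""
    root = ET.Element("arrangement")
    for text, zone, folio, category in rows:
        element = ET.SubElement(
            root, "fragment", page=folio, zone=zone, category=category
        )
        element.text = text
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    connection = build_demo_db()
    print(to_xml(arrange(connection, ["style", "history"])))
```

Changing the list passed to `arrange` stands in for changing the research hypothesis: the same stored fragments are regrouped in a different order without touching the transcriptions themselves.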
Major results
The documentary files for Bouvard et Pécuchet (text and images) are fully available on a dedicated website. Besides several tools (search engine and libraries), the site provides access to a corpus of 2400 pages transcribed in four formats: ultra-diplomatic, diplomatic, normalized and enriched. It also allows readers to produce arrangements of quotations on demand, including possible second volumes of Flaubert's unfinished novel. By serving as a tool for composing and structuring the published work itself, this project seeks to extend the benefits of a critical edition to fragmentary textual contents.
Scientific production
Five conferences (organized in France, Italy and Japan) took place during the project, which was also the source of more than 50 publications. Results are available to the scientific community at large (articles are deposited in the open access archive HAL, and free data access is offered on the project's website). Attention was paid to the sustainability and interoperability of the corpus data through the use of the international, free and open encoding standard XML-TEI.
Factual information
The BOUVARD project is a basic research project carried out by an international team of scientists and scholars under the leadership of Stéphanie Dord-Crouslé, a researcher in the UMR 5611 LIRE (Literature, Ideologies and Representations in the 18th and 19th centuries). Its technical implementation was carried out by the ISH (Institute for Human Sciences). The project began in January 2008 and lasted 54 months. It was supported by an ANR grant of 150,000 euros, for a full cost of about 900,000 euros.

Cited literature: 55 references

Contributor: Stéphanie Dord-Crouslé
Submitted on : Tuesday, December 4, 2012 - 3:22:50 PM
Last modification on : Tuesday, October 19, 2021 - 10:50:45 PM
Long-term archiving on: Tuesday, March 5, 2013 - 3:52:51 AM




  • HAL Id : halshs-00760914, version 1


Stéphanie Dord-Crouslé. Compte-rendu de fin de projet -Projet ANR-07-CORP-009 BOUVARD - Les Dossiers de Bouvard et Pécuchet de Flaubert. Enrichissement, valorisation, documentation d'un corpus multi supports : Programme " Corpus et outils de la Recherche en Sciences Humaines et Sociales " 2007. [Rapport de recherche] ANR (Agence Nationale de la Recherche - France). 2012. ⟨halshs-00760914⟩


