Journal articles

Clustering Flood Events from Water Quality Time-Series using Latent Dirichlet Allocation Model

Abstract: To improve hydro-chemical modeling and forecasting, there is a need to better understand flood-induced variability in water chemistry and the processes controlling it in watersheds. In the literature, assumptions are often made, for instance, that stream chemistry reacts differently to rainfall events depending on the season; however, methods to verify such assumptions are not well developed. Often, only a few floods are studied at a time, with chemicals used as tracers. Grouping similar events from large multivariate datasets using principal component analysis and clustering methods helps to explain hydrological processes; however, these methods currently have some limitations (for instance, the need to define flood descriptors and the assumption of linearity). Most clustering methods have been used in the context of regionalization, focusing more on mapping results than on understanding processes. In this study, we extracted flood patterns using the probabilistic Latent Dirichlet Allocation (LDA) model, which is, to our knowledge, its first use in hydrology. The LDA method allows multivariate temporal datasets to be considered without having to define explanatory factors beforehand or select representative floods. We analyzed a multivariate dataset from a long-term observatory (Kervidy-Naizin, western France) containing data for four solutes monitored daily for 12 years: nitrate, chloride, dissolved organic carbon, and sulfate. The LDA method extracted four different patterns that were distributed by season. Each pattern can be explained by seasonal hydrological processes. Hydro-meteorological parameters help explain the processes leading to these patterns, which increases understanding of flood-induced variability in water quality. Thus, the LDA method appears useful for analyzing long-term datasets.
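To make the idea of extracting patterns from flood events with LDA more concrete, the following is a minimal illustrative sketch, not the authors' implementation. It rests on several assumptions that do not come from the abstract: each flood event is turned into a bag of "words" that combine solute name, day within the event, and a concentration bin (the hypothetical helper `event_to_words`); concentrations are already scaled to [0, 1]; and the topic model is fitted with a small collapsed Gibbs sampler (`lda_gibbs`), chosen here only because it fits in a few lines. The data are synthetic, and two topics are used for the toy example rather than the four patterns reported in the study.

```python
import random
from collections import Counter

def event_to_words(event, n_bins=3):
    """Encode one flood event as a bag of words (hypothetical encoding).

    `event` maps a solute name to daily concentrations assumed scaled to [0, 1].
    Each word is "solute|d<day>|b<bin>".
    """
    words = []
    for solute, series in event.items():
        for day, value in enumerate(series):
            level = min(int(value * n_bins), n_bins - 1)
            words.append(f"{solute}|d{day}|b{level}")
    return words

def lda_gibbs(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA; returns per-document topic mixtures."""
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * n_topics for _ in docs]        # counts per (doc, topic)
    topic_word = [Counter() for _ in range(n_topics)]  # counts per (topic, word)
    topic_total = [0] * n_topics                       # words assigned per topic
    assign = []
    # Random initial topic assignment for every word occurrence.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            z = rng.randrange(n_topics)
            z_doc.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assign.append(z_doc)
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the current assignment, then resample it.
                z = assign[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(n_topics)
                ]
                z = rng.choices(range(n_topics), weights=weights)[0]
                assign[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1
    # Smoothed estimate of each event's mixture over patterns (topics).
    return [
        [(doc_topic[d][k] + alpha) / (len(doc) + n_topics * alpha)
         for k in range(n_topics)]
        for d, doc in enumerate(docs)
    ]

# Synthetic events (not real data): rising or falling concentrations over 3 days.
_rng = random.Random(1)

def synthetic_event(rising):
    base = [0.1, 0.5, 0.9] if rising else [0.9, 0.5, 0.1]
    return {
        solute: [min(max(v + _rng.uniform(-0.05, 0.05), 0.0), 1.0) for v in base]
        for solute in ("nitrate", "chloride", "DOC", "sulfate")
    }

events = [synthetic_event(i % 2 == 0) for i in range(20)]
docs = [event_to_words(e) for e in events]
theta = lda_gibbs(docs, n_topics=2)
```

Each row of `theta` is one event's estimated mixture over the extracted patterns; events could then be grouped by their dominant pattern and compared with season or hydro-meteorological parameters, as the abstract describes.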

Contributor: Romain Tavenard
Submitted on: Friday, November 29, 2013 - 12:14:35 PM
Last modification on: Friday, September 18, 2020 - 2:34:24 PM
Long-term archiving on: Monday, March 3, 2014 - 2:26:28 PM





Alice Aubert, Romain Tavenard, Rémi Emonet, Alban de Lavenne, Simon Malinowski, et al. Clustering Flood Events from Water Quality Time-Series using Latent Dirichlet Allocation Model. Water Resources Research, American Geophysical Union, 2013, 49 (12), pp.8187-8199. ⟨10.1002/2013WR014086⟩. ⟨halshs-00906292⟩


