HS3.5 | Deep Learning in Hydrology (EDI)
Convener: Riccardo Taormina | Co-conveners: Frederik Kratzert, Eduardo Acuna, Basil Kraft, Maria-Luisa Taccari
Orals | Tue, 05 May, 08:30–12:30 (CEST) | Room C
Posters on site | Attendance Tue, 05 May, 16:15–18:00 (CEST) | Display Tue, 05 May, 14:00–18:00 | Hall A
Posters virtual | Thu, 07 May, 14:12–15:45 (CEST)
vPoster Discussion | vPoster spot A, Thu, 07 May, 16:15–18:00 (CEST)
Deep Learning has seen accelerated adoption across Hydrology and the broader Earth Sciences. This session highlights the continued integration of deep learning and its many variants into traditional and emerging hydrology-related workflows. We welcome abstracts related to novel theory development, new methodologies, or practical applications of deep learning in hydrological modeling and process understanding. This might include, but is not limited to, the following:

(1) Development of novel deep learning models or modeling workflows.
(2) Probing, exploring and improving our understanding of the (internal) states/representations of deep learning models to improve models and/or gain system insights.
(3) Understanding the reliability of deep learning, e.g., under non-stationarity and climate change.
(4) Modeling human behavior and impacts on the hydrological cycle.
(5) Deep Learning approaches for extreme event analysis, detection, and mitigation.
(6) Natural Language Processing in support of models and/or modeling workflows.
(7) Applications of Large Language Models and Large Multimodal Models (e.g. ChatGPT, Gemini, etc.) in the context of hydrology.
(8) Uncertainty estimation for and with Deep Learning.
(9) Advances towards foundational models in the context of hydrology and Earth Sciences more generally.
(10) Exploration of different training strategies, such as self-supervised learning, unsupervised learning, and reinforcement learning.

Orals: Tue, 5 May, 08:30–12:30 | Room C

The oral presentations are given in a hybrid format supported by a Zoom meeting featuring on-site and virtual presentations. The button to access the Zoom meeting appears 15 minutes before the time block starts.
Chairpersons: Riccardo Taormina, Maria-Luisa Taccari, Frederik Kratzert
08:30–08:35
Streamflow
08:35–08:45
|
EGU26-16120
|
ECS
|
On-site presentation
Regionalised Fine-Tuning of LSTMs for Streamflow Prediction in Ungauged Catchments
(withdrawn)
Ashkan Shokri, James Bennett, and David Robertson
08:45–08:55
|
EGU26-5631
|
ECS
|
On-site presentation
Uncertainty Quantification for Deep Learning Streamflow Reconstruction in Ungauged Basins
Nicolas Lazaro, Tobias Siegfried, and Sandro Hunziker
Reconstructing historical streamflow in ungauged basins remains a fundamental challenge in hydrology. This is especially true in data-sparse regions where infrastructure planning requires long-term discharge records that do not exist. Deep learning models trained on large-sample datasets can predict streamflow at locations excluded from training. However, a critical question persists: without observations, how can we assess reconstruction reliability? In this work, we develop and evaluate a framework for streamflow reconstruction in truly ungauged basins. We use two recurrent neural network architectures—Long Short-Term Memory (LSTM) and Mamba—trained on globally distributed catchments from the Caravan dataset. Training basins are selected using shape-based time-series clustering with Dynamic Time Warping. This ensures hydrological similarity to target regions. Models are driven by fused multi-source precipitation products (ERA5-Land, CHIRPS, MSWEP, CPC) alongside static catchment attributes. No local calibration is required. We propose ensemble disagreement—the spread among independently trained model instances from cross-validation—as a proxy for reconstruction quality. On a 100-basin holdout set, we demonstrate a negative correlation between ensemble disagreement and Nash-Sutcliffe Efficiency: basins where models agree tend to achieve higher reconstruction skill. This relationship provides practitioners with a principled basis for assigning confidence to streamflow reconstructions in ungauged basins, even in the absence of ground truth.
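The ensemble-disagreement proxy can be sketched in a few lines; the data, ensemble size, and noise levels below are synthetic stand-ins, not the Caravan setup or the authors' models:

```python
import numpy as np

def nse(obs, sim):
    """Nash-Sutcliffe Efficiency of a simulation against observations."""
    return 1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2)

def ensemble_disagreement(members):
    """Time-averaged standard deviation across ensemble members (n_members x n_t)."""
    return np.std(members, axis=0).mean()

# Toy check of the proxy: a basin where members disagree more scores lower NSE.
rng = np.random.default_rng(0)
obs = rng.gamma(2.0, 1.0, size=365)                # synthetic daily discharge
tight = obs + rng.normal(0.0, 0.1, size=(5, 365))  # members agree -> small spread
loose = obs + rng.normal(0.0, 1.0, size=(5, 365))  # members disagree -> large spread
for members in (tight, loose):
    print(round(ensemble_disagreement(members), 2),
          round(nse(obs, members.mean(axis=0)), 2))
```

In practice the correlation between spread and skill would be estimated across many holdout basins, as done on the 100-basin set described above.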
 

How to cite: Lazaro, N., Siegfried, T., and Hunziker, S.: Uncertainty Quantification for Deep Learning Streamflow Reconstruction in Ungauged Basins, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-5631, https://doi.org/10.5194/egusphere-egu26-5631, 2026.

08:55–09:05
|
EGU26-8698
|
On-site presentation
Learning the Kalman Gain: An End-to-End Deep Learning–Kalman Filter Hybrid Framework for Hydrological State Updating
Yiqun Sun, Xin Tian, Hamid Moradkhani, Qiongfang Li, Peng Shi, Simin Qu, and Qihui Chen

Hydrological data assimilation (DA) is commonly implemented with Kalman-type filters whose performance depends strongly on prescribed (often time-invariant) process and observation error covariances. In real catchments, however, model errors are non-stationary and state-dependent, making covariance tuning difficult and poorly transferable across events and forecast horizons. Pure deep learning models can be flexible but may drift from process constraints and provide limited interpretability for state corrections.

We propose a differentiable deep learning–Kalman filter hybrid DA framework that learns a time-varying Kalman gain inside the recursive loop of a process-based hydrological model. Specifically, we preserve the Kalman-style update structure while an LSTM-based gain module ingests model states and innovations and outputs an assimilation gain for state updating at each time step. The coupled system (physical model + neural gain) is implemented end-to-end and trained via backpropagation through time, enabling adaptive corrections without manual covariance calibration.
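The update structure described here (a Kalman-style recursion with a learned gain) can be sketched as follows; the scalar process model, the fixed sigmoid "gain module" standing in for the trained LSTM, and all weights are illustrative placeholders, not the authors' implementation:

```python
import numpy as np

def process_model(x):
    """Placeholder process-based model step (a linear reservoir); illustrative only."""
    return 0.9 * x

def learned_gain(x, innovation, w):
    """Stand-in for the LSTM gain module: maps (state, innovation) to a gain in (0, 1).
    In the paper this is a trained network; here it is a fixed sigmoid for illustration."""
    z = w[0] * x + w[1] * innovation + w[2]
    return 1.0 / (1.0 + np.exp(-z))

def assimilate(x0, observations, obs_operator, w):
    """Kalman-style recursion: forecast, innovation, learned-gain state update."""
    x, states = x0, []
    for y in observations:
        x = process_model(x)               # forecast step
        innovation = y - obs_operator(x)   # observation minus forecast
        x = x + learned_gain(x, innovation, w) * innovation
        states.append(x)
    return np.array(states)

states = assimilate(1.0, np.array([1.2, 1.1, 0.9]), lambda x: x, np.array([0.0, 0.5, 0.0]))
print(np.round(states, 3))
```

Because every step is differentiable, gradients can flow through the whole recursion, which is what makes end-to-end training via backpropagation through time possible.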

We evaluate the framework using hourly data and benchmark against an optimized Unscented Kalman Filter (UKF). The proposed method matches UKF performance at short lead times but shows increasing advantages at longer horizons, consistent with improved control of error accumulation under non-stationary errors. Results demonstrate that the proposed method achieves superior forecast accuracy at the 24-hour lead time (NSE ≈ 0.75), surpassing the UKF benchmark. Crucially, even though both models were optimized primarily for short-term updates, the learned-gain filter (KalmanNet) exhibits superior stability in extended rollouts. The results suggest that learning the assimilation gain within a physically based model provides a robust pathway for hydrological DA under complex, state-dependent error dynamics while preserving process constraints through explicit model equations.

How to cite: Sun, Y., Tian, X., Moradkhani, H., Li, Q., Shi, P., Qu, S., and Chen, Q.: Learning the Kalman Gain: An End-to-End Deep Learning–Kalman Filter Hybrid Framework for Hydrological State Updating, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-8698, https://doi.org/10.5194/egusphere-egu26-8698, 2026.

09:05–09:10
Spatial Learning
09:10–09:20
|
EGU26-8409
|
ECS
|
On-site presentation
Spatiotemporal Deep Learning for Snow-Water Equivalent Prediction
Colin Fenster, Adrienne Marshall, Soutir Bandyopadhyay, and Daniel McKenzie

Automated Snow Telemetry (SNOTEL) networks provide critical hydrologic data with broad global socioeconomic, political, and environmental impacts. In the Western United States and European Alps, Snow Water Equivalent (SWE) is the backbone of the agricultural industry in addition to being a key source of municipal drinking water, making SWE forecasting critical for water policy and management as climate change alters year-to-year accumulation. While process-based snow models have long been used to predict SWE, machine learning approaches have risen to prominence in recent years due to their strong performance relative to observations.

SNOTEL measurements exhibit strong spatial (among neighboring sites) and temporal (day to day) correlations. However, despite the use of modern high-parameter approaches to explain spatial relationships, deep learning methods for SWE prediction fail to account for patterns among proximate locations, thus yielding inaccurate SWE predictions. We propose a novel approach to this problem by first using a Gaussian whitening process to remove spatial correlation from SWE measurements, static station features, and meteorological forcings before leveraging deep learning for temporal prediction; specifically, we train a Long Short-Term Memory (LSTM) model to learn SWE seasonality. This allows the LSTM to learn a clean temporal signal at each location without needing to implicitly approximate the underlying spatial covariance structure. After prediction, we re-introduce spatial dependence through the inverse of the whitening transformation, yielding spatially sound SWE estimates consistent with the original covariance.
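The whiten-then-model-then-unwhiten idea can be illustrated with a Cholesky-based whitening transform; the station covariance matrix below is hypothetical, not estimated from SNOTEL data:

```python
import numpy as np

# Hypothetical spatial covariance among 3 stations (illustrative values only).
Sigma = np.array([[1.0, 0.6, 0.3],
                  [0.6, 1.0, 0.5],
                  [0.3, 0.5, 1.0]])
L = np.linalg.cholesky(Sigma)
W = np.linalg.inv(L)          # whitening matrix: Cov(W x) = I

rng = np.random.default_rng(1)
x = rng.multivariate_normal(np.zeros(3), Sigma, size=20000)  # correlated "SWE anomalies"
z = x @ W.T                   # decorrelated series fed to the temporal model (LSTM)
x_back = z @ L.T              # inverse transform restores the spatial dependence

print(np.round(np.cov(z.T), 1))
```

The temporal model then only needs to learn per-location dynamics in the whitened space; the inverse transform reattaches the covariance structure to its predictions.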

The separation of spatial and temporal components makes this model more accurate than previous LSTM and high-parameter methods. Our low-parameter approach attains 22% better predictive performance than the daily climatology baseline in terms of Root Mean Squared Error (RMSE) and exceeds the predictive accuracy of modern attention-based models, with more than 92% of SNOTEL stations achieving Nash-Sutcliffe Efficiency (NSE) values greater than 0.5 and mean/median NSE surpassing previous field-leading LSTM approaches. The success of this approach for point estimation provides a novel method for forecasting SWE accumulation on subseasonal scales or projecting SWE under future climate scenarios, while motivating and supporting future work on predicting a large-scale, spatiotemporally complete SWE map.

How to cite: Fenster, C., Marshall, A., Bandyopadhyay, S., and McKenzie, D.: Spatiotemporal Deep Learning for Snow-Water Equivalent Prediction, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-8409, https://doi.org/10.5194/egusphere-egu26-8409, 2026.

09:20–09:30
|
EGU26-21095
|
ECS
|
On-site presentation
Exploring Spatial Contributions to Flood Generation Using Graph Neural Networks: A Case Study on the Upper Medway Catchment
Marcela Antunes Meira and Yunqing Xuan
Deep learning approaches are increasingly being used in hydrological modelling due to their ability to represent the nonlinear relationships that characterise rainfall–runoff processes. Despite this growing interest, their use for improving hydrological understanding remains limited. In particular, issues related to interpretability, spatial attribution, and model robustness persist, especially in catchments with sparse or uneven data coverage. Moreover, many deep learning applications represent catchments in a lumped manner, making it difficult to identify how different subcatchments contribute to flood generation. This study investigates spatial runoff contributions during flood events by representing hydrological connectivity with graph neural networks (GNNs). The graph-based rainfall–runoff modelling framework is applied to the Upper Medway catchment (~220 km²), located south of London. The catchment is conceptualised as a directed graph, where the nodes are represented by 34 subcatchments, generated from a digital elevation model, alongside their static features (area, slope, land use), and the edges encode the downstream hydrological connections of the river network. Rainfall inputs are aggregated at the subcatchment scale from 10 rain gauges using sub-hourly (15 min) data, while sub-hourly discharge observations from two gauging stations provide the basis for model training and evaluation. Additionally, the model's robustness and information redundancy were explored through a sensitivity analysis involving the omission of certain rainfall gauges. Finally, the model behaviour is assessed through event-based simulations and compared to established hydrological modelling approaches in the catchment. Instead of focusing on predictive accuracy, the aim of this study is to investigate the learned graph representations, especially how information from upstream subcatchments propagates through the network and influences simulated responses at the catchment outlet.
The limitations related to data resolution, event definition, uncertainty representation, and transferability are discussed, and future work will focus on refining model architecture and addressing evaluation strategies.
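The directed-graph aggregation underlying such a model can be sketched as a single message-passing rule; the toy topology, node states, and mixing weights below are invented for illustration and are not the study's architecture:

```python
import numpy as np

# Toy directed river graph: nodes 0 and 1 drain into node 2, node 2 into node 3 (outlet).
edges = [(0, 2), (1, 2), (2, 3)]       # (upstream, downstream); illustrative topology
h = np.array([1.0, 2.0, 0.5, 0.0])    # per-node state, e.g. an encoded rainfall signal

def message_pass(h, edges, w_self=0.7, w_up=0.3):
    """One aggregation step: each node mixes its own state with summed upstream states."""
    upstream_sum = np.zeros_like(h)
    for u, v in edges:
        upstream_sum[v] += h[u]
    return w_self * h + w_up * upstream_sum

for _ in range(2):                     # two hops propagate headwater signals to the outlet
    h = message_pass(h, edges)
print(np.round(h, 3))
```

Stacking k such steps lets information travel k hops downstream, which is what allows upstream subcatchment contributions at the outlet to be traced through the learned representations.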

How to cite: Antunes Meira, M. and Xuan, Y.: Exploring Spatial Contributions to Flood Generation Using Graph Neural Networks: A Case Study on the Upper Medway Catchment, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-21095, https://doi.org/10.5194/egusphere-egu26-21095, 2026.

09:30–09:40
|
EGU26-15643
|
ECS
|
On-site presentation
Deep Learning the River Network: Message-Passing LSTMs for Robust Stream Water Temperature Prediction
Claudia Corona, Henry Johnson, Daniel Philippus, and Terri Hogue

Predicting stream water temperature (SWT) under non‑stationary hydroclimatic conditions is essential for ecosystem management yet remains challenging for deep learning applications in hydrology due to spatially structured network processes and disturbance‑driven variability. We present a graph‑informed deep learning framework that combines Long Short‑Term Memory (LSTM) networks with message passing and multi‑head attention to jointly capture temporal dynamics and upstream connectivity for daily SWT forecasting. 

Sagehen Creek, a snowmelt‑dominated montane watershed in the northern Sierra Nevada (California, U.S.), served as a benchmark for evaluating robustness in climate‑sensitive mountain systems. Its pronounced seasonality, groundwater influence, and sensitivity to climate variability provide an ideal setting to assess performance in underrepresented montane systems and demonstrate practical scalability to larger river networks. The architecture integrates shared LSTM layers for temporal feature extraction with a graph‑based message‑passing module that weights upstream contributions via multi‑head attention. Inputs include meteorological drivers (air temperature, precipitation, solar radiation), land cover, elevation, and seasonality (day of year), derived from long‑term observations and national datasets. Hyperparameters were tuned using Bayesian methods to improve model accuracy and reliability. Applied to Sagehen Creek and thousands of gages across the U.S., the model achieves strong performance in gaged settings (RMSE ≈ 1.32 °C) and maintains comparable skill in ungaged scenarios (RMSE ≈ 1.35 °C), demonstrating generalization across heterogeneous basins. Explicit representation of seasonality improves predictions of extremes, and attention weights provide insight into upstream influence.
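The attention-based weighting of upstream contributions can be sketched for a single head (multi-head attention repeats this per head and concatenates); the vectors and values here are illustrative, not trained LSTM states:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# Single-head attention over upstream neighbors of one reach.
# Vectors are illustrative stand-ins for the LSTM hidden states of each reach.
query = np.array([1.0, 0.0])              # downstream node's state
upstream = np.array([[0.9, 0.1],          # neighbor A: similar regime -> higher weight
                     [-0.5, 1.0]])        # neighbor B: dissimilar -> lower weight
scores = upstream @ query / np.sqrt(query.size)   # scaled dot-product scores
weights = softmax(scores)                 # interpretable upstream-influence weights
aggregated = weights @ upstream           # attention-weighted upstream message
print(np.round(weights, 3))
```

The normalized weights are what make the upstream influence inspectable, as exploited in the abstract's interpretation of attention maps.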

Overall, this work advances deep learning in hydrology by introducing a scalable, network‑aware architecture suited to non‑stationary conditions, employing structured training methods to improve reliability, and enabling ungaged predictions with minimal reliance on local observations. These results demonstrate the potential for network‑aware deep learning approaches to support more flexible and transferable hydrologic prediction strategies as environmental conditions evolve. Future work aims to include systematic comparisons with traditional statistical models to better contextualize performance gains and clarify where deep learning provides distinct advantages for SWT forecasting. 

How to cite: Corona, C., Johnson, H., Philippus, D., and Hogue, T.: Deep Learning the River Network: Message-Passing LSTMs for Robust Stream Water Temperature Prediction, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-15643, https://doi.org/10.5194/egusphere-egu26-15643, 2026.

09:40–09:45
Keynote
09:45–10:15
|
EGU26-15929
|
solicited
|
Highlight
|
On-site presentation
The Hydrologic Modeler’s Evolving Role in the Age of AI
Alden Keefe Sampson

Our community has defined its expertise by the ability to construct, calibrate, and refine complex modeling processes. Now, agentic coding tools and off-the-shelf time series foundation models are dramatically reducing the time and effort required to produce a streamflow model or prediction. As we enter the age of AI, where should human hydrologists focus, and what is our role in the modeling process?

We are moving away from the increasingly commodified nuts and bolts of model building and toward a role defined by scientific judgment. While this shift implies the loss of an aspect of our jobs many of us love, the roles that remain are increasingly impactful and important, and perhaps even more fun. I argue that two roles are becoming central and share examples of how they are already being practiced effectively. First, precise problem definition and success criteria: what should we create, and how do we know if it worked? Second, bridging users and science: assessing model fitness for use, mapping societal water problems to available solutions, and helping decision makers synthesize a proliferation of data.

As other aspects of our work become faster, this talk will highlight skills like modeling intuition, clear specification writing, data curation, and technical communication, and discuss how hydrologic scientists can build strength in areas that will maximize impact in the dynamic years ahead.

How to cite: Sampson, A. K.: The Hydrologic Modeler’s Evolving Role in the Age of AI, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-15929, https://doi.org/10.5194/egusphere-egu26-15929, 2026.

Coffee break
Chairpersons: Frederik Kratzert, Basil Kraft, Eduardo Acuna
10:45–10:50
Agents and Foundational Models
10:50–11:00
|
EGU26-3440
|
ECS
|
On-site presentation
Improving Flood Prediction and Warning through Probabilistic Deep Learning and Reinforcement Learning
Sanika Baste, Sebastian Lerch, Daniel Klotz, and Ralf Loritz

Deterministic model predictions can struggle to adequately capture extreme events such as floods and droughts, which are of particular relevance in hydrology. This limitation arises because deterministic models collapse the conditional runoff distribution to a single point estimate. Probabilistic modeling provides a promising way to address this issue by explicitly representing uncertainty and assigning non-zero probabilities to a range of possible outcomes, including rare and extreme events, thereby capturing the full range of plausible hydrological responses. Motivated by this perspective, we investigate how long short-term memory (LSTM) based probabilistic models can be used for rainfall–runoff simulation across Switzerland. 

Overall, the probabilistic models show good calibration, although some miscalibration remains at the extremes. Differences between models mainly manifest in how uncertainty is distributed: some approaches produce narrower but lighter-tailed distributions, while others yield broader distributions with heavier tails. These trade-offs highlight that probabilistic models differ not only in sharpness but also in how they represent extreme outcomes. We also observe this trade-off in terms of the models’ single-point accuracy metrics. When evaluating the mean of the probabilistic predictions using the Nash–Sutcliffe efficiency (NSE), none of the probabilistic approaches outperform the deterministic LSTM in terms of average predictive accuracy. However, a clear advantage emerges when focusing on the tail of the discharge distribution. For the most extreme events (top 0.1% of the sorted discharge values), the deterministic LSTM underestimates more than 90% of observed values (since it provides estimates of an expectation), whereas probabilistic predictions can capture a substantially larger fraction of these extremes within their upper predictive bounds. 
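The tail behavior described here (mean predictions underestimating almost all of the top 0.1% of discharge values, while upper predictive bounds capture far more of them) can be reproduced in a toy setting; the synthetic signal/noise model below is an assumption for illustration, not the Swiss dataset or the authors' LSTMs:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200_000
s = rng.gamma(2.0, 1.0, size=n)           # "predictable" part of discharge
eps = rng.lognormal(0.0, 0.5, size=n)     # multiplicative noise
obs = s * eps                             # synthetic discharge

mean_pred = s * np.exp(0.5 * 0.5**2)      # deterministic model: conditional mean
q99_pred = s * np.exp(0.5 * 2.326)        # probabilistic model: 99th-percentile bound

coverage = np.mean(obs <= q99_pred)                       # ~0.99 by construction
extreme = obs >= np.quantile(obs, 0.999)                  # top 0.1% of values
under_mean = np.mean(obs[extreme] > mean_pred[extreme])   # extremes above the mean
captured = np.mean(obs[extreme] <= q99_pred[extreme])     # extremes inside the bound
print(round(coverage, 3), round(under_mean, 2), round(captured, 2))
```

The point of the sketch is qualitative: a well-calibrated upper quantile necessarily contains far more of the extreme observations than a conditional-mean estimate can.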

Building on the additional information provided by probabilistic runoff predictions, we further show how such forecasts can be translated into discrete and actionable flood warnings using reinforcement learning. To this end, we introduce a Flood Risk Communication Agent (FRiCA) that operates on probabilistic runoff predictions and learns decision rules for issuing warnings of varying intensity. The FRiCA is implemented as an LSTM-based policy network and is trained by rewarding correct warning levels while penalizing the underestimation of flood severity. Results indicate that the FRiCA outperforms simple fixed heuristics, such as issuing warnings based on the predictive mean or a fixed high quantile (e.g., the 99th percentile). While this behavior already demonstrates the potential of reinforcement learning for improved flood risk communication, it also motivates future work toward more flexible and context-dependent decision strategies that adapt to varying hydrological and societal contexts.

How to cite: Baste, S., Lerch, S., Klotz, D., and Loritz, R.: Improving Flood Prediction and Warning through Probabilistic Deep Learning and Reinforcement Learning, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-3440, https://doi.org/10.5194/egusphere-egu26-3440, 2026.

11:00–11:10
|
EGU26-16720
|
ECS
|
On-site presentation
Is it ready to apply Large Language Models to frontline hydro practice? Taking the flooding forecasting agent as an example
Baoying Shan, Qingyi Yang, Jia Feng, Shunan Zhou, Xun Zhang, Xudong Zhou, Haiqing Pu, Siqian Qiu, Yongkang Xu, Xu Shan, Xiaoyi Dong, Nuo Lei, Haiyang Qian, Bing Li, and Carlo De Michele

The rapid advancement of Large Language Models (LLMs) has triggered transformative changes across many domains, yet their application in operational hydrology forecasting remains largely unexplored. This raises a question: can LLMs meaningfully support frontline hydrological practice?

Flood forecasting provides an ideal testbed for this question. In operational settings, real-time forecasting relies heavily on forecasters' subjective judgment: interpreting meteorological patterns, assessing antecedent soil moisture, and making rapid decisions under deep uncertainty. While numerical hydrological models provide quantitative process simulations, making this expert-judgment component systematic and scalable remains challenging. Moreover, operational demands for around-the-clock availability and consistent quality strain limited labour capacity.

Building on recent LLM advances, we present an intelligent flood forecasting agent that bridges this gap. The system integrates LLM reasoning capabilities with structured hydrological workflows, combining professional reproducibility with adaptive flexibility. A natural language interface enables forecasters to interact using everyday expressions, substantially lowering adoption barriers. The agent is currently undergoing systematic testing in a representative catchment. Preliminary results demonstrate promising consistency and robustness.

 

How to cite: Shan, B., Yang, Q., Feng, J., Zhou, S., Zhang, X., Zhou, X., Pu, H., Qiu, S., Xu, Y., Shan, X., Dong, X., Lei, N., Qian, H., Li, B., and De Michele, C.: Is it ready to apply Large Language Models to frontline hydro practice? Taking the flooding forecasting agent as an example, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-16720, https://doi.org/10.5194/egusphere-egu26-16720, 2026.

11:10–11:20
|
EGU26-11131
|
On-site presentation
Ask me anything: Toward open-purpose modeling in hydrology
Fedor Scholz, Uwe Ehret, and Anneli Guthke

The recent success of large language models stems in part from the foundation model approach. Foundation models are trained to learn general representations that can be adapted to a range of downstream tasks with little to no retraining. Not having to train a new model from scratch for each task saves resources and accelerates scientific discovery. In this contribution, we present a foundation model approach for probabilistic multivariate geoscientific time series modeling. The proposed neural network architecture learns the joint distribution of multivariate hydrological time series data. This is achieved by training the model to infer subsets of target variables from subsets of predictor variables in an alternating manner. Thereby, the model learns to generate conditional predictions of any involved variable from whatever variables are available. This includes the standard task of predicting discharge from precipitation, but also allows backward inference of variables upstream in the causal pathway. Such anticausal modeling is inherently uncertain. Our approach acknowledges this by its probabilistic variational inference design. We train and evaluate our model on a detailed, heterogeneous, real-world hydrological dataset. We investigate the model's ability to capture dependencies among multiple time series and to accurately reconstruct missing variables with calibrated uncertainty estimates. Furthermore, we compare the performance of our open-purpose model to that of multiple traditional single-purpose models trained for specific inference tasks. Our results suggest that the foundation model approach is feasible in hydrology and allows resource-efficient modeling across diverse inference tasks.
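The alternating predictor/target scheme can be sketched as random subset sampling over the variable set; the variable list and the 0.5 masking probability below are illustrative assumptions, not the paper's configuration:

```python
import numpy as np

rng = np.random.default_rng(3)
variables = ["precip", "temperature", "soil_moisture", "discharge"]  # illustrative set
n_vars = len(variables)

def sample_task(rng):
    """Draw a random predictor/target split, so the model learns p(targets | predictors)
    for arbitrary subsets -- including 'anticausal' splits like discharge -> precip."""
    is_target = rng.random(n_vars) < 0.5
    if is_target.all() or not is_target.any():   # keep both sides non-empty
        return sample_task(rng)
    predictors = [v for v, t in zip(variables, is_target) if not t]
    targets = [v for v, t in zip(variables, is_target) if t]
    return predictors, targets

for _ in range(3):
    print(sample_task(rng))
```

Training on many such random splits is what lets a single model later answer any conditional query, rather than only the fixed task it was built for.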

How to cite: Scholz, F., Ehret, U., and Guthke, A.: Ask me anything: Toward open-purpose modeling in hydrology, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-11131, https://doi.org/10.5194/egusphere-egu26-11131, 2026.

11:20–11:30
|
EGU26-19179
|
On-site presentation
A Physics-consistent Foundation Model for Learning Earth Surface Dynamics
Qingsong Xu, Jonathan L. Bamber, Paul Bates, and Xiao Xiang Zhu
Accurate prediction of climate-driven land–surface responses is crucial for effective natural resource management, hazard mitigation, and adaptation to growing societal pressures. Existing environmental models, including both process-based approaches and task-specific machine learning methods, often exhibit limited spatial transferability due to sparse observations, structural rigidity, or sensitivity to non-stationary climate conditions. Recently, foundation models have demonstrated emergent capabilities that surpass those of task-specific systems, offering a unified paradigm adaptable to diverse Earth surface processes. However, most existing Earth foundation models (e.g., TerraMind, Prithvi, DOFA, Pangu, and Aurora) primarily scale model size without adequately addressing computational efficiency or embedding the intrinsic physical laws within the large data.

We introduce EarthDynamics, a physics-consistent foundation model for learning Earth surface dynamics that integrates physical priors with computational efficiency. EarthDynamics comprises three interrelated components. First, multi-modal encoding schemes are developed to jointly represent dynamic meteorological forcings, such as precipitation and temperature, and static geophysical attributes, including watershed properties and terrain characteristics. Second, a physics-consistent Transformer architecture is designed to explicitly embed physical constraints, including conservation laws and first-order derivatives, within the pretraining framework, thereby enhancing generalization, improving computational efficiency, and reducing dependence on large training datasets. Third, task-specific head networks enable multi-scale and multi-task inference of key environmental variables, including water levels, streamflow, and landslide occurrence.
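One common way to realize such embedded physical constraints, presumably in the spirit of (though not necessarily identical to) the approach described above, is a soft conservation penalty added to the data loss; the water-balance form, variable names, and weighting below are illustrative assumptions:

```python
import numpy as np

def conservation_penalty(precip, evap, runoff, d_storage):
    """Soft physics constraint: penalize violation of the water balance
    P - E - Q = dS/dt. Variable names are illustrative placeholders."""
    residual = precip - evap - runoff - d_storage
    return np.mean(residual ** 2)

def total_loss(pred, obs, precip, evap, d_storage, lam=0.1):
    """Data-misfit term plus a weighted physics-consistency term."""
    data_loss = np.mean((pred - obs) ** 2)
    return data_loss + lam * conservation_penalty(precip, evap, pred, d_storage)

# A prediction that closes the water budget incurs no physics penalty:
P, E, dS = np.array([2.0, 3.0]), np.array([0.5, 1.0]), np.array([0.5, 0.5])
Q_balanced = P - E - dS
print(conservation_penalty(P, E, Q_balanced, dS))
```

A penalty of this kind acts as a regularizer, which is consistent with the claimed benefits of better generalization and reduced dependence on large training datasets.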

Through the integration of these components, EarthDynamics provides a unified and extensible framework for process-informed forecasting across Earth surface systems. The model demonstrates robust performance across a wide range of dynamic tasks, including spatiotemporal simulations of geodynamic processes (e.g., shallow water equations and the Navier–Stokes equations), as well as real-world applications such as flood dynamics, landslide dynamics, rainfall–runoff process, and soil moisture forecasting. EarthDynamics consistently outperforms state-of-the-art supervised learning approaches and fine-tuned vision-based foundation models. EarthDynamics has the potential to serve as foundational infrastructure for water resource management, flood risk assessment, and environmental protection, enabling reliable and scalable predictions under climate change from regional to global scales.

How to cite: Xu, Q., Bamber, J. L., Bates, P., and Zhu, X. X.: A Physics-consistent Foundation Model for Learning Earth Surface Dynamics, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-19179, https://doi.org/10.5194/egusphere-egu26-19179, 2026.

11:30–11:35
Interpretability, Explainability, and Diagnostics
11:35–11:45
|
EGU26-13955
|
On-site presentation
High Skill, Shallow Learning: Why Hydrological Routing Is Not Learnable from Raw Time Series by LSTMs
Hans Korving

Deep learning models are increasingly used for operational river discharge forecasting, yet it remains unclear which hydrological processes their internal representations actually encode. Here, we show that high forecast skill can arise even when hydrological routing dynamics are statistically hidden in raw discharge time series and therefore not learnable by LSTM models.

In a multi-station river network, we find that the discharge field is overwhelmingly dominated by a synchronous storage (“bathtub”) mode, while routing-related variability is confined to weak components that are masked by noise and synchrony. Inter-station delays are small relative to this dominant variability, causing propagation signals to be effectively indistinguishable in the raw time series.

We demonstrate this using a sequence of pre-model diagnostics. Principal component analysis (PCA) shows that nearly all variance is explained by the synchronous storage mode. Cross-correlation analysis and signal-to-noise ratio (SNR) diagnostics confirm that routing signals have low visibility relative to dominant low-frequency variability. When the data are transformed into an innovation representation using a vector autoregressive (VAR) model, routing-related structure becomes more apparent, indicating that it is masked rather than absent.
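The PCA diagnostic can be reproduced on synthetic data: a strong shared low-frequency "bathtub" mode plus a weak lagged (routing-like) component. All magnitudes below are invented to illustrate the masking effect, not taken from the study:

```python
import numpy as np

rng = np.random.default_rng(4)
n_t, n_stations = 5000, 4
# Shared low-frequency storage mode (smoothed noise) common to all stations,
# plus a weak routed signal delayed by one step per station downstream.
common = np.convolve(rng.normal(size=n_t + 50), np.ones(50) / 50, mode="valid")[:n_t]
routed = 0.05 * rng.normal(size=n_t)
Q = np.empty((n_t, n_stations))
for s in range(n_stations):
    Q[:, s] = common + np.roll(routed, s)

evals = np.linalg.eigvalsh(np.cov(Q.T))[::-1]   # eigenvalues, descending
explained = evals / evals.sum()
print(np.round(explained, 3))                   # PC1 dominates -> routing is masked
```

With one component carrying nearly all variance, the lagged structure sits in the residual modes, which is exactly the identifiability problem the abstract describes.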

Consistent with these data-space constraints, LSTM models trained on raw discharge time series achieve high predictive skill by exploiting short-term correlations and high-SNR inputs rather than learning propagation dynamics. SHAP attribution analysis shows that the same correlation-driven features dominate predictions across all forecast horizons, with increasing attribution at longer lead times reflecting growing uncertainty rather than newly learned hydrological structure. More generally, this implies that claims of physical learning by data-driven models require that the relevant dynamics are statistically identifiable in the data; model complexity and interpretability cannot recover processes that are masked by dominant variability.

These results demonstrate a clear separation between predictability and learnability: when synchronous variability dominates the data, routing dynamics are statistically inaccessible to sequence models trained on raw time series. They highlight a common but often implicit assumption in recent machine-learning applications: that the chosen data representation already exposes the relevant physical structure. In Earth system applications, this assumption frequently fails. Without pre-model identifiability checks, increasing architectural complexity primarily reinforces dominant shortcuts rather than revealing new process information, leading to models that are inherently sensitive to distributional shift and brittle under non-stationary conditions.

How to cite: Korving, H.: High Skill, Shallow Learning: Why Hydrological Routing Is Not Learnable from Raw Time Series by LSTMs, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-13955, https://doi.org/10.5194/egusphere-egu26-13955, 2026.

11:45–11:55
|
EGU26-20642
|
ECS
|
On-site presentation
Buse Onay and Stefan Kollet

The Earth system is characterized by complex, nonlinear interactions where the combination of multiple drivers can lead to extreme or compound events with significant impacts. While traditional statistical methods often struggle to capture these multivariate dependencies, deep learning models have emerged as powerful tools for forecasting hydro-climatic time series. However, their utility in Earth system science is currently limited by a lack of transparency: while ML/DL methods are useful for predicting extremes, they offer limited explainability of the underlying physical mechanisms or compound drivers. Furthermore, standard interpretability techniques applied to geophysical data are often misleading, as they tend to highlight dominant seasonal cycles rather than the dynamic, event-specific interactions that are crucial for scientific discovery. This research proposes a diagnostic framework that repurposes the internal decision-making process of an attention-based encoder-decoder LSTM as a hypothesis generation tool, specifically targeting the latent drivers of extreme and compound events, exemplified here by drought.

Using multivariate Terrestrial System Modeling Platform simulation data, we trained an attention-based encoder-decoder LSTM where 14 climatological variables serve as both input features and prediction targets in round robin training experiments, generating a comprehensive 14×14 matrix of target-specific attention maps. To transition from predictive modeling to physical interpretation, we apply a post-hoc analysis pipeline to deseasonalize the model's attention weights, which effectively filters out the model’s background behavior, isolating time periods of anomalies. We hypothesize that these anomalies, specifically the extreme 1st and 99th percentile attention, signal instances where standard linear relationships break down. This forces the model to rely on complex, transient feature interactions to maintain predictive accuracy. 

In order to understand the complex dynamics of these events and to disentangle the driving factors from the resultant effects, we employed stacked time series visualisations with multi-scale event windows (±15 and ±90 days). We compared the attention anomalies directly against the anomalies from the simulation results. This granular approach identified distinct attention signatures, revealing dynamic shifts in feature importance, such as an increased focus on surface sensible heat flux and pressure, which were specific to anomalous periods. While our analysis is mainly focused on drought evolution, these synchronized shifts suggest a capacity to reveal the multi-driver interactions of compound events. Consistent patterns across historical events demonstrate that the model’s reliance on specific inputs spikes significantly during these windows, effectively isolating potential compound drivers. By pinpointing exactly when and where system dynamics shift, this framework transforms the LSTM from a passive predictor into an active tool for scientific discovery. It provides domain scientists with targeted starting points for studying the physical precursors of compound climate extremes.
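
As a minimal sketch of the deseasonalization and percentile-flagging steps described above (synthetic data and a simple day-of-year climatology standing in for the authors' actual pipeline):

```python
import numpy as np

def deseasonalize(series, doy):
    # subtract the mean value for each calendar day (a simple climatology)
    clim = np.array([series[doy == d].mean() for d in range(1, 366)])
    return series - clim[doy - 1]

rng = np.random.default_rng(0)
doy = np.tile(np.arange(1, 366), 10)                      # ten synthetic "years"
attn = np.sin(2 * np.pi * doy / 365) + 0.1 * rng.standard_normal(doy.size)
anom = deseasonalize(attn, doy)                           # seasonal cycle removed

# flag the extreme 1st/99th-percentile attention anomalies
lo, hi = np.percentile(anom, [1, 99])
flags = (anom < lo) | (anom > hi)
```

Removing the climatology leaves only the anomalous behaviour, so the flagged time steps isolate the event-specific attention shifts rather than the dominant seasonal cycle.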

How to cite: Onay, B. and Kollet, S.: Deseasonalized Attention for Scientific Discovery of Extreme and Compound Climate Events, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-20642, https://doi.org/10.5194/egusphere-egu26-20642, 2026.

11:55–12:05
|
EGU26-13847
|
ECS
|
On-site presentation
Ara Bayati, Ali A Ameli, and Saman Razavi

Deep learning rainfall–runoff models can achieve high predictive accuracy, yet still rely on correlation-driven shortcuts that are not defensible as catchment-scale mechanisms. This raises a central question: how far can correlation-driven learning be trusted to produce simulations that are hydrologically realistic, not just statistically accurate? To address this, we evaluate functional realism, defined as the extent to which a model’s internal functioning aligns with defensible mechanisms of streamflow generation. We propose a hydrology-specific Explainable AI (XAI) framework that extracts nonlinear, lag-dependent, time-varying impulse response functions (IRFs) describing how an LSTM internally maps isolated impulses in precipitation (P), temperature (T), and PET to simulated streamflow. Applied to 672 North American catchments where the LSTM demonstrated strong predictive skill, the IRFs reveal systematic functional inconsistencies masked by accuracy: in over 70% of rain-dominated catchments, short-term rises in T are associated with increased simulated streamflow and enhanced celerity even without rainfall; in snow-dominated catchments, PET is frequently treated as a proxy driver of snowmelt-related flow. We then discuss plausible origins of spurious functional learning, including seasonal confounding, heterogeneous regime mixing during training, simplicity bias (shortcut learning), and omitted drivers or missing processes. We also outline practical routes to reduce spurious learning by directly addressing these sources through input handling, regime-aware training, and targeted model adjustments.
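
The impulse-response idea can be illustrated with a toy linear reservoir standing in for the trained LSTM; the model, forcing, and function names below are illustrative assumptions, not the authors' framework:

```python
import numpy as np

def toy_model(P, k=0.1):
    # toy linear reservoir standing in for a trained rainfall-runoff model:
    # each step, storage gains precipitation and releases a fraction k as flow
    S, Q = 0.0, []
    for p in P:
        S += p
        q = k * S
        S -= q
        Q.append(q)
    return np.array(Q)

def impulse_response(model, forcing, t0, delta=1.0, horizon=20):
    # empirical IRF: difference between a perturbed and a baseline simulation
    # after adding an isolated impulse of size delta at time t0
    pert = forcing.copy()
    pert[t0] += delta
    return (model(pert) - model(forcing))[t0:t0 + horizon] / delta

P = np.full(100, 2.0)                 # steady synthetic precipitation
irf = impulse_response(toy_model, P, t0=50)
```

For the toy reservoir the extracted IRF decays geometrically and its sum approaches unity (mass conservation); applied to an LSTM, deviations from such physically defensible shapes are exactly the functional inconsistencies the abstract diagnoses.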

How to cite: Bayati, A., Ameli, A. A., and Razavi, S.: Interrogating the Functional Realism of Deep Learning Rainfall–Runoff Models: Diagnostic Insights and Mitigation Strategies, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-13847, https://doi.org/10.5194/egusphere-egu26-13847, 2026.

12:05–12:25
|
EGU26-3597
|
solicited
|
On-site presentation
James Kirchner

Deep learning models, such as long short-term memory (LSTM) networks, are becoming widely adopted as the tool of choice for rainfall-runoff forecasting, reflecting their impressive performance in goodness-of-fit tests. Nonetheless, it remains unclear exactly how this impressive performance is achieved, and concerns have been raised regarding the functional realism embedded in such models (Bayati et al., 2026) and their ability to extrapolate beyond the range of their training data (Baste et al., 2025). An underlying problem (with both machine learning models and conventional mechanistic models) is that they are trained and tested almost exclusively using goodness-of-fit measures relative to observed discharge time series. Such goodness-of-fit tests emphasize some aspects of model behavior but obscure others.

Thirty years ago, Kirchner et al. (1996) proposed a more diagnostic approach to model evaluation, in which the relationships of primary interest are statistically extracted from both the model behavior and the real-world data, and then compared.  When carefully done, this can highlight relationships of interest between the relevant forcing factors and outcome variables.  Here I illustrate this approach by comparing LSTM behavior with real-world rainfall-runoff relationships, using nonlinear and nonstationary impulse response functions from Ensemble Rainfall-Runoff Analysis (ERRA).  These impulse response functions are analogous to classical unit hydrographs, but with the important distinction that they can depend nonlinearly on precipitation intensity and antecedent wetness or other time-varying attributes.  They serve as dynamic fingerprints of how measured and modeled streamflows respond to precipitation, and how that response is shaped by ambient conditions and catchment characteristics.  Examples of this approach, and insights derived from it, will be presented.

How to cite: Kirchner, J.: Assessing behavior and performance of deep learning models using dynamic fingerprints of hydrological behavior, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-3597, https://doi.org/10.5194/egusphere-egu26-3597, 2026.

12:25–12:30

Posters on site: Tue, 5 May, 16:15–18:00 | Hall A

The posters scheduled for on-site presentation are only visible in the poster hall in Vienna. If authors uploaded their presentation files, these files are linked from the abstracts below.
Display time: Tue, 5 May, 14:00–18:00
Chairpersons: Maria-Luisa Taccari, Eduardo Acuna, Riccardo Taormina
A.56
|
EGU26-19178
|
ECS
Basil Kraft, Martina Kauzlaric, William H. Aeberhard, Massimiliano Zappa, and Lukas Gudmundsson

We introduce DROP (Deep Runoff Prediction and Propagation), a scalable deep learning framework for spatially distributed runoff simulation and river routing across large hydrological networks. Reliable representation of runoff generation and streamflow propagation is critical for hydrological forecasting and water resources management, yet remains computationally challenging at high spatial resolution. DROP addresses this challenge by jointly learning runoff dynamics and downstream flow propagation within a single, spatially explicit modeling framework.

The model is trained on daily discharge observations from 273 gauged catchments in Switzerland, covering more than 22,000 drainage units at approximately 2 km² resolution. Using static drainage unit attributes and meteorological forcings, DROP predicts local runoff and routes flow through the river network. The architecture is designed for computational efficiency and generalization across very diverse hydrological regimes, enabling domain-wide simulations without basin-specific calibration.

Evaluation across multiple spatial experiments shows that DROP substantially outperforms baseline deep learning models (lumped LSTMs), achieving relative improvements of up to 60 % in discharge performance metrics (Kling–Gupta Efficiency; KGE) for catchments not seen during training. The model enables rapid inference, allowing simulation of daily discharge over the full domain within seconds on a single GPU. These results demonstrate that spatially explicit deep learning models can provide accurate, efficient, and scalable alternatives to traditional hydrological models for large-scale runoff simulation and river routing, with strong potential for integration into operational forecasting and Earth system modeling frameworks.
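
Joint runoff generation and routing can be illustrated with a hypothetical five-unit network; DROP learns both components end-to-end, whereas this sketch only shows the downstream accumulation step:

```python
import numpy as np

# hypothetical toy network: downstream[i] gives the receiving unit, -1 marks the outlet
#   0 --+
#       +-- 2 --+
#   1 --+       +-- 4 (outlet)
#       3 ------+
downstream = np.array([2, 2, 4, 4, -1])
local_runoff = np.array([1.0, 2.0, 0.5, 1.5, 0.25])   # runoff generated per drainage unit

def route(local, downstream):
    # accumulate local runoff through the network, most upstream units first
    def dist_to_outlet(i):
        d = 0
        while downstream[i] != -1:
            i, d = downstream[i], d + 1
        return d

    total = local.astype(float).copy()
    for i in sorted(range(len(local)), key=dist_to_outlet, reverse=True):
        if downstream[i] != -1:
            total[downstream[i]] += total[i]
    return total

accumulated = route(local_runoff, downstream)
```

Processing units in upstream-to-downstream order means each unit is visited once, which is what makes domain-wide routing over tens of thousands of units tractable.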

How to cite: Kraft, B., Kauzlaric, M., Aeberhard, W. H., Zappa, M., and Gudmundsson, L.: A scalable, spatially distributed approach to runoff simulation and river routing, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-19178, https://doi.org/10.5194/egusphere-egu26-19178, 2026.

A.57
|
EGU26-4532
|
ECS
Inmaculada González Planet and Carmelo Juez

The use of artificial neural networks (ANNs) in hydrological modelling has gained increasing popularity due to their ability to represent non-linear relationships and complex system dynamics.  In particular, Long Short-Term Memory (LSTM) networks have become the state-of-the-art approach for streamflow simulations, as they incorporate memory cells and gating mechanisms capable of learning both short- and long-term dependencies. However, standard LSTM models have limitations for predicting extreme high-flow events and suffer from limited interpretability due to their lack of explicit physical grounding.

The Mass-Conserving LSTM (MC-LSTM) is a variant of the standard LSTM architecture designed to address the lack of physical consistency by embedding mass conservation directly into the internal model structure. Hence, the information stored in MC-LSTM cell states is expected to correspond more directly to hydrological processes contributing to the basin water balance.

This study analyses and compares the internal processes of standard LSTM and MC-LSTM networks trained on four snowmelt-dominated watersheds located in the Central Spanish Pyrenees. We first evaluate the ability of both models to conserve water volume, showing that the MC-LSTM maintains volumetric consistency due to the imposed physical constraint, whereas the standard LSTM exhibits substantial discrepancies between observed and simulated volumes. We then investigate the learning behaviour of the MC-LSTM using two independent physical datasets not included as model inputs: snow and evapotranspiration (ETO), both of which play a key role in the local water balance. Using a wavelet-based methodology, snow-cells and ETO-cells are identified within the MC-LSTM cell state. Snow-cells exhibit Pearson correlations exceeding 0.5 across all watersheds, while ETO-cells reproduce the observed variability despite low temporal correlation. Furthermore, ETO-cells show a limited contribution to the model output, consistent with their physical role as water losses.

Overall, this analysis highlights the limitations of standard LSTM models in representing volumetric consistency and physical conservation processes, while demonstrating the enhanced physical interpretability of the MC-LSTM architecture, which achieves comparable or superior performance to standard LSTM models while preserving hydrological coherence.
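
The cell-identification step can be approximated with plain Pearson correlation (the study itself uses a wavelet-based methodology); the cell states and snow series below are synthetic stand-ins:

```python
import numpy as np

rng = np.random.default_rng(1)
T, n_cells = 1000, 8
snow = np.cumsum(rng.standard_normal(T))                  # synthetic snow-storage-like signal
cells = rng.standard_normal((T, n_cells))                 # hypothetical MC-LSTM cell states
cells[:, 3] = 0.8 * snow + 0.2 * rng.standard_normal(T)   # one cell tracks the snow signal

# identify "snow-cells": cells whose Pearson r with the external snow series exceeds 0.5
r = np.array([np.corrcoef(cells[:, i], snow)[0, 1] for i in range(n_cells)])
snow_cells = np.flatnonzero(np.abs(r) > 0.5)
```

Because the snow series is not a model input, a strong correlation with an internal cell state is evidence that the network has learned a physically meaningful storage, not merely memorized forcing data.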

Acknowledgments: This work is funded by the European Research Council (ERC) through the Horizon Europe 2021 Starting Grant program under REA grant agreement number 101039181-SED@HEAD.

How to cite: González Planet, I. and Juez, C.: Interpreting LSTM and MC-LSTM internal states with hydrological physics, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-4532, https://doi.org/10.5194/egusphere-egu26-4532, 2026.

A.58
|
EGU26-10485
Andras Bardossy and Ralf Loritz

Long Short-Term Memory (LSTM) networks are widely used for rainfall–runoff modelling and have demonstrated strong performance in regional applications. A key advantage of LSTMs is their ability to learn from large samples of catchments, in contrast to traditional approaches that rely on individual catchment-by-catchment calibration. The objective of this study is to assess the robustness of regional LSTM models with respect to their behaviour at the individual catchment scale. To this end, ensemble training and simulation experiments were conducted using the CAMELS-GB and CAMELS-US datasets. An identical LSTM architecture was trained 100 times with different random weight initializations, and model performance was evaluated separately for each catchment. For a substantial number of basins, model performance varied strongly across realizations, with considerably larger variability observed for the CAMELS-US dataset. Excluding catchments with known data quality issues or highly nonlinear responses led only to minor improvements and a modest reduction in performance spread. Furthermore, large differences between validation and test performance were frequently observed, indicating that model skill is often not stable across evaluation periods for individual catchments. The results indicate that uncertainty estimates derived from ensembles of random initializations appear overconfident and do not reflect the full epistemic uncertainty.

How to cite: Bardossy, A. and Loritz, R.: Assessing robustness and uncertainty in rainfall–runoff modelling using LSTM ensembles, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-10485, https://doi.org/10.5194/egusphere-egu26-10485, 2026.

A.59
|
EGU26-15984
Jasper Vrugt and Jonathan Frame

Gradient-based methods are increasingly used in hydrologic model calibration, data assimilation, and hybrid physics–machine learning frameworks. However, most existing approaches rely on finite differences, automatic differentiation, or surrogate emulators, which are computationally expensive, memory-intensive, and sensitive to numerical noise, especially for long time series and nontraditional objective functions. We present a general framework for exact, scalable gradient computation in conceptual hydrologic models based on analytic forward sensitivity equations. By augmenting the governing ODEs with sensitivity states, a single model integration simultaneously yields hydrologic states, fluxes, and the full parameter Jacobian. These sensitivities are independent of the objective function, enabling exact gradients for any differentiable loss, including least squares, absolute residuals, NSE, KGE, flow-duration-curve metrics, and robust M-estimators, without re-running the model or invoking automatic differentiation. We implement this approach in a suite of widely used conceptual models (including HBV, HYMOD, Hmodel, GR4J, SAC-SMA, and Xinanjiang) within a unified computational framework with a high-performance C++ core and MATLAB/Python interfaces. We demonstrate its scalability using a large-sample experiment based on the CAMELS data set, comprising 671 catchments across the contiguous United States. Compared to automatic and numerical differentiation, our approach reduces calibration times from hours to minutes while improving numerical stability, convergence behavior, and interpretability. This work establishes analytic forward sensitivities as a transparent, physics-consistent, and computationally efficient foundation for large-sample hydrology and process-based model learning.
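
The forward-sensitivity idea can be sketched on a single linear reservoir (dS/dt = P − kS, Q = kS): augmenting the discrete update with the state dS/dk yields an exact gradient of any differentiable loss in one integration. This toy model and its names are an illustration, not the authors' implementation:

```python
import numpy as np

def simulate_with_sensitivity(P, k, dt=1.0):
    # linear reservoir S' = P - k*S, Q = k*S, augmented with the forward
    # sensitivity state dSdk = dS/dk, which obeys dSdk' = -S - k*dSdk
    S, dSdk = 0.0, 0.0
    Q, dQdk = np.empty(len(P)), np.empty(len(P))
    for t, p in enumerate(P):
        S_next = S + dt * (p - k * S)
        dSdk = dSdk + dt * (-S - k * dSdk)   # exact derivative of the discrete update
        S = S_next
        Q[t] = k * S
        dQdk[t] = S + k * dSdk               # product rule on Q = k*S
    return Q, dQdk

rng = np.random.default_rng(3)
P = rng.gamma(2.0, 1.0, 200)                              # synthetic precipitation
obs = simulate_with_sensitivity(P, 0.35)[0]               # synthetic "observations"

k = 0.25
Q, dQdk = simulate_with_sensitivity(P, k)
grad = np.sum(2.0 * (Q - obs) * dQdk)                     # exact dL/dk for L = sum of squared errors

# verify against a central finite difference
eps = 1e-6
lo = np.sum((simulate_with_sensitivity(P, k - eps)[0] - obs) ** 2)
hi = np.sum((simulate_with_sensitivity(P, k + eps)[0] - obs) ** 2)
fd = (hi - lo) / (2 * eps)
```

Because dQdk is independent of the objective, the same sensitivities give exact gradients for NSE, KGE, or any other differentiable loss without re-running the model.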

How to cite: Vrugt, J. and Frame, J.: Exact and scalable gradient-based learning of conceptual hydrologic models using analytic forward sensitivities, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-15984, https://doi.org/10.5194/egusphere-egu26-15984, 2026.

A.60
|
EGU26-746
|
ECS
Pattabiraman Balasundaram and Kasiapillai S Kasiviswanathan

Quantifying reliable uncertainty information in streamflow forecasts is essential for informed decision-making in water resources management and operation. Conventional deterministic forecasts often fall short for decision-making and overlook aleatoric uncertainty in nonstationary hydrological behavior. While hydrological models (conceptual, process-based, empirical) can represent the underlying physical processes, deep learning models offer higher forecast accuracy, substituting complex neural structures for the explicit representation of those processes. This paper presents a hybrid deep learning (DL) approach to construct reliable prediction intervals (PIs) for streamflow predictions, optimized through two novel objective functions. We apply the variational mode decomposition (VMD) technique to the target streamflow series to capture its nonstationary features and thus improve predictive accuracy. The predictions of each decomposed mode are then reconstructed using constrained particle swarm optimization (PSO). The developed approach is tested using a Long Short-Term Memory (LSTM) model over the Contiguous United States (CONUS) under various hydrological settings: i) PI-LSTM with dual objective functions (with and without Data Integration), and ii) PI-LSTM-VMD with dual objective functions (with and without Data Integration). The proposed frameworks yield reliable predictions, achieving median Nash–Sutcliffe efficiencies (NSE) of 0.91 and 0.87 for PI-LSTM (with Data Integration) and PI-LSTM-VMD (with Data Integration), respectively, along with median coverage probabilities above 90% in both cases. Performance was robust across basins, with relatively narrow prediction intervals (relative average width) under 0.9 in both cases.
Although LSTM networks benefit substantially from data integration (DI), the proposed frameworks performed comparatively poorly without DI, which further emphasizes the need to guide deep learning models with informative data inputs.
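
The coverage and width metrics reported above can be computed as sketched below; the interval construction here is a simple synthetic stand-in for the PI-LSTM outputs:

```python
import numpy as np

def pi_metrics(lower, upper, obs):
    # PICP: share of observations falling inside the interval;
    # relative average width: mean interval width scaled by the observed range
    picp = np.mean((obs >= lower) & (obs <= upper))
    raw = np.mean(upper - lower) / (obs.max() - obs.min())
    return picp, raw

rng = np.random.default_rng(4)
obs = rng.gamma(2.0, 5.0, 1000)                           # synthetic streamflow
pred = obs + rng.normal(0.0, 2.0, 1000)                   # hypothetical point forecast
lower, upper = pred - 4.0, pred + 4.0                     # fixed-width interval for illustration
picp, raw = pi_metrics(lower, upper, obs)
```

A good interval scores high on coverage (PICP near the nominal level) while keeping the relative average width small; optimizing both simultaneously is what the dual objective functions in the abstract target.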

How to cite: Balasundaram, P. and Kasiviswanathan, K. S.: A Hybrid Deep Learning Approach: Constructing prediction intervals for Streamflow forecasting, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-746, https://doi.org/10.5194/egusphere-egu26-746, 2026.

A.62
|
EGU26-7509
|
ECS
Ashish Kumar, Sonja Jankowfsky, Edom Moges, Arno Hilberts, Shuangcai Li, and Anongnart Assteerawatt

Machine learning (ML) techniques are transforming hydrological modeling, yet their ability to predict extreme streamflow events remains uncertain. Among these techniques, Long Short-Term Memory (LSTM) networks have emerged as a powerful tool for streamflow prediction, capable of capturing complex temporal dynamics and long-term dependencies inherent in hydrological data. In this study, we aim to identify the factors that influence the upper limit of discharge values simulated by LSTM models—a critical aspect for improving extreme event prediction. This limit is shaped by multiple considerations, including the diversity and quality of training data, model architecture, and optimization objectives. Data preprocessing and calibration strategies further impact performance, while challenges such as input biases and insufficient emphasis on rare events can constrain the model’s ability to capture extremes. Ultimately, predictions remain bounded by physical laws and theoretical principles, ensuring outputs are credible and consistent with real-world hydrological behavior. Understanding these factors provides valuable insights for enhancing model robustness, improving flood risk assessment, and guiding the development of scalable approaches for simulating extreme hydrological events under changing climate conditions.

How to cite: Kumar, A., Jankowfsky, S., Moges, E., Hilberts, A., Li, S., and Assteerawatt, A.: Investigating Factors Influencing Upper Bound Performance of ML-Based Streamflow Simulations., EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-7509, https://doi.org/10.5194/egusphere-egu26-7509, 2026.

A.63
|
EGU26-15905
|
ECS
Lucy Myrol and Jan Adamowski

Long Short-Term Memory (LSTM) networks have recently emerged as a leading deep learning architecture for hydrological forecasting due to their ability to represent nonlinear and long-term dependencies in time series data. However, the selection of input variables and temporal lags for LSTM networks is often heuristic and characterized by the inclusion of all available forcings and wide lag windows. This practice can yield over-parameterized models that are prone to overfitting. Causal discovery–based feature selection offers a principled alternative to heuristic input configuration. While these methods have shown promise in improving model interpretability and generalization in statistical and machine learning contexts, their integration with deep learning architectures remains underexplored. Here, we present a workflow that integrates causal inference for time series as a preprocessing step for LSTM-based hydrological forecasting in an operational hydropower context. Using subdaily multivariate hydroclimatic time series from the Lake Erie basin in Ontario, Canada, we apply the PC-MCI algorithm to infer directed causal relationships and characteristic temporal lags among streamflow, lake levels, meteorological forcings, and hydropower-relevant predictor variables. The resulting causal graphs provide a model-agnostic, interpretable basis for defining the predictor sets and lag structures that form the input configuration of an encoder–decoder LSTM model. Ongoing work evaluates whether causally informed configurations improve forecast skill and generalization relative to conventional variable‑selection strategies and assesses the computational and operational trade-offs of the proposed workflow.
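
As a much-simplified stand-in for PC-MCI (which uses conditional-independence tests rather than plain correlation), lagged-correlation screening illustrates how characteristic lags can be recovered from multivariate series; all data below are synthetic:

```python
import numpy as np

def select_lags(X, y, max_lag=10, threshold=0.3):
    # keep (variable, lag) pairs whose lagged Pearson correlation with the
    # target exceeds a threshold -- a crude stand-in for PC-MCI link detection
    selected = []
    for j in range(X.shape[1]):
        for lag in range(1, max_lag + 1):
            r = np.corrcoef(X[:-lag, j], y[lag:])[0, 1]
            if abs(r) > threshold:
                selected.append((j, lag))
    return selected

rng = np.random.default_rng(5)
T = 2000
x0 = rng.standard_normal(T)                               # e.g. precipitation
x1 = rng.standard_normal(T)                               # an irrelevant driver
y = np.zeros(T)
y[3:] = 0.9 * x0[:-3] + 0.2 * rng.standard_normal(T - 3)  # target responds to x0 at lag 3
picks = select_lags(np.column_stack([x0, x1]), y, max_lag=5)
```

The selected (variable, lag) pairs then define the predictor set and lag window of the downstream LSTM, replacing the heuristic "all forcings, wide lag window" configuration criticized above.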

How to cite: Myrol, L. and Adamowski, J.: Causally Informed Input and Lag Selection for LSTM-based Hydrological Forecasting in the Lake Erie Basin, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-15905, https://doi.org/10.5194/egusphere-egu26-15905, 2026.

A.64
|
EGU26-1975
|
ECS
Ruixi Zhang

This study presents a robust streamflow forecasting framework based on an Encoder-Decoder LSTM architecture designed for the Dadu River Basin, a major tributary of the upper Yangtze River with a drainage area of 77,700 km² and annual precipitation increasing from 600 to 1500 mm northwest-to-southeast. The model integrates multi-source heterogeneous data, including ERA5-Land reanalysis products, local grid precipitation, and historical runoff observations. A key innovation is the State Transfer Module, which maps compressed historical catchment features into the decoder’s initial state to simulate the transformation from antecedent conditions to future runoff processes. The framework was validated across eight reservoirs on the Dadu River main stem, representing diverse regulation capacities including daily, seasonal (Houziyan), and annual (Pubugou) regulation. During the 2024–2025 test period, the model achieved an average Mean Relative Error (MRE) of 18.2%, significantly outperforming traditional deterministic (24.7%) and similarity-based (21.0%) methods. Specifically, Nash-Sutcliffe Efficiency (NSE) values reached 0.89 at Houziyan and 0.88 at Pubugou, demonstrating superior skill in capturing flood peaks and recession trends. With minute-level training and second-level inference efficiency, this deep learning approach provides a reliable core technology for long-lead (10-day) operational forecasting and cascaded reservoir management.

How to cite: Zhang, R.: Operational Streamflow Forecasting for Cascaded Reservoirs in the Dadu River Basin: A Deep Learning Approach Based on Encoder-Decoder LSTM and Multi-Source Data Integration, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-1975, https://doi.org/10.5194/egusphere-egu26-1975, 2026.

A.65
|
EGU26-2512
Rocco Palmitessa, Connor Chewning, Jakob Luchner, and Elbys Jose Meneses

Neurohydrological models, particularly Long Short-Term Memory (LSTM) networks, are increasingly recognized as valid alternatives to conceptual and physics-based Global Hydrological Models (GHMs). Literature suggests that regionally trained and fine-tuned LSTMs typically outperform models trained exclusively on single catchments. To systematically assess the benefits of different fine-tuning strategies, this study tested three approaches across 60 catchments in the Mekong basin. We subsequently compared historical LSTM forecasts with simulations from DHI’s GHM, a well-calibrated physics-based model. The objective was to generate insights regarding when data-driven models demonstrate superiority over physics-based counterparts and to identify which fine-tuning approach is most effective for this region.

The study utilized ERA5 forcing data and HydroATLAS basin properties formatted to the CAMELS standard, combined with streamflow observations from 60 non-public stations across the Mekong basin. We selected an off-the-shelf LSTM model from the NeuralHydrology package, pre-trained on the global CARAVAN dataset, and applied three distinct fine-tuning strategies: direct fine-tuning of the Global model to Local data (GL), fine-tuning to Regional data (GR), and a two-step process fine-tuning first on Regional and then on Local data (GRL). For each model, we performed a hyperparameter sweep to maximize the Kling-Gupta Efficiency (KGE). The dataset was divided into 15 years for training, followed by 5 years for validation and 5 years for testing. Performance was benchmarked against the DHI-GHM using KGE and Nash-Sutcliffe Efficiency (NSE) metrics.

Analysis indicates that the GL approach yields the highest KGE in nearly half of the basins, while the GRL approach proves superior in the remaining half; notably, the likelihood of GRL being the best-performing approach increases with basin area. Overall, fine-tuning LSTMs on both regional and local streamflow (GRL) improved performance compared to strictly regional (GR) or local fine-tuning (GL), with the median KGE increasing from 0.65 to 0.72. While this result does not fully match the overall accuracy of the DHI-GHM in the test period (median KGE of 0.75), the fine-tuned LSTM outperformed the physics-based model in all catchments with poorly described processes—such as irrigation abstraction and infiltration after overtopping—where the DHI-GHM yielded a KGE below 0.6. In well-calibrated catchments, performance was comparable. Furthermore, the performance gap narrows when expressed in NSE, as the LSTM model outperformed the DHI-GHM in terms of mean NSE, despite a lower median NSE.

These findings suggest that while calibrated physics-based models remain robust, neurohydrological approaches offer distinct advantages in representing complex or unmodeled physical processes. The study highlights that the optimal training strategy is scale-dependent, with multi-step fine-tuning providing greater benefits for larger basins. Ultimately, the ability of LSTMs to outperform traditional models in areas with complex anthropogenic or structural challenges suggests they are a vital, complementary tool for enhancing hydrological predictability.

How to cite: Palmitessa, R., Chewning, C., Luchner, J., and Meneses, E. J.: Benchmarking fine-tuning strategies for LSTM rainfall-runoff models in the Mekong basin, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-2512, https://doi.org/10.5194/egusphere-egu26-2512, 2026.

A.66
|
EGU26-6492
|
ECS
Sandro Hunziker, Nicolas Lazaro, and Tobias Siegfried

Accurate short-term streamflow forecasts are critical for water resources management, hydropower operations, and early warning of hydrological hazards. This need is particularly pronounced in Central Asia, where water is predominantly stored as seasonal snow and glacier ice in the high mountain region and released during the warm season, sustaining extensive irrigated agriculture and hydropower production in the region.  

The Swiss Agency for Development and Cooperation supports the strengthening of the operational hydrological forecasting capabilities of National Hydrometeorological Services across Kazakhstan, Kyrgyzstan, Tajikistan, Turkmenistan, and Uzbekistan (SAPPHIRE Central Asia project). SAPPHIRE co-designs, develops, and operationally deploys open-source forecasting tools that integrate existing and machine-learning-based forecasting methods in these organizations.

As part of this work, we evaluate the performance of three state-of-the-art time-series forecasting architectures—the Temporal Fusion Transformer (TFT), the Time-Series Dense Encoder (TiDE), and the Time-Series Mixer (TSMixer)—for operational 10-day-ahead streamflow prediction across more than 100 gauges in Kazakhstan, Kyrgyzstan, and Tajikistan. 

The models are trained jointly across all basins within each country to enhance spatiotemporal generalization. Probabilistic forecasts are produced using a quantile loss function, thereby representing aleatoric uncertainty. Model skill is assessed against observed discharge and benchmarked against periodic linear regression models for both 5-day and 10-day averaged forecasts. 
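
The quantile (pinball) loss behind such probabilistic forecasts can be sketched as follows; minimizing it over a candidate value recovers the corresponding quantile of the data:

```python
import numpy as np

def pinball_loss(y, q_pred, q):
    # quantile (pinball) loss: penalizes under-prediction with weight q and
    # over-prediction with weight (1 - q), so its minimizer is the q-quantile
    diff = y - q_pred
    return np.mean(np.maximum(q * diff, (q - 1.0) * diff))

rng = np.random.default_rng(6)
y = rng.standard_normal(100_000)
candidates = np.linspace(-3.0, 3.0, 601)
losses = [pinball_loss(y, c, 0.9) for c in candidates]
best = candidates[int(np.argmin(losses))]                 # lands near the 0.9-quantile
```

Training one output head per quantile in this way yields the calibrated prediction intervals that represent aleatoric uncertainty in the forecasts.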

Results indicate that all three deep learning models consistently outperform the existing benchmark approaches, with particularly pronounced improvements at the 10-day forecast horizon. For example, in Kyrgyzstan and Tajikistan, mean absolute errors are reduced by 30–37%. The auto-regressive information from past discharge emerges as the most influential predictor, underscoring its central role in the snow- and glacier-melt-dominated runoff regimes of high-mountain Central Asia.

These advances directly strengthen the forecasting capacity of the Central Asian Hydrometeorological Services and improve the quality of information available to their diverse user base—including national water management authorities responsible for irrigation allocation, hydropower operators optimizing reservoir releases, agencies managing climate-sensitive infrastructure such as roads and airports, and transboundary water management institutions like the Interstate Commission for Water Coordination (ICWC). By demonstrating the operational viability of modern deep learning approaches within existing institutional frameworks, this work contributes to more reliable and actionable hydrological information across one of the world's most water-stressed transboundary regions. 

How to cite: Hunziker, S., Lazaro, N., and Siegfried, T.: Operational Discharge Forecasting in Central Asia using Deep Learning, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-6492, https://doi.org/10.5194/egusphere-egu26-6492, 2026.

A.67
|
EGU26-9230
|
ECS
Bilal Liaqat, Tua Nylén, Ville Kankare, and Petteri Alho

As climate change accelerates, hydrological models are increasingly required to predict water resources under climatic conditions they have never seen before. While modern data-driven approaches, such as machine learning models, have shown higher accuracy in reproducing historical streamflow, their ability to generalize to unseen future climates remains a critical concern. These data-driven models often learn statistical patterns that maximize performance on training data but fail when facing new weather patterns or extreme events. Current research into improving model robustness has largely focused on conceptual models in temperate, rain-dominated catchments. This leaves the applicability of these techniques unverified in high-latitude, snow-dominated catchments such as those in Finland. These regions face distinct challenges, particularly the complex, non-linear processes of snow accumulation and melt. Because these processes are highly sensitive to temperature thresholds, standard data-driven models may struggle to capture them consistently when extrapolating to warmer future conditions. Furthermore, widely used stability techniques have rarely been adapted for the specific architecture of machine learning models. This study proposes to investigate whether integrating residual stability constraints, mathematical penalties that force model errors to remain consistent over time, can improve the transferability of AI models in boreal catchments. Rather than relying solely on minimizing error, we aim to explore training schemes that prioritize time-invariance, ensuring that the model's behavior does not degrade significantly between different climatic periods. We outline a framework to test these stability-based training methods on a large dataset of Finnish catchments.
By comparing standard AI training against stability-constrained approaches, this research aims to determine if trading a small amount of historical accuracy can yield models that are more physically plausible and robust for future climate scenarios. This work seeks to bridge the gap between advanced machine learning techniques and the unique hydrological needs of cold-climate regions.
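The residual-stability idea described above can be sketched as a loss penalty. This is a hypothetical formulation for illustration only; the abstract does not specify the exact form of the constraint:

```python
# Sketch of a residual-stability penalty (hypothetical form). The idea:
# split the training period into climatic blocks and penalize differences
# in mean residual between blocks, so model bias stays time-invariant.

def stability_loss(residuals, block_size, weight=1.0):
    """MSE plus a penalty on between-block shifts in mean residual."""
    mse = sum(r * r for r in residuals) / len(residuals)
    # mean residual per climatic block
    blocks = [residuals[i:i + block_size]
              for i in range(0, len(residuals), block_size)]
    means = [sum(b) / len(b) for b in blocks]
    overall = sum(means) / len(means)
    penalty = sum((m - overall) ** 2 for m in means) / len(means)
    return mse + weight * penalty

# A model whose bias drifts between periods is penalized more than one
# with the same error magnitude but a stable bias.
drifting = [0.5] * 10 + [-0.5] * 10   # bias flips between two periods
stable = [0.5, -0.5] * 10             # same magnitude, no drift
assert stability_loss(drifting, 10) > stability_loss(stable, 10)
```

Under such a penalty, a small increase in MSE can be traded for consistency of the error structure across climatic periods, which is the trade-off the study aims to quantify.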

How to cite: Liaqat, B., Nylén, T., Kankare, V., and Alho, P.: Enhancing the Robustness of Deep Learning Hydrological Models in High-Latitude Catchments, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-9230, https://doi.org/10.5194/egusphere-egu26-9230, 2026.

A.68
|
EGU26-12499
|
ECS
Paul Reis, Alexander Dolich, Antara Dasgupta, Paul Hassenjürgen, Sergiy Vorogushyn, Viet Dung Nguyen, and Ralf Loritz

Deep learning, especially Long Short-Term Memory (LSTM) networks, has become popular in recent years for rainfall-runoff modelling. However, recent studies show that LSTM performance is constrained by a theoretical threshold, limiting the simulation of extreme discharge events. While the internal structure of the LSTM is one contributing factor, another contributor is the limited availability and diversity of hydro-meteorological training data of extremes, as major floods only represent a small fraction of the observed data.

To mitigate the underrepresentation of extreme hydrological events in the training data, this study investigates the effectiveness of data augmentation for rainfall-runoff modelling with LSTMs. Pre-generated artificial meteorological time series from the non-stationary climate-informed weather generator (nsRWG) are used to increase the representation of extreme events in the training data. The study area covers the region of North Rhine-Westphalia, Germany, and consists of 100 alternative precipitation and temperature scenarios spanning the past 70 years. Discharge for the catchments is simulated using an HBV model based on the nsRWG outputs. By combining observed time series from the CAMELS-DE dataset with artificial samples, the training set is enriched with additional extreme events, including samples that are more extreme in magnitude than those in the observed data. This augmented dataset is used to assess whether model performance in predicting extreme events can be improved. We aim to (1) assess whether data augmentation can shift the theoretical threshold limit of the LSTM, (2) quantify this limit, (3) optimize the integration of the weather generator data during training, and (4) evaluate overall predictive performance and, in particular, whether the prediction of extreme floods improves with the augmented training data.
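The augmentation step, enriching the observed record with generated samples beyond its extremes, can be sketched as follows. The distributions below are illustrative stand-ins; the study uses nsRWG-driven HBV simulations and CAMELS-DE observations:

```python
import random

# Minimal sketch of the augmentation idea (illustrative, not the authors'
# pipeline): enrich an observed training set with synthetic events whose
# peak discharge exceeds a high quantile of the observations.
random.seed(0)

observed = [random.lognormvariate(3.0, 0.8) for _ in range(1000)]   # stand-in observed peaks
synthetic = [random.lognormvariate(3.5, 1.0) for _ in range(5000)]  # stand-in generated peaks

threshold = sorted(observed)[int(0.99 * len(observed))]  # ~99th percentile of observations
extremes = [q for q in synthetic if q > threshold]       # keep only rare, large events

augmented = observed + extremes
assert len(extremes) > 0
assert max(augmented) >= max(observed)
```

The augmented set then contains events rarer than the observed 99th percentile, which is what allows probing whether the LSTM's performance ceiling for extremes can be shifted.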

How to cite: Reis, P., Dolich, A., Dasgupta, A., Hassenjürgen, P., Vorogushyn, S., Nguyen, V. D., and Loritz, R.: Integration of Generated Weather Data into LSTM Training to Improve the Simulation of Extreme Flood Events, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-12499, https://doi.org/10.5194/egusphere-egu26-12499, 2026.

A.69
|
EGU26-9982
|
ECS
Min Kwan Choi, Yong Oh Lee, and Dongkyun Kim

While accurate parameter estimation in physically-based hydrological models is critical, applying supervised learning for this purpose presents inherent limitations. This is because supervised learning requires parameter ground truth as labels, yet obtaining spatially complete observations of these physical parameters in real-world basins is practically impossible. To address this challenge, this study proposes a "simulation-based inverse mapping framework" capable of reconstructing the spatial distribution of physical parameters solely from flow data, without relying on observed parameter ground truth. This approach utilizes a physically-based hydrological model as a data generator. The training dataset is constructed by filtering Sobol-sequence-generated parameter candidates; only realistic combinations that satisfy physical constraints—specifically the Budyko water balance and the negative correlation between NDVI and Curve Number (CN)—are selected. Furthermore, the Cross-Entropy Method (CEM) was employed to refine the training data, optimizing for both hydrological plausibility and predictive accuracy. The developed deep learning model is trained to take observed flow time series as input and inversely predict the basin's physical parameter fields (e.g., CN, hydraulic conductivity). When applied to the test period, the model demonstrated high flow reproduction performance with a satisfactory Nash–Sutcliffe Efficiency (NSE). In conclusion, this study demonstrates that by integrating physical modeling processes with the computational power of deep learning, it is possible to effectively estimate hydrological parameters and achieve reliable runoff analysis, even in the absence of parameter ground truth.
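The constraint-based filtering of parameter candidates can be sketched as below. This is a simplified illustration: the study draws candidates from Sobol sequences and applies Budyko water-balance and NDVI–CN constraints, which are reduced here to random sampling and a single hypothetical NDVI–CN plausibility rule:

```python
import random

# Sketch of filtering parameter candidates by physical plausibility
# (illustrative; parameter ranges and the rejection rule are hypothetical).
random.seed(1)

def sample_candidate():
    return {"CN": random.uniform(30, 95),        # Curve Number
            "K": random.uniform(0.1, 50.0),      # hydraulic conductivity (mm/h)
            "NDVI": random.uniform(0.1, 0.9)}

def physically_plausible(p):
    # stand-in for the NDVI-CN negative correlation: dense vegetation
    # (high NDVI) should imply a lower Curve Number
    return p["CN"] <= 100 - 60 * p["NDVI"]

candidates = [sample_candidate() for _ in range(10000)]
kept = [p for p in candidates if physically_plausible(p)]
assert all(physically_plausible(p) for p in kept)
assert 0 < len(kept) < len(candidates)  # the filter discards implausible combinations
```

The retained combinations then serve as labels for the inverse model, so that training pairs respect physical constraints even though no observed parameter ground truth exists.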

How to cite: Choi, M. K., Lee, Y. O., and Kim, D.: Simulation-based Inverse Mapping Framework for Runoff Prediction, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-9982, https://doi.org/10.5194/egusphere-egu26-9982, 2026.

A.70
|
EGU26-12266
|
ECS
Rafael Francisco and José Pedro Matos

Accurate prediction of streamflow is essential for sound water resources management but remains a complex task due to the dynamic nature of hydrological processes, imprecision in meteorological data, and measurement challenges. Recent advancements in deep learning have demonstrated the potential of data-driven models to extract and identify complex temporal dependencies in large hydro-meteorological datasets (e.g. [1-2]).

This work evaluates the ability of Temporal Fusion Transformers (TFTs) to predict daily streamflow across catchments in Mainland Portugal, using meteorological input data derived from ERA5-Land (reanalysis dataset) and geomorphological descriptors. TFTs are a relatively novel deep-learning architecture now being explored in hydrology (e.g., [3-4]). The architecture combines the well-tested Long Short-Term Memory (LSTM) network with transformers, potentially offering improved performance and partial explainability of predictions.

The methodology for the application to ungauged catchments relies on straightforward cross-validation with holdout samples. Although all considered catchments are monitored by gauging stations, streamflow observations at the various locations are selectively hidden from the model during calibration and validation, allowing a fully controlled emulation of ungauged conditions on the test subsets.
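The holdout logic for emulating ungauged conditions can be sketched as a catchment-wise cross-validation. The split code and catchment identifiers below are hypothetical:

```python
# Sketch of catchment-wise cross-validation used to emulate ungauged
# conditions: discharge at test catchments is hidden during training,
# even though all catchments are actually gauged.

def catchment_folds(catchment_ids, k):
    """Yield (train, test) catchment splits; each catchment is 'ungauged' once."""
    folds = [catchment_ids[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        train = [c for j, f in enumerate(folds) if j != i for c in f]
        yield train, test

ids = [f"PT{n:03d}" for n in range(12)]  # hypothetical catchment codes
splits = list(catchment_folds(ids, 4))
assert len(splits) == 4
for train, test in splits:
    assert not set(train) & set(test)          # no leakage between train and test
    assert set(train) | set(test) == set(ids)  # every catchment used
```

Splitting by catchment rather than by time is the key design choice: it ensures the model never sees any streamflow from a test catchment, so test skill reflects genuine spatial transfer.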

Model performance is benchmarked against calibrated Hydrologiska Byråns Vattenbalansavdelning (HBV) hydrological models. Results show that TFTs achieve comparable predictive skill in ungauged settings when compared to locally calibrated HBV counterparts, while providing probabilistic predictions with limited explainability.

The capability for specialization is also investigated. Indeed, it is shown that retraining a general-purpose “ungauged” TFT on a previously unknown time series, even with only a limited number of observations, can lead to substantial improvements in predictive skill.

The proposed framework offers a practical and scalable solution for streamflow estimation in data-scarce and ungauged catchments. By relying on globally available data and static catchment characteristics, the approach can be transferred to regions with limited measurement networks, reducing dependence on long-term observations. The probabilistic outputs further enhance decision-making by explicitly quantifying predictive uncertainty, a critical factor for risk-informed planning, supporting operational water resources management and early warning systems.

[1] Kratzert, F., Klotz, D., Brenner, C., Schulz, K., and Herrnegger, M.: Rainfall–runoff modelling using Long Short-Term Memory (LSTM) networks, Hydrol. Earth Syst. Sci., 22, 6005–6022, https://doi.org/10.5194/hess-22-6005-2018, 2018.

[2] Frame, J. M., Kratzert, F., Klotz, D., Gauch, M., Shalev, G., Gilon, O., Qualls, L. M., Gupta, H. V., and Nearing, G. S.: Deep learning rainfall–runoff predictions of extreme events, Hydrol. Earth Syst. Sci., 26, 3377–3392, https://doi.org/10.5194/hess-26-3377-2022, 2022. 

[3] Koya, S. R., Roy, T.: Temporal Fusion Transformers for streamflow prediction: Value of combining attention with recurrence. J. Hydrol., 637, 131301. https://doi.org/10.1016/j.jhydrol.2024.131301, 2024.

[4] He, M., Jiang, S., Ren, L., Cui, H., Qin, T., Du, S., Zhu, Y., Fang, X., Xu, C.: Streamflow prediction in ungauged catchments through use of catchment classification and deep learning. J. Hydrol., 639, 131638. https://doi.org/10.1016/j.jhydrol.2024.131638, 2024.

How to cite: Francisco, R. and Matos, J. P.: Probabilistic streamflow prediction in ungauged natural catchments with Temporal Fusion Transformers, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-12266, https://doi.org/10.5194/egusphere-egu26-12266, 2026.

A.71
|
EGU26-18516
|
ECS
Hörður Bragi Helgason and Bart Nijssen

Deep learning models based on Long Short-Term Memory (LSTM) networks are increasingly applied in rainfall-runoff modeling, yet their behavior in heavily glacierized catchments remains understudied. We train a regional, lumped LSTM model for 49 glacier- and snowmelt-influenced catchments in Iceland using the LamaH-Ice dataset, driven by a regional atmospheric reanalysis and informed by static catchment attributes. We assess model performance in these basins, examine whether cryospheric processes are learned implicitly from streamflow alone, evaluate the role of static attributes, and test whether multitask learning with cryospheric targets improves discharge predictions.

We find that the model predicts daily streamflow robustly across most basins, achieving high skill during the test period. Model skill remains largely unchanged when physiographic attributes are randomly shuffled or replaced by simple climate statistics, but declines noticeably when static attributes are omitted. Counterfactual experiments in which glacier fraction is set to zero show summer discharge reductions that increase with the degree of glacier coverage. Using linear probes, we show that the LSTM implicitly learns signals related to remotely sensed snow cover and glacier albedo when trained only on streamflow.
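A linear probe of the kind mentioned above can be sketched as follows, on synthetic stand-in data; the study probes LSTM hidden states against remotely sensed snow cover and glacier albedo:

```python
import numpy as np

# Minimal sketch of a linear probe (illustrative): fit a linear map from
# frozen hidden states to an external target (here a stand-in "snow cover"
# signal) and measure R^2. A high R^2 indicates the states implicitly
# encode the signal, even though it was never a training target.
rng = np.random.default_rng(0)

hidden = rng.normal(size=(500, 64))                        # stand-in hidden states (T x H)
w_true = rng.normal(size=64)
snow_cover = hidden @ w_true + 0.1 * rng.normal(size=500)  # target encoded in the states

W, *_ = np.linalg.lstsq(hidden, snow_cover, rcond=None)    # the linear probe
pred = hidden @ W
r2 = 1 - np.sum((snow_cover - pred) ** 2) / np.sum((snow_cover - snow_cover.mean()) ** 2)
assert r2 > 0.9  # the probe recovers the encoded signal
```

Because the probe is linear and the LSTM weights stay frozen, high probe skill cannot come from the probe itself learning the task; it must reflect information already present in the hidden states.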

We further explore a multitask learning configuration in which the LSTM is trained to predict both streamflow and satellite-derived snow cover. The linear probes reveal that this setup improves the model’s internal representation of cryospheric variables but does not improve discharge predictions compared to a single-task streamflow model.

Overall, we demonstrate that LSTM-based hydrological models can simulate streamflow skillfully in glacierized catchments, with static catchment attributes supporting physical interpretation of model behavior. We further show that these models can internalize physically meaningful cryospheric information without explicit supervision, while highlighting limitations of multitask learning using remote sensing observations for improving streamflow predictions in glacierized catchments.

How to cite: Helgason, H. B. and Nijssen, B.: Hydrological modeling with LSTMs in glacier and snowmelt fed catchments: the role of catchment attributes and multitask learning, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-18516, https://doi.org/10.5194/egusphere-egu26-18516, 2026.

A.72
|
EGU26-20929
|
ECS
Philipp Tanzeglock and Bora Shehu

The catastrophic July 2021 flood in the Ahr Valley (Rhineland-Palatinate, Germany) highlights the urgent need to improve our understanding and modelling of rainfall-induced megafloods. Conventional conceptual hydrological models often fail to accurately simulate flood peaks during such extreme events. Owing to their very rare occurrence, megafloods are typically absent from calibration periods, as available discharge observations are too short in time. In contrast, the spatial coverage of discharge observations is steadily increasing. Hydrologically and physiographically similar catchments may therefore provide valuable information on flood response behavior that has not yet been observed in the catchment of interest. In this study, we investigate whether spatial information can compensate for limited temporal observations by applying a long short-term memory (LSTM) neural network within the Neural Hydrology framework (Kratzert et al., 2021), which is capable of learning patterns from large datasets and transferring them to similar, yet distinct, hydrological settings.

For this purpose, we use a large dataset of catchments across Central Europe and Germany with observed discharge and meteorological data from 1970 onwards to model hourly discharge in the Ahr Valley. A series of experiments is designed using different combinations of temporal coverage and sets of physiographically similar catchments to evaluate their ability to reproduce flood behavior in the Ahr Valley. The methodological framework consists of two steps: (i) training the Neural Hydrology model on a set of similar catchments (excluding the Ahr catchment) using split-sample validation, and (ii) validating the trained models for extreme flood events, including the 2021 megaflood, at several Ahr sub-catchments.

By systematically comparing different configurations of spatial and temporal information, we address the following questions: Can time be successfully traded for space when simulating the 2021 Ahr megaflood? How can hydrologically similar catchments be identified most effectively? And can neural hydrology outperform the conventional conceptual models used operationally for the Ahr event (LARSIM and HBV-Light)?
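One conceivable way to identify physiographically similar donor catchments, ranking candidates by distance in normalized attribute space, can be sketched as below. The attributes and values are hypothetical, and the study compares several selection strategies rather than committing to this one:

```python
import math

# Sketch of attribute-based donor selection (illustrative): rank candidate
# catchments by Euclidean distance to the target in normalized attribute space.

def most_similar(target, candidates, attrs, k=2):
    def dist(a, b):
        return math.sqrt(sum((a[x] - b[x]) ** 2 for x in attrs))
    return sorted(candidates, key=lambda c: dist(c, target))[:k]

# hypothetical normalized attributes: area, mean slope, forest fraction
ahr = {"id": "Ahr", "area": 0.3, "slope": 0.6, "forest": 0.5}
pool = [
    {"id": "A", "area": 0.31, "slope": 0.58, "forest": 0.52},
    {"id": "B", "area": 0.90, "slope": 0.10, "forest": 0.20},
    {"id": "C", "area": 0.28, "slope": 0.65, "forest": 0.45},
]
donors = most_similar(ahr, pool, ["area", "slope", "forest"])
assert [d["id"] for d in donors] == ["A", "C"]
```

The selected donors would then form the training set in step (i), with the target catchment held out entirely, mirroring the space-for-time trade the study evaluates.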

How to cite: Tanzeglock, P. and Shehu, B.: Can LSTM Neural Hydrology help us trade space for time in rainfall-induced megafloods?, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-20929, https://doi.org/10.5194/egusphere-egu26-20929, 2026.

A.73
|
EGU26-19382
|
ECS
Sarth Dubey, Shaonli Mishra, and Udit Bhatia

River-fed urban flooding is highly sensitive to boundary inflow hydrographs, yet many cities lack streamflow gauges at the location where the river enters the city. In such settings, inundation is typically governed by a coupled chain of processes: (a) routing from the nearest upstream gauge, (b) rainfall-driven lateral inflows from the intervening catchment, and (c) shallow-water dynamics within the city. Recent advances have progressed independently in deep learning and differentiable or hybrid formulations for each component, but this raises a central question: when these models are coupled, do errors propagate transparently, or do they compensate in ways that appear accurate while remaining physically biased? A key gap is demonstrating whether end-to-end coupling improves urban flood predictions or instead amplifies uncertainty and bias across modules.

We develop a hybrid framework that couples these three modules and supports both piecewise (module-wise) training and joint end-to-end learning, enabling explicit diagnosis of error propagation. A synthetic training dataset is generated using physics-based flood simulations to provide consistent supervision for runoff generation, routing behaviour, and inundation response. Evaluation then focuses on historical flood events in Surat, Gujarat, using remote-sensing inundation extent maps as an event-scale observational benchmark. The experimental design is structured to isolate the marginal effect of coupling by tracking how uncertainties in lateral inflow and routing translate into boundary hydrograph bias and, ultimately, mismatch in predicted inundation extent.

The analysis is framed around reliability rather than raw accuracy: it investigates when end-to-end coupling reduces boundary-condition uncertainty versus when it enables compensating errors that mask upstream bias at the urban scale. By comparing independently assessed sub-modules against the coupled system, the study aims to clarify how errors accumulate across the hydrology-to-inundation pipeline and under which hydrologic regimes. This provides a pathway toward reliable and rapid end-to-end hybrid systems for river-fed urban flood modeling.

How to cite: Dubey, S., Mishra, S., and Bhatia, U.: Coupling Rainfall-Runoff and Shallow-Water Hybrid Models: A Reliability Test of Piecewise Versus End-to-End Learning, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-19382, https://doi.org/10.5194/egusphere-egu26-19382, 2026.

A.74
|
EGU26-302
Yogesh Bhattarai, Ganesh R Ghimire, and Sanjib Sharma

Floods are among the most frequent and destructive natural disasters. Accurate predictions and timely warnings are critical for mitigating flood risk. However, flood prediction remains challenging due to the limited availability of high-resolution data for model calibration and validation, high computational demands for near-real-time simulations, and large uncertainties surrounding the sophisticated flood inundation modeling chain. This study focuses on improving riverine flood inundation predictions by leveraging artificial intelligence and machine learning algorithms to fuse data and models, accelerate computation, and automate the end-to-end predictive workflow. We develop machine-learning-based postprocessors to correct systematic biases in hydrodynamic model outputs by learning from historical prediction errors. We also train and evaluate a hybrid Convolutional Neural Network architecture coupled with a transformer to produce high-resolution inundation maps, combining local spatial feature extraction with long-range attention mechanisms to capture watershed-scale connectivity. Finally, we construct a surrogate of the fully physics-based, GPU-enabled hydrodynamic model Two-dimensional Runoff Inundation Toolkit for Operational Needs (TRITON) to generate rapid inundation simulations. Our results highlight strong tradeoffs between model complexity (standalone, hybrid, and surrogate modeling approaches), the size and quality of training datasets, available computational resources, and overall prediction accuracy, showing a pathway toward real-time flood inundation forecasting. Improved predictions of flood inundation can provide actionable insights to enhance emergency management, reduce disaster risk, and build community resilience.

How to cite: Bhattarai, Y., Ghimire, G. R., and Sharma, S.: Deep learning–enhanced emulation of hydrodynamic models for improved flood inundation prediction, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-302, https://doi.org/10.5194/egusphere-egu26-302, 2026.

A.75
|
EGU26-2703
|
ECS
Bohan Huang, Wentao Li, Zhu Liu, and Qingyun Duan

Floods pose substantial risks to human society and ecosystems, making accurate flood forecasting essential for disaster mitigation and water resources management. However, reliable prediction remains challenging especially in data-scarce regions, where process-based models rely heavily on site-specific calibration and exhibit limited transferability. Here we present a unified global flood forecasting framework that combines systematic catchment attribute screening with a generative deep-learning-based probabilistic hydrological forecasting model, HydroForecast. Through importance ranking and stepwise forward feature selection, the framework first identifies a representative and non-redundant set of catchment attributes. Leveraging these attributes together with meteorological forcings, we construct the HydroForecast model, which directly learns the underlying discharge distribution and generates ensemble predictions without relying on restrictive parametric prior assumptions. Evaluated across more than 3,000 basins worldwide, HydroForecast consistently outperforms a Skewed Laplace–based LSTM benchmark, delivering more accurate flood peak prediction, improved event detection, and reliable uncertainty quantification. Additional analyses demonstrate that our model maintains stable performance in reservoir-regulated basins, while exhibiting pronounced performance differences across climate regimes that reflect the varying degrees of predictive difficulty associated with distinct hydro-climatic conditions. Together, these results highlight the strong potential and reliability of HydroForecast for large-sample flood forecasting and for improving predictive capability in ungauged regions.
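The stepwise forward feature selection mentioned above can be sketched as a greedy loop. The scoring function below is a toy stand-in; the study ranks real catchment attributes against model skill:

```python
# Sketch of stepwise forward feature selection (illustrative): start from an
# empty set and repeatedly add the attribute that most improves a validation
# score, stopping when no remaining attribute helps.

def forward_select(features, score):
    selected = []
    best = score(selected)
    while True:
        candidates = [f for f in features if f not in selected]
        if not candidates:
            break
        top_score, top_f = max((score(selected + [f]), f) for f in candidates)
        if top_score <= best:
            break  # no candidate improves the score: stop
        best, selected = top_score, selected + [top_f]
    return selected

# toy score: 'area' and 'aridity' are informative, 'noise' is not;
# a small per-feature cost encourages a non-redundant subset
def toy_score(subset):
    gains = {"area": 0.3, "aridity": 0.2, "noise": 0.0}
    return sum(gains[f] for f in subset) - 0.05 * len(subset)

assert set(forward_select(["area", "aridity", "noise"], toy_score)) == {"area", "aridity"}
```

The per-feature cost in the toy score mirrors the goal of a non-redundant attribute set: an attribute enters only if its marginal contribution exceeds the cost of adding it.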

How to cite: Huang, B., Li, W., Liu, Z., and Duan, Q.: HydroForecast: A Deep Learning-Based Probabilistic Flood Forecasting Model, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-2703, https://doi.org/10.5194/egusphere-egu26-2703, 2026.

A.76
|
EGU26-7423
|
ECS
Robert Keppler, Julian Koch, and Rasmus Fensholt

Our study explores the use of Physics-Informed Neural Operators (PINOs) for solving the two-dimensional shallow water equations (2D SWE) in the context of flood modeling. In contrast to Physics-Informed Neural Networks (PINNs), which require retraining for each new set of initial or boundary conditions, PINOs learn the underlying solution operator, enabling rapid inference across a wide range of conditions without retraining.
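For reference, the two-dimensional shallow water equations that the operator learns to solve can be written in their standard conservative form (notation chosen here for illustration, not taken from the abstract):

```latex
\begin{aligned}
\frac{\partial h}{\partial t} + \frac{\partial (hu)}{\partial x} + \frac{\partial (hv)}{\partial y} &= 0,\\
\frac{\partial (hu)}{\partial t} + \frac{\partial}{\partial x}\!\left(hu^2 + \tfrac{1}{2}gh^2\right) + \frac{\partial (huv)}{\partial y} &= -gh\,\frac{\partial z_b}{\partial x} - \tau_x,\\
\frac{\partial (hv)}{\partial t} + \frac{\partial (huv)}{\partial x} + \frac{\partial}{\partial y}\!\left(hv^2 + \tfrac{1}{2}gh^2\right) &= -gh\,\frac{\partial z_b}{\partial y} - \tau_y,
\end{aligned}
```

where \(h\) is the water depth, \((u, v)\) the depth-averaged velocities, \(z_b\) the bed elevation, \(g\) the gravitational acceleration, and \(\tau_x, \tau_y\) bed-friction terms (zero in the frictionless experiments). The PINO learns the map from initial and boundary conditions to solutions of this system, rather than a single solution instance.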

We assess the performance of PINOs through a sequence of numerical experiments with increasing physical complexity, including a radial dam-break scenario, constant boundary conditions with and without friction, and time-dependent boundary conditions. Existing PINO frameworks were adapted and extended to accommodate these experimental settings.

The results demonstrate that PINOs can accurately capture key flood-relevant dynamics, particularly water depth, while achieving substantial computational speed-ups of up to four orders of magnitude compared to conventional numerical solvers. Relative test errors for water depth were as low as 0.3% for the radial dam-break case, increasing to 10.9% in the presence of bottom topography, 7.3% with friction, and 9.0% under time-dependent boundary conditions. Larger errors were observed for the velocity components.

The combination of competitive accuracy and significant computational acceleration highlights the potential of PINOs for time-critical applications such as flood forecasting. Overall, this work positions PINOs as a promising alternative to traditional numerical solvers for the 2D SWE, offering an effective balance between computational efficiency and solution fidelity. Future research will focus on improving predictive accuracy, expanding the diversity of training functions, and enhancing applicability to real-world flood scenarios.

How to cite: Keppler, R., Koch, J., and Fensholt, R.: Learning 2D Shallow Water Equations with Physics-Informed Neural Operator Networks, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-7423, https://doi.org/10.5194/egusphere-egu26-7423, 2026.

A.78
|
EGU26-22456
|
ECS
Hancheng Ren, Gang Zhao, Louise Slater, Dai Yamazaki, and Bo Pang

Accurate and rapid river forecasting is essential for global water cycle management but faces a persistent dichotomy: physics-based models offer structural consistency but are computationally intensive and difficult to calibrate efficiently, while data-driven approaches offer efficiency but often lack physical interpretability and struggle in data-scarce regions. To bridge this gap, we introduce GraphRiverCast (GRC), a topology-informed AI foundation model that forecasts multivariate river hydrodynamics at a global scale. Unlike conventional raster-based AI approaches, GRC explicitly encodes river network topology into a graph neural architecture. This design underpins a novel "pretrain-finetune" paradigm: the model first learns generalizable river hydrodynamic mechanisms from global physics-based simulations (pre-training), and then adapts to specific basins using sparse in-situ observations (fine-tuning). Our results demonstrate that topological awareness is essential for maintaining predictive accuracy and stability in "ColdStart" mode where initial states are unavailable. Furthermore, we show that fine-tuning with local data propagates observational constraints through the network topology, systematically improving performance even in ungauged river reaches. GRC thus establishes a scalable, physics-aligned framework that effectively synthesizes global hydrodynamic knowledge with local data applicability.
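The role of topology can be illustrated with a single message-passing step on a toy river graph. This is not the GRC architecture; the network, weights, and state values below are hypothetical:

```python
import numpy as np

# Illustrative sketch of topology-aware message passing on a river network:
# each reach aggregates the states of its upstream neighbours before
# updating its own state, so information flows along the drainage network.

# hypothetical network: reaches 0 and 1 drain into 2, 2 drains into 3
upstream = {0: [], 1: [], 2: [0, 1], 3: [2]}
state = np.array([[1.0], [2.0], [0.0], [0.0]])  # one feature per reach

W_self, W_up = 0.5, 1.0  # scalars standing in for learned weight matrices

def message_pass(state):
    new = np.zeros_like(state)
    for reach, ups in upstream.items():
        agg = sum(state[u] for u in ups) if ups else 0.0
        new[reach] = W_self * state[reach] + W_up * agg
    return new

state = message_pass(state)
# after one step, information from headwater reaches 0 and 1 reaches reach 2
assert state[2, 0] == 3.0
```

Stacking such steps lets observational constraints at a gauged reach propagate through the graph, which is the mechanism behind the reported improvements in ungauged reaches after fine-tuning.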

How to cite: Ren, H., Zhao, G., Slater, L., Yamazaki, D., and Pang, B.: GraphRiverCast: A Topology-Informed Foundation Model for Global River Hydrodynamics, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-22456, https://doi.org/10.5194/egusphere-egu26-22456, 2026.

A.79
|
EGU26-12011
Minglei Hou and Jiahua Wei

The Upper Yellow River serves as the basin's primary water conservation zone and multi-year regulation reservoir. However, the region exhibits frequent alternations between persistent wet and dry cycles, along with abrupt regime shifts, which significantly amplify the uncertainty and complexity of water resource regulation. Under these complex conditions, traditional hydrological models often suffer from deteriorating forecast accuracy and limited lead times, failing to support precise and adaptive decision-making. To address these challenges, this study proposes a Physics-AI coupled framework that integrates Knowledge Graphs (KG) with Large Language Models (LLMs) to create a closed loop from perception to decision-making. First, a multimodal KG was constructed to standardize heterogeneous data and, more critically, to encode hydrological evolution rules as logical constraints for physical reasoning. Driven by this cognitive foundation, we developed a multi-scale forecasting system: the Parallel LSTM-and-Sequence-GPT (PLSG) for daily-scale medium-term forecasting, and the physics-informed Hydro-LSTM for monthly-scale long-term runoff reconstruction. To bridge the gap between forecasting and operation, accurate runoff inputs are integrated into a Mixture of Experts (MoE) framework. Here, autonomous agents dynamically configure scheduling workflows to execute multi-objective optimization, ensuring adaptability across diverse hydrological scenarios. Validation results show that the PLSG model improved 15-day forecast accuracy by 31.3% against baselines, while Hydro-LSTM achieved an NSE of 0.65–0.857. This framework not only enhances forecast resilience but also enables autonomous multi-objective optimization with transparent decision-making pathways, providing a robust and interpretable tool for complex water system management.

How to cite: Hou, M. and Wei, J.: Coupling Knowledge Graphs with Large Language Models for Integrated Runoff Forecasting and Reservoir Operation: A Case Study of the Upper Yellow River, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-12011, https://doi.org/10.5194/egusphere-egu26-12011, 2026.

A.80
|
EGU26-7297
|
ECS
Giulia Blandini, Simone Gabellani, Francesco Avanzi, Mirko D'Andrea, Lorenzo Campo, Francesco Silvestro, Marco Falzacappa, Fabio Santamaria, and Luca Ferraris

At the national level in Italy the FloodPROOFs hydrological forecasting chain is operational for flood forecasting, monitoring and emergency management. It is based on the physically based and spatially distributed hydrological model Continuum. While highly reliable, such modelling chains are often computationally demanding, posing limitations for rapid simulations and large-scale operational applications. 

To investigate whether artificial intelligence can support flood forecasting and complement FloodPROOFs with comparable or better skill while significantly reducing computational costs, this study presents an AI-based framework for river water-level emulation designed for operational flood monitoring. The framework integrates a limited yet heterogeneous set of data sources typically available in real-time contexts, including topographic information derived from Digital Elevation Models, meteorological forcings from in situ measurements (precipitation and air temperature), and observed river water levels provided by the National System of Civil Protection and shared on the myDEWETRA platform.

 Convolutional Neural Networks are employed to capture the nonlinear spatial and temporal interactions between terrain characteristics, atmospheric forcing, and hydrological response. The model is trained and fine-tuned using observed water-level time series, enabling the direct simulation of river stage dynamics and the detection of critical threshold exceedances relevant for civil protection warning procedures. 

The proposed framework operates at high spatial resolution over the Italian peninsula while maintaining low computational requirements, making it suitable for near-real-time applications at the centre of the work of the Italian Civil Protection. Its demonstrated generalization capability allows deployment across multiple spatial scales, from individual catchments to regional and national domains. Overall, the results highlight the potential of AI-driven emulators as complementary tools to traditional hydrological modelling chains, enhancing the efficiency and robustness of operational flood forecasting and decision-support systems for civil protection services. 

How to cite: Blandini, G., Gabellani, S., Avanzi, F., D'Andrea, M., Campo, L., Silvestro, F., Falzacappa, M., Santamaria, F., and Ferraris, L.: A deep learning model of river water levels, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-7297, https://doi.org/10.5194/egusphere-egu26-7297, 2026.

A.81
|
EGU26-19024
|
ECS
|
Virtual presentation
Ruben Cartuyvels, Karim Douch, and Diego Fernandez Prieto

Data-driven models are emerging as complementary approaches to numerical methods across the Earth sciences, offering the potential to be computationally efficient and free from physical parameterization bias. We present a neural network trained only on observations, integrating spatially and temporally sparse data from various altimeter instruments, to predict a dense reconstruction of water levels for the Amazon River network. Dense estimates for water level can enable better parameterizations of hydraulic models as well as accurate modeling of discharge in small- to medium-sized catchments.

The Amazon basin hosts the largest rainforest in the world, making the monitoring of its rivers particularly important. Historical records of water level rely on in-situ flow gauges maintained by basin authorities (e.g. ANA in Brazil), offering temporally dense but geographically sparse observations. Since the 1990s, satellite altimetry has provided global yet sparse observations in ungauged areas. The recent SWOT mission introduces unprecedented spatial density thanks to its wide-swath InSAR sensor but lacks historical depth. To synthesize these disparate sources into a homogeneous product, we train an attention-based graph neural network for spatial and temporal densification via masked reconstruction. The model is trained to predict SWOT measurements conditioned on classical altimetry for 2023-2025, so it learns to infer the denser measurements taking only classical altimetry as input. River topology information from the SWORD database determines the decoding order and sparsifies attention interactions in the model architecture, with the aim of learning spatiotemporal dynamics in a physically consistent manner.
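The masked-reconstruction setup, predicting dense targets from the sparse observations that remain visible, can be sketched as the following data-assembly step (toy dimensions; reach indexing is hypothetical):

```python
import random

# Sketch of masked-reconstruction training data assembly (illustrative):
# dense SWOT-like water levels are hidden at training time, and the model
# must predict them from the sparse classical-altimetry samples that remain.
random.seed(0)

n_reaches = 50
water_level = [100 + random.gauss(0, 1) for _ in range(n_reaches)]  # dense "SWOT" levels

# classical altimetry only observes a sparse subset of reaches
observed_idx = sorted(random.sample(range(n_reaches), 8))
inputs = {i: water_level[i] for i in observed_idx}          # visible model input
targets = {i: water_level[i] for i in range(n_reaches)
           if i not in observed_idx}                        # masked reconstruction targets

assert len(inputs) + len(targets) == n_reaches
assert not set(inputs) & set(targets)   # inputs never leak into targets
```

At inference time the same model receives only classical altimetry, so the training mask directly mirrors the deployment condition for the pre-SWOT era (2000 onwards).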

We empirically validate the model on spatially and temporally held-out evaluation sets that include in-situ measurements from ANA gauges and benchmark it against an existing hybrid statistical-physical approach. We predict a first version of a reconstruction consisting of daily water level estimates for every SWOT reach in the Amazon basin between 2000 and 2025. This study contributes to the development of neural networks that unify sparse, non-overlapping sensor data without relying on physical approximations. In the future we will integrate complementary observations such as river width derived from imagery or SAR, and extend the framework to other major river basins globally.

How to cite: Cartuyvels, R., Douch, K., and Fernandez Prieto, D.: Data-driven reconstruction of Amazon water levels with deep learning leveraging river topology, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-19024, https://doi.org/10.5194/egusphere-egu26-19024, 2026.

A.82 | EGU26-20392 | ECS
Anandharuban Panchanathan, Mohamad Javad Alizadeh, Indiana Olbert, Mehdi Moayeri, and Sara Jamali

The Caspian Sea, the world’s largest enclosed water body, exhibits significant level fluctuations driven by complex hydroclimatic processes across its vast watershed. Projecting future sea level variations remains challenging due to non-linear interactions, non-stationary climate dynamics, and the basin’s response to anthropogenic climate change. This study develops a novel spatially explicit deep learning framework to project Caspian Sea level variations under multiple Shared Socioeconomic Pathway (SSP) scenarios.

Our methodology integrates gridded climate data from CMIP6 models with a hybrid CNN-Transformer architecture that explicitly accounts for: (1) spatial heterogeneity across major sub-basins (Volga, Kura, Ural, Terek watersheds), (2) temporal non-stationarity in evaporation, precipitation, and river discharge patterns, and (3) dynamic land-water boundaries in shallow coastal zones. The model employs multi-head attention mechanisms to capture long-range dependencies in climate teleconnections while maintaining physical consistency through a water balance constraint layer.

A critical innovation is our treatment of non-stationary processes in which future evaporation rates may exceed historical ranges. We implement adaptive normalization and time-varying parameter modules that learn evolving climate patterns without relying solely on historical statistics. For regions projected to desiccate under extreme scenarios, we incorporate dynamic masking that temporarily deactivates precipitation-evaporation fluxes in exposed grid cells.
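As a rough illustration of normalization under non-stationarity, the sketch below standardizes a trending series with exponentially weighted running statistics, so values beyond the historical range still map to moderate z-scores. This simple scheme and its `halflife` parameter are our own stand-ins; the abstract's adaptive modules are learned:

```python
import numpy as np

def adaptive_normalize(x, halflife=120):
    """Normalize a series with exponentially weighted running mean/variance,
    so a trending signal is not frozen to early-record statistics.
    (Illustrative only; not the study's learned module.)"""
    alpha = 1 - 0.5 ** (1 / halflife)
    mean = np.empty_like(x, dtype=float)
    var = np.empty_like(x, dtype=float)
    m, v = x[0], 1.0
    for t, xt in enumerate(x):
        m = (1 - alpha) * m + alpha * xt            # running mean update
        v = (1 - alpha) * v + alpha * (xt - m) ** 2  # running variance update
        mean[t], var[t] = m, v
    return (x - mean) / np.sqrt(var + 1e-8)

# Evaporation-like series with an upward trend exceeding its early range.
t = np.arange(1000)
series = 10 + 0.01 * t + np.sin(2 * np.pi * t / 365)
z = adaptive_normalize(series)
print(z[-5:].round(2))
```

A static normalization fitted on the first years of such a series would push late values off-scale; the running statistics keep them bounded.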

Spatial analysis reveals differential impacts across sub-basins, with the northern shallow zones showing heightened sensitivity. The attention weights highlight the dominant role of Volga discharge variability and Caspian surface evaporation in controlling decadal-scale level changes.

This physics-informed deep learning approach provides computationally efficient, probabilistic projections while maintaining interpretability through attention visualization and uncertainty quantification. The framework is transferable to other enclosed basins facing similar non-stationary climate challenges.

How to cite: Panchanathan, A., Alizadeh, M. J., Olbert, I., Moayeri, M., and Jamali, S.: Deep Learning-Based Projection of Caspian Sea Level Variations under Climate Change Scenarios: A Spatially-Explicit Non-Stationary Approach, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-20392, https://doi.org/10.5194/egusphere-egu26-20392, 2026.

A.83 | EGU26-17800 | ECS
Multi-Resolution Optical Satellite–Based Water Body Detection Using CNN
(withdrawn)
Shinhyeon Cho, Wanyub Kim, Doyoung Kim, Yuju Chun, and Minha Choi
A.84 | EGU26-2049
Wen-Cheng Liu, Wei-Che Huang, Yen-Ting Yu, Yi-Hong Li, and Bai-Jun Wang

Taiwan is situated in a subtropical region and is surrounded by the ocean, bringing abundant rainfall and frequent typhoons, so flood-control infrastructure plays a critical role in disaster mitigation. In addition, Taiwan lies within an active seismic zone, where hydraulic structures such as levees and dams are susceptible to earthquake-induced cracking, potentially impairing flood protection and water-supply functions and increasing overall risk. This study develops a crack-detection system for hydraulic structures using the Mask R-CNN deep learning model. The network was trained with 300 images of hydraulic structures and subsequently evaluated on 50 additional images. The proposed system achieved an accuracy of 80%, precision of 81%, recall of 95%, and an F1-score of 88%. Furthermore, the effects of transfer learning on model performance were investigated. The results indicate that two iterations of transfer learning led to notable improvements across all evaluation metrics, confirming that deep learning approaches can provide accurate and efficient crack detection for hydraulic infrastructure.
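The reported metrics can be cross-checked, since the F1-score is the harmonic mean of precision and recall:

```python
# F1 = 2PR / (P + R): harmonic mean of precision (P) and recall (R).
def f1_score(precision, recall):
    return 2 * precision * recall / (precision + recall)

f1 = f1_score(0.81, 0.95)
print(round(f1, 3))
```

With precision 0.81 and recall 0.95 this gives roughly 0.874, consistent with the reported 88% once the rounding of the input percentages is taken into account.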

How to cite: Liu, W.-C., Huang, W.-C., Yu, Y.-T., Li, Y.-H., and Wang, B.-J.: Deep Learning for Crack Detection in Hydraulic Structures, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-2049, https://doi.org/10.5194/egusphere-egu26-2049, 2026.

Posters virtual: Thu, 7 May, 14:00–18:00 | vPoster spot A

The posters scheduled for virtual presentation are given in a hybrid format for on-site presentation, followed by virtual discussion on Zoom. Attendees are asked to meet the authors during the scheduled presentation & discussion time for live video chats; onsite attendees are invited to visit the virtual poster sessions at the vPoster spots (equal to PICO spots). If authors uploaded their presentation files, these files are also linked from the abstracts below. The button to access the Zoom meeting appears 15 minutes before the time block starts.
Discussion time: Thu, 7 May, 16:15–18:00
Display time: Thu, 7 May, 14:00–18:00

EGU26-2567 | ECS | Posters virtual | VPS10

WetFramework: A Deep Learning Framework for Coastal Wetland Boundary Extraction and Inundation Frequency Estimation 

Jintao Liang
Thu, 07 May, 14:12–14:15 (CEST)   vPoster spot A

Coastal wetlands, characterized by their geomorphological sensitivity and tidal dependence, exhibit pronounced vulnerability under global warming. While the persistent threat of sea-level rise to coastal wetlands has been extensively documented at the macroscale, there remains a lack of systematic quantitative frameworks for mapping these trends to the microscale dynamics of wetland evolution. To address this gap, this paper proposes WetFramework, a novel approach for joint modeling of spatial structure and temporal variation in wetlands. (1) In the encoder, Transformer and Mamba modules are integrated to enhance multiscale feature representation through the synergy of global attention and implicit sequence modeling, with a Token-Driven Attention Mechanism (TDAM) designed to facilitate deep interactions between features. (2) In the decoder, a Wavelet-Enhanced Reconstruction Module (WERM) is introduced to improve spatial structure modeling via wavelet transforms, thereby optimizing boundary delineation and fine detail representation for precise mapping of coastal wetland extents. (3) To capture periodic inundation characteristics, a Fourier-Based Inundation Estimation Module (FBIEM) is further proposed, incorporating tidal-height observations to enable unsupervised modeling of pixel-level hydrological responses and quantitative expression of inundation rhythms. Extensive experiments conducted in four representative coastal regions—Yancheng and Dongying (China), Mont-Saint-Michel Bay (France), and San Francisco Bay (USA)—demonstrate that the proposed framework outperforms state-of-the-art models across multiple evaluation metrics and exhibits robust cross-regional generalization and dynamic modeling capabilities. This study provides an effective paradigm for intelligent remote sensing-based wetland identification and long-term hydrological modeling, and offers key hydrological information to support inundation-dynamics monitoring and management decision-making.
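The idea behind Fourier-based inundation estimation can be sketched for a single pixel: threshold a tidal series at the pixel elevation, take the wet fraction as the inundation frequency, and read the dominant rhythm off the spectrum. The synthetic tide and elevation below are illustrative; FBIEM itself is a learned, unsupervised module:

```python
import numpy as np

# Toy pixel time series: hourly tidal height; the pixel is inundated when
# the tide exceeds its elevation. The inundation frequency is the fraction
# of time wet, and the tidal rhythm shows up as a spectral peak.
hours = np.arange(24 * 30)                       # 30 days, hourly
tide = np.sin(2 * np.pi * hours / 12.42)         # semidiurnal (M2-like) tide
pixel_elevation = 0.3                            # hypothetical elevation
wet = (tide > pixel_elevation).astype(float)

inundation_frequency = wet.mean()

spectrum = np.abs(np.fft.rfft(wet - wet.mean()))  # remove DC before FFT
freqs = np.fft.rfftfreq(len(wet), d=1.0)          # cycles per hour
dominant_period = 1.0 / freqs[spectrum.argmax()]
print(f"wet fraction {inundation_frequency:.2f}, period {dominant_period:.1f} h")
```

The spectral peak recovers the roughly 12.4-hour tidal period, which is the kind of inundation rhythm the module is designed to express quantitatively.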

How to cite: Liang, J.: WetFramework: A Deep Learning Framework for Coastal Wetland Boundary Extraction and Inundation Frequency Estimation, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-2567, https://doi.org/10.5194/egusphere-egu26-2567, 2026.

EGU26-15829 | ECS | Posters virtual | VPS10

Uncertainty-Aware Flood Prediction Using Deep Neural Networks Across Multiple Watersheds 

Mostafa Saberian, Vidya Samadi, Thorsten Wagener, and Ioana Popescu
Thu, 07 May, 14:15–14:18 (CEST)   vPoster spot A

Effectively characterizing uncertainty and error in flood prediction is essential for informed decision-making. This study combines advanced deep neural network architectures, namely Neural Hierarchical Interpolation for Time Series Forecasting (N-HiTS) and Long Short-Term Memory (LSTM), with multiple uncertainty quantification frameworks to evaluate flood forecasts across several watersheds in the southeastern United States. Bayesian inference, Monte Carlo–based methods, and quantile regression are applied to estimate predictive uncertainty. The comparative analysis examines how different uncertainty approaches perform across a range of flood magnitudes, highlighting their respective advantages and limitations at multiple scales. Results indicate that N-HiTS generally yields narrower and more reliable uncertainty bounds than LSTM. The findings further demonstrate that prior specification in Markov chain Monte Carlo (MCMC) sampling strongly influences uncertainty estimates and requires careful calibration. While Monte Carlo dropout, an approximate Bayesian technique, primarily captures uncertainty near flood peaks, MCMC offers a more complete characterization across the full hydrograph. In addition, this study investigates multi-site training to evaluate model adaptability under diverse hydrological regimes. Collectively, these results advance the integration of deep neural networks and uncertainty quantification to enhance flood modeling capabilities and risk management.
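Monte Carlo dropout, mentioned above, keeps dropout active at inference and treats repeated stochastic forward passes as samples from the predictive distribution. A minimal numpy sketch (random weights and hypothetical layer sizes, not the study's networks):

```python
import numpy as np

rng = np.random.default_rng(42)

# A tiny fixed MLP (weights random for illustration); dropout stays ON at
# inference and the forward pass is repeated to sample predictions.
W1 = rng.normal(0, 0.5, (8, 32))
W2 = rng.normal(0, 0.5, (32, 1))

def forward(x, p_drop=0.2):
    h = np.maximum(x @ W1, 0.0)           # ReLU hidden layer
    keep = rng.random(h.shape) > p_drop   # stochastic dropout mask
    h = h * keep / (1.0 - p_drop)         # inverted-dropout scaling
    return h @ W2

x = rng.normal(0, 1, (1, 8))              # one input sample
samples = np.concatenate([forward(x) for _ in range(200)], axis=0)
mean, std = samples.mean(), samples.std()
print(f"predictive mean {mean:.3f} +/- {std:.3f}")
```

The sample standard deviation serves as the uncertainty estimate; as the abstract notes, this approximation tends to be most informative near the largest predicted values.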

How to cite: Saberian, M., Samadi, V., Wagener, T., and Popescu, I.: Uncertainty-Aware Flood Prediction Using Deep Neural Networks Across Multiple Watersheds, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-15829, https://doi.org/10.5194/egusphere-egu26-15829, 2026.

EGU26-19676 | ECS | Posters virtual | VPS10

Modeling Flood Risk in Kalaa Sraghna Region in Morocco Using Explainable Artificial Intelligence Techniques 

Hamza Legsabi, Soufiane Tiai, Sidi Mohamed Boussabou, Nora Najaoui, Bouabid El Mansouri, and Lamia Erraioui
Thu, 07 May, 14:18–14:21 (CEST)   vPoster spot A

Predicting flood risk is complex: flood generation and intensity are shaped by intricate interactions between hydrological dynamics, meteorological variability, the overarching influence of climate change, and land-use change. This study explores flood risk within the watershed of the Tassaout River in central Morocco. Three machine learning algorithms were chosen to evaluate flood risk: Multi-Layer Perceptron Artificial Neural Networks (MLP-ANN), Random Forest (RF), and Support Vector Machine (SVM). The models are trained on 11 factors derived from remote sensing data. From the ALOS digital elevation model, 8 factors are derived: Elevation, Slope, Aspect, Plan Curvature, Profile Curvature, Stream Power Index (SPI), Topographic Wetness Index (TWI), and Surface Roughness. From Landsat 9 imagery, three flood susceptibility factors are extracted: Normalized Difference Vegetation Index (NDVI), Normalized Difference Water Index (NDWI), and Land Surface Temperature (LST). The predictive performance of each model was assessed using standard classification metrics: accuracy, recall, and F1-score. Results indicate that the RF model performed best with an accuracy of 100%; the SVM achieved good performance, attaining 68% accuracy and more than 80% F1-score; the ANN model underperformed the other algorithms, with an accuracy of only 59% and an F1-score of 70%, highlighting its limitations in capturing the decision boundaries within the current data configuration. Furthermore, Shapley Additive exPlanations (SHAP) were used to enhance the transparency and interpretability of the modelling results.
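For intuition on SHAP attributions: in the special case of a linear model they have a closed form, `w_i * (x_i - E[x_i])`, and sum to the prediction minus the mean prediction. The features and weights below are invented; the study applies SHAP to its trained classifiers:

```python
import numpy as np

# For a linear model f(x) = w.x + b, the exact SHAP value of feature i at
# input x is w_i * (x_i - E[x_i]) -- a closed form needing no package.
rng = np.random.default_rng(7)
X = rng.normal(0, 1, (500, 3))            # stand-ins for e.g. TWI, NDVI, slope
w = np.array([1.5, -0.8, 0.3])            # hypothetical model weights
b = 0.2
x = X[0]                                  # the instance to explain

shap_values = w * (x - X.mean(axis=0))

# Local accuracy: attributions sum to f(x) minus the mean prediction.
f = X @ w + b
assert np.isclose(shap_values.sum(), (x @ w + b) - f.mean())
print(shap_values.round(3))
```

For tree ensembles such as RF, the same local-accuracy property holds but the values are computed by TreeSHAP rather than this closed form.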

How to cite: Legsabi, H., Tiai, S., Boussabou, S. M., Najaoui, N., El Mansouri, B., and Erraioui, L.: Modeling Flood Risk in Kalaa Sraghna Region in Morocco Using Explainable Artificial Intelligence Techniques, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-19676, https://doi.org/10.5194/egusphere-egu26-19676, 2026.

EGU26-1031 | ECS | Posters virtual | VPS10

Mass Conserving LSTM with Dual States for Improved Streamflow Prediction through Quickflow and Slow Storage Separation 

Saurabh Toraskar, M Niranjan Naik, Abhilash Singh, and Kumar Gaurav
Thu, 07 May, 15:00–15:03 (CEST)   vPoster spot A

Long Short-Term Memory (LSTM) networks show exceptional performance in rainfall-runoff modelling but lack physical realism. Previous efforts to integrate mass conservation into the model architecture did not translate into significant gains in predictive accuracy. We build on the Mass-Conserving LSTM (MC-LSTM) by proposing a novel architecture that incorporates two complementary cell states representing distinct fast and slow hydrologic memory components while maintaining strict mass balance. We introduce a new partition gate that segregates the mass input between long- and short-term memory, together with the architectural changes required to support the additional cell state. We benchmarked our model against LSTM and MC-LSTM on CAMELS-IND (158 basins) and CAMELS-US (531 basins) using NSE, KGE, Pearson r, FHV, FLV, and peak timing/magnitude. On the Indian dataset, MC-LSTM-DS surpasses both LSTM and MC-LSTM across all metrics except Pearson r and FLV, where it exceeds LSTM but falls short of MC-LSTM. In the low-flow regime (FLV), our model significantly reduces the overestimation of LSTM, while MC-LSTM shows severe underestimation. Analysis of the spatial distribution of performance revealed alignment with hydroclimate: all models performed better in humid/tropical climates and worse in arid regions. Investigation of the cell states revealed that the added cell state effectively represents long-term processes, while the original cell state captures short-term processes. Their relative contributions shift with climate characteristics, confirming our hypothesis and providing an interpretable decomposition of the simulated flows. On the CAMELS-US dataset, MC-LSTM-DS matches MC-LSTM and LSTM on NSE, and outperforms all models in KGE, FHV, and Pearson r. In FLV, it outperforms all mass-conserving models by a wide margin and is just short of LSTM.
This study proposes a novel mass-conserving model that provides interpretable predictions. We claim MC-LSTM-DS to be the current state of the art for large-sample rainfall-runoff modelling, as it showed superior performance across two diverse regions. To the best of our knowledge, this study is the first to investigate the effects of strict mass conservation in the diverse Indian region.
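The partition-gate idea can be sketched in a toy water-balance loop: each step's mass input is split between a fast and a slow store by weights that sum to one, and outflow is removed explicitly, so mass is conserved by construction. The softmax gate and fixed outflow fractions below are illustrative stand-ins for the learned gates:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

rng = np.random.default_rng(1)

# Hypothetical partition gate: rainfall is split between a fast and a slow
# store by weights summing to one; outflow leaves each store explicitly.
fast, slow = 0.0, 0.0
total_in, total_out = 0.0, 0.0
for t in range(100):
    rain = max(rng.normal(2.0, 1.0), 0.0)           # non-negative mass input
    w_fast, w_slow = softmax(rng.normal(size=2))    # gate (learned in the real model)
    fast += w_fast * rain
    slow += w_slow * rain
    q_fast, q_slow = 0.5 * fast, 0.05 * slow        # fixed outflow fractions
    fast -= q_fast
    slow -= q_slow
    total_in += rain
    total_out += q_fast + q_slow

storage = fast + slow
print(abs(total_in - (total_out + storage)) < 1e-9)  # mass balance closes
```

Because every unit of input ends up either in storage or in outflow, the budget closes to floating-point precision at every step, which is the property the MC-LSTM family enforces architecturally.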

How to cite: Toraskar, S., Naik, M. N., Singh, A., and Gaurav, K.: Mass Conserving LSTM with Dual States for Improved Streamflow Prediction through Quickflow and Slow Storage Separation, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-1031, https://doi.org/10.5194/egusphere-egu26-1031, 2026.

EGU26-15872 | ECS | Posters virtual | VPS10

High-resolution operational Flood monitoring in India 

Hiren Solanki and Vimal Mishra
Thu, 07 May, 15:03–15:06 (CEST)   vPoster spot A

Floods are recurrent natural disasters, causing socio-economic losses and affecting millions of people every year. High-resolution, near-real-time monitoring of floods is critical for a densely populated and hydro-climatically diverse country like India. Existing operational frameworks in India largely rely on coarse-resolution hydrological and hydrodynamic models, biased meteorological forecasts, limited gauge networks, missing observed data for model setup, static land use, and deterministic forecasts, which constrain their ability to capture basin heterogeneity, reservoir regulation, agricultural expansion, urban influences, and short-term extremes. Here, we present a high-resolution, integrated operational flood monitoring framework using hydrological, hydrodynamic, and data-driven models to provide 5-day-ahead forecasts of streamflow, water level, and flood inundation at more than 350 stations across India. We first evaluate meteorological forecasts from UKMO, KMA, ECMWF, and GEFS products to quantify their spatio-temporal skill and estimate systematic biases across hydro-climatic regimes. We then apply a knowledge distillation-based bias correction approach trained on observed rainfall and temperature data from the India Meteorological Department (IMD), enabling physically consistent correction of meteorological inputs. These corrected forecasts are integrated with a process-based hydrological model and a sequential long short-term memory network augmented with a multi-headed attention mechanism, which explicitly learns temporal dependencies, upstream connectivity, and the dynamic relevance of predictors. The forecasted streamflow is then fed into a large-scale hydrodynamic model to forecast water levels and flood inundation maps.
The proposed stochastic framework aims to achieve substantial improvements in short-lead flood prediction skill, enhanced representation of peak flows and water levels, and more realistic flood inundation dynamics compared to existing operational systems. By combining machine learning-based forecast correction, high-resolution modelling, and advanced deep learning, this study provides a scalable pathway for next-generation flood early warning systems in India, offering direct benefits for evacuation and rescue operations, reservoir operation, agricultural management, and disaster risk reduction at national and sub-basin scales.

How to cite: Solanki, H. and Mishra, V.: High-resolution operational Flood monitoring in India, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-15872, https://doi.org/10.5194/egusphere-egu26-15872, 2026.
