HS3.6 | Explainable and hybrid machine learning for hydrology and land surface processes
Co-organized by ESSI1/NP4
Convener: Shijie Jiang (ECS) | Co-conveners: Ralf Loritz (ECS), Boen Zhang (ECS), Marieke Wesselkamp (ECS), Sanika Baste (ECS)
Orals | Mon, 04 May, 14:00–15:45 (CEST) | Room 2.17
Posters on site | Attendance Mon, 04 May, 10:45–12:30 (CEST) | Display Mon, 04 May, 08:30–12:30 | Hall A
Posters virtual | Thu, 07 May, 14:21–15:45 (CEST)
vPoster Discussion | vPoster spot A, Thu, 07 May, 16:15–18:00 (CEST)
The complex interactions and interdependencies of hydrological and land surface processes within the Earth system pose major challenges for prediction and understanding. Machine learning has become a powerful tool for prediction across these domains, but leveraging its scientific potential goes beyond applying existing algorithms and data. It requires detailed understanding and problem-specific integration of domain knowledge with data-driven techniques to make models more interpretable and enable new process understanding. This session explores how machine learning techniques are currently used to integrate, explain, and complement physical knowledge in hydrology and land surface modeling, including studies of surface and subsurface water dynamics, soil-vegetation interactions, land-atmosphere exchanges, and eco-hydrological processes. Submissions are welcome on topics including, but not limited to:

- Explainability and transparency in data-driven hydrological and land surface modeling;
- Integration of process knowledge and machine learning;
- Domain-specific model development;
- Data assimilation and hybrid modeling approaches;
- Causal learning and inference in machine learning models;
- Data-driven equation discovery;
- Challenges and solutions for hybrid models and explainable AI.

Submissions that present methodological innovation, critically assess limitations, or demonstrate contributions to process understanding across scales are especially encouraged.

Orals: Mon, 4 May, 14:00–15:45 | Room 2.17

The oral presentations are given in a hybrid format supported by a Zoom meeting featuring on-site and virtual presentations. The button to access the Zoom meeting appears just before the time block starts.
14:00–14:05
14:05–14:15 | EGU26-18384 | On-site presentation
Anneli Guthke, Manuel Álvarez Chaves, Eduardo Acuna Espinoza, and Uwe Ehret

Despite the great success of deep learning models in many hydrological prediction applications, they still face limitations in predicting extreme events or in generalizing to unseen conditions, which raises questions about their fidelity and applicability beyond purely operational purposes. Physics-informed hybrid modelling is often proposed as a way to instil interpretability and enable trustworthy data-driven predictions that agree with theoretical knowledge. Yet the community is still in search of best practices for constructing physics-informed machine learning models – several “entry points” for physics knowledge exist, i.e., the loss function, the model inputs, or the architecture. Here, we focus on the latter, and on arguably the most “constrained” form of bringing physics into a hybrid model: a traditional, process-based (conceptual) hydrological model is combined with a data-driven component (here: a long short-term memory network, LSTM) that modifies its parameters over time, as learned by training on observed discharge values. For this apparently well-constrained scenario of hybrid modelling, we raise the question of whether it can faithfully be called “physics-constrained”, or whether the data-driven component is able to overwrite these constraints for the sake of increased performance.

To objectively address this question, we introduce an entropy-based method to quantify the “activity” of the data-driven component in acting against the conceptual constraints. This metric is complemented by a diagnostic workflow to better understand the internal functioning of the resulting, effective hybrid model structure in predicting discharge. Through didactic examples, inspired by real-world case studies, we present the method and build an intuition for what our entropy-based metric represents. Further, we discuss selected results from a large-sample case study on CAMELS-GB to illustrate the variety of findings and insights we gained: (1) performance relies heavily on the data-driven component, and the physics constraints often even make the prediction problem harder instead of adding helpful information; (2) the data-driven component tends to overwrite the constrained architecture “silently”, but this can be detected with our proposed workflow; (3) even constraints that seem nonsensical at first sight can in fact increase performance, as the hybrid model is transformed into a new structure that is parsimonious and efficient; (4) claiming interpretability on the basis of prescribed constraints is risky at best – before calling a hybrid model of this type interpretable, we should carefully check what is happening inside. Overall, these findings provide fundamental guidance for (hybrid) model building and will help us find better ways to reconcile knowledge and information in data for trustworthy models.

How to cite: Guthke, A., Álvarez Chaves, M., Acuna Espinoza, E., and Ehret, U.: Physics-constrained or physics-ignored? An entropy-based approach to diagnose if your hybrid model effectively skips conceptual constraints, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-18384, https://doi.org/10.5194/egusphere-egu26-18384, 2026.
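The entropy idea above can be illustrated with a toy proxy (this is not the authors' exact metric, whose definition is not given in the abstract): take the time series of a conceptual-model parameter as modulated by the LSTM and measure the Shannon entropy of its value distribution. A parameter left constant (constraint respected) scores zero; one that is modulated freely approaches the maximum of log2(n_bins).

```python
import numpy as np

def activity_entropy(theta_t, n_bins=32):
    """Shannon entropy (bits) of a time-varying parameter trajectory.

    Hypothetical proxy for the "activity" of the data-driven component:
    a constant trajectory yields 0 bits; a freely modulated one
    approaches log2(n_bins).
    """
    hist, _ = np.histogram(np.asarray(theta_t, float), bins=n_bins)
    p = hist / hist.sum()
    p = p[p > 0]                      # drop empty bins to avoid log(0)
    return float(-(p * np.log2(p)).sum())

rng = np.random.default_rng(0)
static_param = np.full(1000, 0.4)           # constraint effectively respected
active_param = rng.uniform(0.0, 1.0, 1000)  # constraint heavily overwritten
```

With such a proxy, comparing entropies across parameters and catchments indicates where the LSTM is "silently" overwriting the conceptual structure.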

14:15–14:25 | EGU26-10856 | ECS | On-site presentation
Tibor Rapai, Petra Baják, István Gábor Hatvani, András Lukács, and Balázs Székely

Long Short-Term Memory (LSTM) neural networks have proven their excellence in basin-level discharge prediction, provided there is an adequate amount of high-quality time series data available for training, including meteorological forcings and streamflow gauge measurements. Such data-driven black-box models can successfully learn the complex behavior of delayed hydraulic responses; however, they cannot yet be easily applied in water management practice, and model transfer attempts to ungauged catchments have not been entirely successful.

In our previous work, we explored an approach to characterizing near-surface flow regimes, starting from a full catchment model and then applying a single LSTM network layer within a semi-distributed subbasin setup reflecting downstream topography. Application to the Tarna River catchment area in Hungary (2,116 km²) showed that transfer learning from the full catchment model (achieving an NSE of 0.91 on the training set and 0.66 on an independent test set) to a downstream chain of gauged Hydrological Response Units (HRUs) is a powerful tool for investigating a semi-distributed HRU network. The entire setup, however, involves a much higher level of complexity, and the available detailed meteorological data and gauge measurements in only two-thirds of the subbasins did not provide sufficient information for the single LSTM model to fully predict the HRU network processes.

Because these models apply “virtual water amounts” stored in the hidden cells of the LSTM network for discharge estimation, their internal variables lack direct physical interpretability. In the present research, we investigate how data fusion during calibration with Gravity Recovery and Climate Experiment (GRACE) data, downscaled using soil water content and evapotranspiration products from the ECMWF Reanalysis (ERA5) database, can improve predictive performance, and help to verify our working hypothesis regarding the theoretical connection between Near Surface Water Content (NSWC) and LSTM cell states.

These results can also validate interpretations derived from our model concerning baseflow contributions and recharge-discharge classification of subbasins, while promising realistic transferability of the pre-trained lumped catchment model to all subbasins and broader general applicability of the proposed method. We hypothesize that the daily change dynamics of Terrestrial Water Storage (TWS) and NSWC – the latter playing a decisive role in gravitational flows within the Critical Zone – are strongly correlated.

Accordingly, we propose using downscaled TWS estimates to (1) introduce a new term into the loss function based on our working hypothesis relating median LSTM cell state values to the normalized dynamics of NSWC, and (2) add a new input dimension approximating total runoff as precipitation minus evapotranspiration and infiltration.

Furthermore, the current model extension, still based on 0.1° gridded input data, prepares the ground for future developments that incorporate high-spatial-resolution satellite remote sensing data, such as Sentinel-2 NDWI, to support local-scale hydrological applications efficiently. Integrating satellite data products with different temporal and spatial resolutions is not a straightforward calibration step for rainfall-runoff models, as pixel-wise normalization of measurements requires complex physically based geostatistical methods compatible with model logic to avoid performance deterioration.


How to cite: Rapai, T., Baják, P., Hatvani, I. G., Lukács, A., and Székely, B.: Calibration of a Long Short-Term Memory (LSTM) rainfall-runoff model using remote sensing soil water content estimations, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-10856, https://doi.org/10.5194/egusphere-egu26-10856, 2026.
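A minimal sketch of the proposed loss term, point (1) above, might look as follows; the exact functional form is an assumption, since the abstract only states that the median LSTM cell state is related to the normalised NSWC dynamics:

```python
import numpy as np

def nswc_consistency_loss(cell_states, nswc, weight=0.1):
    """Hypothetical auxiliary loss term: mean squared difference between
    the z-normalised median LSTM cell state (median over hidden units)
    and the z-normalised NSWC series.
    cell_states: (T, H) array; nswc: (T,) array.
    """
    def znorm(x):
        return (x - x.mean()) / (x.std() + 1e-8)
    c_med = np.median(np.asarray(cell_states, float), axis=1)
    return weight * float(np.mean((znorm(c_med) - znorm(np.asarray(nswc, float))) ** 2))

# Demo: a cell-state ensemble whose median tracks NSWC up to scale/offset
t = np.linspace(0.0, 10.0, 200)
nswc_demo = np.sin(t)
cells_demo = np.tile((2.0 * nswc_demo + 5.0)[:, None], (1, 16))
```

Because both series are z-normalised, the term is invariant to scale and offset, so it constrains dynamics rather than absolute storage values.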

14:25–14:35 | EGU26-107 | ECS | On-site presentation
Hyekyeng Jung, Chris Soulsby, Songjun Wu, Christian Birkel, and Dörthe Tetzlaff

Compared to process-based models (PBMs), higher prediction accuracy of machine learning models (MLMs) has been repeatedly reported in ecohydrological research. This might indicate the higher efficiency of data-driven MLMs for extracting and generalising information from the data, especially when traditional PBMs are often challenged by epistemic uncertainties in process representation. To preserve ‘modelling as a learning tool’, integrating MLMs into PBMs is a promising avenue to leverage MLMs for data assimilation, and PBMs for holistic explainability of processes across the Critical Zone (i.e., the thin crust of the Earth including vegetation).
One example of an ecohydrological process with high epistemic uncertainties is the mixing of root-uptake water from soils by trees. Due to limited process understanding, together with the high uncertainties of isotope measurements in trees, mixing dynamics in tree water storage are usually poorly represented in ecohydrological models.
Here, we use data from a comprehensive monitoring campaign conducted during the growing season of 2020 at a plot site with two willow trees and grass in southeastern Berlin, Germany, including daily or sub-daily in-situ measurements of hydrological characteristics and stable water isotopes in precipitation, soils, vegetation, and neighboring open water bodies. Using these data, a baseline ecohydrological PBM (EcoHydroPlot) was used to simulate water flow and isotope dynamics across the Critical Zone. In addition, MLMs with different integration strategies were applied: firstly, as an additional module to the PBM, a post-hoc result-analyzing MLM was trained on the error of the PBM; secondly, a hybrid model was built that replaces the equations for the mixing mechanism of root-uptake water in the PBM with a data-driven ML algorithm. An eXplainable AI (XAI) tool was applied to help understand uncertainties in the PBM and process representation in the MLM.
By comparing these approaches using different criteria of prediction accuracy and interpretability, we identified an optimal strategy for leveraging MLM capabilities within PBM frameworks in addressing the process of tree water mixing with high epistemic uncertainties, potentially extending the concept of ‘modeling as a learning tool’ to MLM-integrated PBMs.

How to cite: Jung, H., Soulsby, C., Wu, S., Birkel, C., and Tetzlaff, D.: Machine Learning Integration Strategies for Process-based Ecohydrological Modeling: Addressing Epistemic Uncertainties of Water Mixing Dynamics in Tree Water, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-107, https://doi.org/10.5194/egusphere-egu26-107, 2026.
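The first integration strategy described above, a post-hoc module trained on the PBM error, can be sketched as residual learning. The abstract does not specify the ML model, so a linear least-squares corrector stands in here purely for illustration:

```python
import numpy as np

def residual_correction(pbm_pred, features, y_train, train_mask):
    """Post-hoc error-learning sketch: fit a model to the PBM residuals on
    the training period, then add the learned correction everywhere.
    A linear fit stands in for the paper's (unspecified) ML module.
    """
    F = np.column_stack([features, np.ones(len(features))])  # add intercept
    resid = y_train - pbm_pred[train_mask]
    coef, *_ = np.linalg.lstsq(F[train_mask], resid, rcond=None)
    return pbm_pred + F @ coef

# Demo: a synthetic PBM whose error depends linearly on one feature
rng = np.random.default_rng(1)
f = rng.normal(size=300)
y = 10.0 + 0.5 * f
pbm = y - (2.0 * f + 1.0)          # structured, feature-dependent PBM bias
mask = np.arange(300) < 200        # first 200 steps used for training
corrected = residual_correction(pbm, f[:, None], y[mask], mask)
```

The same pattern generalises to any regressor in place of the least-squares fit; the PBM stays untouched, which preserves its interpretability.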

14:35–14:45 | EGU26-9189 | ECS | On-site presentation
Mizuki Funato and Yohei Sawada

Accurate rainfall-runoff analysis is vital for flood prediction, water resources management, and climate impact assessment. While data-driven hydrological models such as Long Short-Term Memory (LSTM) networks have shown promise, developing a globally applicable framework that is accurate, interpretable, and computationally efficient remains a grand challenge, primarily because most catchments worldwide are ungauged. We address this by employing HYdrologic Prediction with multi-model Ensemble and Reservoir computing (HYPER). This hybrid method combines Bayesian Model Averaging (BMA), a multi-model ensemble, with Reservoir Computing (RC), a type of machine learning model. The framework infers model weights for ungauged basins by linking catchment attributes to the model weights learned from gauged basins. While this model has previously demonstrated higher accuracy and lower uncertainty compared to LSTMs, particularly when training data is limited, its global applicability remains unassessed. Therefore, in this study, we evaluate the global applicability of HYPER using a pseudo-ungauged approach, where gauged basins are treated as ungauged for validation. We challenge the conventional assumption that more data is better by investigating whether selecting a strategic subset of gauged basins for training outperforms using the entire available dataset. Initial experiments revealed that prediction accuracy remained robust regardless of whether 90 % or only 3 % of available basins were used for training. Furthermore, training on basins from a single, hydrologically similar region often yielded higher accuracy than training on a diverse multi-regional dataset. 
To identify the optimal training subset, we compared three distinct data selection methods: 1) Greedy selection, which identifies donor basins by selecting the nearest neighbors within the static catchment attribute state space; 2) Physics-Informed selection, which calculates the distance between target and candidate basins while applying heavier penalty weights to slope and aridity to strictly enforce physical similarity; and 3) Meta-Learning, which utilizes a Random Forest to learn the relationship between attribute differences and model weight correlations, subsequently predicting donor basins expected to have the highest weight correlation with the target. While all three methods outperformed the baseline of using all available data (Kling-Gupta Efficiency (KGE): 0.12), the Physics-Informed and Meta-Learning approaches achieved the highest consistency and accuracy. Even when only 5 out of 1,505 basins were used for training, these methods achieved KGE scores of 0.26 and 0.31, respectively, effectively bridging the performance gap toward fully gauged basins (KGE: 0.54). These findings demonstrate that for global prediction in ungauged regions, data quality, especially the strategic selection of training basins, is more important than data quantity, marking a step towards robust, globally applicable runoff analysis.

How to cite: Funato, M. and Sawada, Y.: Data Quality over Quantity: Optimized Data Selection for Data-driven Global Prediction in Ungauged Basins, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-9189, https://doi.org/10.5194/egusphere-egu26-9189, 2026.
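The greedy and physics-informed selection strategies above can be sketched in a few lines; the attribute standardisation and the weighted-distance form are assumptions, since the abstract gives only the general idea:

```python
import numpy as np

def select_donor_basins(target_attrs, candidate_attrs, k=5, penalty=None):
    """Greedy donor selection: the k nearest candidates in standardised
    catchment-attribute space. Supplying larger 'penalty' weights for,
    e.g., slope and aridity gives a physics-informed variant.
    """
    X = np.vstack([candidate_attrs, target_attrs]).astype(float)
    X = (X - X.mean(axis=0)) / (X.std(axis=0) + 1e-8)   # joint standardisation
    cand, tgt = X[:-1], X[-1]
    w = np.ones(X.shape[1]) if penalty is None else np.asarray(penalty, float)
    dist = np.sqrt((w * (cand - tgt) ** 2).sum(axis=1)) # weighted Euclidean
    return np.argsort(dist)[:k]

# Demo: 20 gauged basins with 4 attributes; the target matches basin 7
rng = np.random.default_rng(2)
candidates = rng.normal(size=(20, 4))
target = candidates[7] + 1e-6
donors = select_donor_basins(target, candidates, k=3)
```

Training the ensemble-weight model only on the returned donor subset mirrors the "few strategically chosen basins" experiments in the abstract.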

14:45–14:55
14:55–15:05
|
EGU26-17916
|
ECS
|
On-site presentation
Alexander Sasse, Ralf Ludwig, Julius Weiß, and Kerstin Schütz

Both river runoff and river water temperature are experiencing highly dynamic alterations, posing a serious threat to aquatic ecosystems and water resources management under climate change. Data-driven models such as Long Short-Term Memory (LSTM) networks have demonstrated remarkable skill in hydrological prediction, yet their application under non-stationary climate conditions remains challenging due to limited generalization to unseen catchments and conditions beyond the training distribution. We address these challenges by combining LSTM architectures with single-model initial condition large ensemble (SMILE) climate projections to assess non-stationary, non-linear hydrological responses considering the full range of internal climate variability and climate change, enabling robust assessment of rare and extreme events in Bavaria, Germany.

Our study builds on the ClimEx project, which provides a 50-member ensemble of climate simulations (1950–2099, RCP8.5 emission scenario) at 12 km resolution over Europe using the Canadian Regional Climate Model CRCM5.

We present two complementary application cases operating at daily and 3-hourly temporal resolution: i) For discharge prediction, we train an LSTM on observed runoff across 98 Bavarian catchments, validated against simulations from the process-based Water balance Simulation Model (WaSiM). The architecture processes dynamic meteorological forcings through stacked LSTM layers while incorporating static catchment attributes, using a composite loss function that balances performance across high and low flows. The trained model is then driven by the ClimEx ensemble to generate probabilistic discharge projections for future climate. ii) For water temperature (Tw) prediction, we developed an Entity-Aware LSTM (EA-LSTM) framework trained on observations from 44 Bavarian gauging stations, a subset of the 98 catchments constrained by Tw data availability, extended with nine French river basins to broaden the climatic gradient encountered during training. The EA-LSTM architecture explicitly separates static catchment attributes (elevation, slope, upstream river length) from dynamic meteorological forcings, using static features to parameterize the input gate rather than concatenating them at every timestep. This allows the network to learn site-specific temporal dynamics without overfitting individual locations.

To enhance model interpretability, we apply explainable AI (XAI) techniques including permutation-based feature importance analysis. Results reveal that air temperature and radiation dominate Tw predictions overall, while topographic attributes gain importance under thermal extremes, indicating the model captures physically meaningful process controls. Additionally, robustness tests with perturbed static inputs confirm smooth performance degradation rather than abrupt collapse, suggesting the EA-LSTM learns generalizable attribute-response relationships rather than memorizing site identities.

Both cases demonstrate how combining diverse training data with ensemble-based climate projections enables more robust predictions of hydrological extremes under climate change, while XAI methods provide transparency into learned representations.

How to cite: Sasse, A., Ludwig, R., Weiß, J., and Schütz, K.: Combining LSTMs with a Single-Model Large Ensemble for Runoff and Water Temperature Projections in Bavaria, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-17916, https://doi.org/10.5194/egusphere-egu26-17916, 2026.
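Permutation-based feature importance, the XAI technique named above, follows a simple recipe: shuffle one input column at a time and record how much the error grows. A minimal sketch (model-agnostic, not tied to the authors' EA-LSTM):

```python
import numpy as np

def permutation_importance(predict, X, y, n_repeats=10, seed=0):
    """Permutation feature importance: increase in MSE when one input
    column is shuffled, averaged over repeats. 'predict' is any callable
    mapping an (N, F) array to (N,) predictions.
    """
    rng = np.random.default_rng(seed)
    base_mse = np.mean((predict(X) - y) ** 2)
    importance = np.empty(X.shape[1])
    for j in range(X.shape[1]):
        mses = []
        for _ in range(n_repeats):
            Xp = X.copy()
            Xp[:, j] = rng.permutation(Xp[:, j])   # break column j's association
            mses.append(np.mean((predict(Xp) - y) ** 2))
        importance[j] = np.mean(mses) - base_mse
    return importance

# Demo: only feature 0 is informative, so only it should score highly
rng = np.random.default_rng(3)
X_demo = rng.normal(size=(500, 3))
y_demo = 4.0 * X_demo[:, 0]
imp = permutation_importance(lambda X: 4.0 * X_demo[:, 0] * 0 + 4.0 * X[:, 0], X_demo, y_demo)
```

For sequence models like the EA-LSTM, the same shuffle is typically applied per forcing variable across time, which is a design choice the abstract does not detail.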

15:05–15:15 | EGU26-2784 | ECS | On-site presentation
Shiqiong Li, Haiyan Yang, Taihua Wang, and Dawen Yang

The Yellow River Basin (YRB) is among the most water-scarce, sediment-laden, and anthropogenically impacted river basins worldwide. Rainfall–runoff and runoff–sediment relationships in the YRB have traditionally been investigated using process-based hydrological models, which are computationally demanding and difficult to apply at large spatial scales. Here, a physics-guided LSTM–GNN (Long Short-Term Memory and Graph Neural Network) framework was proposed to simulate coupled water–sediment processes across the YRB. Using sub-basin delineation and upstream–downstream connectivity derived from the physically based Geomorphology-Based Ecohydrological Model (GBEHM), the framework employs LSTM to learn local runoff and sediment generation within individual sub-basins, and GNN to represent topology-constrained routing along the river network. The coupled model generated monthly streamflow and sediment data for 718 sub-basins over the period 1982–2017. Compared with a baseline model that neglects physical river-network topology (total NSEflow=0.78, NSEsediment=0.62; median NSEflow=0.09, NSEsediment=0.13), the proposed framework demonstrated significantly improved predictive performance (total NSEflow=0.89, NSEsediment=0.85; median NSEflow=0.42, NSEsediment=0.32) during the test period (2013–2017), especially at stations in large tributaries and the main stream, with high connectivity and large catchment areas. These results show that the proposed LSTM-GNN framework can effectively serve as a surrogate of the process-based model with high accuracy, highlighting its potential for simulating upstream–downstream coupled hydrological processes in super-large river basins.

How to cite: Li, S., Yang, H., Wang, T., and Yang, D.: Coupled Water–Sediment Modelling in the Yellow River Basin Using a Physics-Guided LSTM–GNN Framework Incorporating River Network Topology, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-2784, https://doi.org/10.5194/egusphere-egu26-2784, 2026.
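The topology constraint at the heart of the framework is upstream-downstream accumulation along a directed river network. In the paper the LSTM predicts local generation and a GNN learns the routing; a plain topological sum, shown here only to illustrate the connectivity structure, is the degenerate case:

```python
def route_downstream(local_flow, upstream_of, topo_order):
    """Accumulate local runoff along a river network (a DAG): routed flow
    at a node = its own local flow + routed flow of all upstream nodes.
    local_flow: {node: value}; upstream_of: {node: [upstream nodes]};
    topo_order: nodes listed headwaters-first.
    """
    routed = {}
    for node in topo_order:
        routed[node] = local_flow[node] + sum(
            routed[u] for u in upstream_of.get(node, []))
    return routed

# Demo: three sub-basins in a chain, 0 -> 1 -> 2
flows = route_downstream({0: 1.0, 1: 2.0, 2: 3.0},
                         {1: [0], 2: [1]}, topo_order=[0, 1, 2])
```

A learned GNN replaces the plain sum with trainable message functions, but must respect the same adjacency, which is what distinguishes the framework from the topology-free baseline in the abstract.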

15:15–15:25 | EGU26-5408 | On-site presentation
Valeriya Filipova, David Leedal, and Sam Clayton

Reliable estimation of the median annual maximum flood (QMED) is central to flood risk assessment and the design of hydraulic infrastructure, particularly in ungauged basins. Traditional index-flood approaches typically delineate homogeneous regions and estimate QMED using linear regression on a small set of catchment descriptors. However, these assumptions are often violated in practice, leading to substantial prediction uncertainty. 

Here, we explore the potential of explainable machine-learning models to estimate QMED at large scale. Using data from approximately 8,500 catchments and more than 60 climatic, physiographic, and geomorphological descriptors, we train non-linear models (XGBoost and TabNet) to predict QMED for ungauged basins. To promote physically plausible behaviour, model training incorporates constraints on specific discharge alongside standard performance metrics. A key feature of the approach is the extensive use of DEM-derived terrain and river-network descriptors, which can be computed consistently from widely available global elevation datasets. 

Model interpretability is addressed using global and local explainability techniques, enabling identification of the dominant controls on QMED and how their importance varies spatially. Across independent test data, the models show strong predictive skill (R² > 0.8, median absolute percentage error ~30%). Notably, in many regions models trained on large, globally diverse datasets outperform those trained solely on local data, even where substantial local records are available. 

These results indicate that combining globally consistent physiographic information with interpretable, non-linear machine-learning models offers a promising alternative to traditional regional regression methods for QMED estimation, with potential benefits for flood risk assessment in data-sparse regions. 

How to cite: Filipova, V., Leedal, D., and Clayton, S.: Global estimation of the median annual maximum flood (QMED) using explainable machine learning , EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-5408, https://doi.org/10.5194/egusphere-egu26-5408, 2026.
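The reported skill metrics (R² > 0.8, median absolute percentage error ~30%) can be computed as below; evaluating R² in log space is an assumption here, motivated by QMED spanning orders of magnitude, and is not stated in the abstract:

```python
import numpy as np

def qmed_skill(obs, pred):
    """Skill metrics for QMED estimates: R^2 on log-transformed flows
    (log space assumed, since QMED varies over orders of magnitude) and
    the median absolute percentage error.
    """
    lo, lp = np.log(obs), np.log(pred)
    r2 = 1.0 - np.sum((lo - lp) ** 2) / np.sum((lo - lo.mean()) ** 2)
    mdape = float(np.median(np.abs(pred - obs) / obs) * 100.0)
    return float(r2), mdape

# Demo: a perfect estimate and a uniform 30% overestimate
obs_demo = np.array([5.0, 50.0, 500.0, 5000.0])
r2_perfect, mdape_perfect = qmed_skill(obs_demo, obs_demo)
```

Median (rather than mean) percentage error is robust to the heavy-tailed residuals typical of flood-frequency regressions, which is presumably why it is reported.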

15:25–15:35 | EGU26-1116 | On-site presentation
Ernesto Canellas, Rodrigo Perdigão, Bruno Brentan, and André Rodrigues

Hydrological modelling is essential for water resource management, decision making, extreme event forecasting, and for advancing an integrated understanding of the water cycle. In this context, two main approaches dominate: physics-based (or process-based) models, which simulate hydrological processes such as streamflow using fundamental physics equations, and data-driven models, which use statistical or machine learning techniques to map inputs to outputs. Although Artificial Intelligence (AI) techniques have shown promising results in predictive accuracy, particularly in data-rich basins, their inherently black-box nature raises concerns about whether their internal representations align with real hydrological processes. This is especially critical when models are applied to extreme events, non-stationary conditions, or scenarios beyond the training distribution, where high performance metrics alone may not guarantee reliable or physically meaningful predictions. In this study, we evaluated the performance of a Long Short-Term Memory (LSTM) model for drought modelling and assessed how effectively it could represent real-world hydrological behavior in the Rio Grande do Sul watersheds available in the Catchment Attributes and Meteorology for Large-sample Studies (CAMELS-BR) dataset. The focus on these basins is particularly relevant given the region's hydrological importance, susceptibility to extreme events (e.g., droughts and floods), and distinct characteristics compared to temperate regions, where most legacy models were developed. The model was trained using data from 55 different basins across the state. This multi-basin approach allows the LSTM to learn universal hydrological patterns while maintaining the ability to predict low flow conditions in individual watersheds.
The model inputs combined dynamic hydrological variables (e.g., precipitation and evapotranspiration) with static catchment attributes (e.g., aridity, soil properties, and topography). Accumulated rainfall features were constructed over 3–30 day windows to capture watershed memory effects as a proxy for soil moisture dynamics. In addition, Explainable AI (XAI) techniques together with hydrological signatures (e.g., runoff ratio, baseflow index, and elasticity) were applied to assess the physical soundness of the LSTM model in the region. Following this, the internal structure of the LSTM – particularly the cell states – was analyzed and compared with hydrological behavior (e.g., soil water accumulation, groundwater dynamics, rainfall inputs) in both situations where XAI and hydrological signatures did, or did not, highlight physical consistency. The LSTM’s effectiveness in Brazilian watersheds highlighted its potential as a complementary tool for low flow and drought modelling, offering a valuable alternative for water resources management. XAI analyses and hydrological signatures highlighted the physical soundness of the multi-basin model, but also indicated that improvements were needed, as the internal structure did not consistently track physical hydrological behavior in some cases, hindering the extrapolation of the LSTM model to assess drought conditions in different meteorological settings (e.g., climate change scenarios).

How to cite: Canellas, E., Perdigão, R., Brentan, B., and Rodrigues, A.: Beyond Accuracy: Trustworthy LSTM-Based Hydrological Modelling Assessed with XAI and Hydrological Signatures — A Case Study in Rio Grande do Sul, Brazil, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-1116, https://doi.org/10.5194/egusphere-egu26-1116, 2026.
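Two of the hydrological signatures named above have simple closed forms. The abstract does not state which baseflow separation method was used, so the one-parameter digital filter below (a Lyne-Hollick style choice) is one common option, not necessarily the authors':

```python
import numpy as np

def runoff_ratio(q, p):
    """Runoff ratio: total streamflow divided by total precipitation."""
    return float(np.sum(q) / np.sum(p))

def baseflow_index(q, alpha=0.925):
    """Baseflow index via a one-parameter recursive digital filter
    (Lyne-Hollick style; separation method is an assumption here)."""
    q = np.asarray(q, float)
    qf = np.zeros_like(q)                       # quickflow component
    for t in range(1, len(q)):
        qf[t] = alpha * qf[t - 1] + 0.5 * (1 + alpha) * (q[t] - q[t - 1])
    qb = q - np.clip(qf, 0.0, None)             # baseflow = total - quickflow
    return float(qb.sum() / q.sum())
```

Comparing such signatures computed on LSTM output against those from observations is one concrete way to run the "physical soundness" check described in the abstract.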

15:35–15:45

Posters on site: Mon, 4 May, 10:45–12:30 | Hall A

The posters scheduled for on-site presentation are only visible in the poster hall in Vienna. If authors uploaded their presentation files, these files are linked from the abstracts below.
Display time: Mon, 4 May, 08:30–12:30
A.25 | EGU26-928 | ECS
Mayra Perez, Frédéric Satgé, Jorge Molina, Renaud Hostache, Ramiro Pillco, Elvis Uscamayta, Diego Tola, Lautaro Bustillos, and Celine Duwig

To improve crop yields and economic incomes, farmers continually adapt their practices to climate and market fluctuations, resulting in highly variable crop field distribution and coverage in space and time. As these dynamics illustrate, up-to-date crop-type mapping is essential to understand farmers’ needs and support them in adopting sustainable practices. With global coverage and frequent temporal observations, remote sensing data are generally integrated into machine learning models to monitor crop-type mapping dynamics. Unlike physically based models, whose use is comparatively straightforward, the implementation of machine-learning approaches depends on deep interaction with users. In this context, the study assesses the sensitivity of model outputs to feature selection and hyper-parameter calibration, both of which rely on user choices. To do so, Sentinel-1 (S1) and Sentinel-2 (S2) features are integrated into five distinct models (RF, SVM, LGB, HGB, XGB), considering different feature selection methods (VIF and SFS) and hyper-parameter calibration set-ups. Results show that the pre-processing VIF feature selection discards features that the wrapped SFS feature selection keeps, resulting in less reliable crop-type mapping than with SFS. Additionally, hyper-parameter calibration appears sensitive to the input features, and repeating it after feature selection improved the crop-type mapping. In this context, a three-step nested modelling set-up, consisting of an initial hyper-parameter calibration followed by a wrapped feature selection (SFS) and a second hyper-parameter calibration, led to the most reliable model outputs. Across the considered region, LGB and XGB (SVM) are the most (least) suitable models for crop-type mapping, and model reliability improved when S1 and S2 features were integrated jointly rather than using S1 or S2 alone.
Finally, crop-type maps are derived across different regions and periods to highlight the benefits of the proposed method for monitoring crop dynamics in space and time.

How to cite: Perez, M., Satgé, F., Molina, J., Hostache, R., Pillco, R., Uscamayta, E., Tola, D., Bustillos, L., and Duwig, C.: Sensitivity of machine-learning crop-type mapping to feature selection and hyper-parameter tuning., EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-928, https://doi.org/10.5194/egusphere-egu26-928, 2026.
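The wrapped SFS step in the three-step set-up above is a greedy forward search. The study wraps RF/XGB-type classifiers; a linear least-squares scorer stands in below purely to keep the sketch self-contained:

```python
import numpy as np

def r2_linear(X, y):
    """R^2 of an ordinary-least-squares fit, used as the wrapper score."""
    A = np.column_stack([X, np.ones(len(y))])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    return 1.0 - resid.var() / y.var()

def sequential_forward_selection(X, y, n_features, score=r2_linear):
    """Wrapped SFS: greedily add the feature that most improves the
    wrapper score, until n_features are selected."""
    selected, remaining = [], list(range(X.shape[1]))
    while len(selected) < n_features:
        best_j = max(remaining, key=lambda j: score(X[:, selected + [j]], y))
        selected.append(best_j)
        remaining.remove(best_j)
    return selected

# Demo: only features 0 and 2 carry signal
rng = np.random.default_rng(4)
X_demo = rng.normal(size=(200, 5))
y_demo = X_demo[:, 0] + 2.0 * X_demo[:, 2]
chosen = sequential_forward_selection(X_demo, y_demo, n_features=2)
```

In the nested set-up, hyper-parameter calibration runs once before this search (so the wrapper scores a sensibly tuned model) and once after (so the final model is tuned for the selected features).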

A.26 | EGU26-1275 | ECS
Vidhi Singh, Abhilash Singh, and Kumar Gaurav

Accurate characterization of soil moisture at subsurface depths is essential for hydrological modeling, agricultural management, and climate risk assessment. However, in-situ subsurface measurements remain sparse and often discontinuous due to logistical and operational constraints, especially in data-limited regions. This creates a pressing need for approaches that can reliably infer deeper soil moisture states from surface observations, which are more readily available from both remote sensing platforms and ground-based sensors. This study proposes a probabilistic, physics-aware denoising diffusion model designed to estimate soil moisture at subsurface depths using only surface moisture measurements. The model integrates smoothness and curvature regularization terms inspired by Fickian diffusion theory as weak physics to guide the learning process, without requiring explicit or site-specific physical parameters, thereby enhancing its practicality and ensuring broader applicability across diverse hydroclimatic conditions. The model is trained and evaluated across 20 global ISMN (International Soil Moisture Network) sites at 10, 20 and 40 cm depths with hourly observations spanning six distinct Köppen–Geiger climate classes and four high-resolution African stations with 10-min data.

Across global stations, the model demonstrated consistently high predictive skill (R² ranging from 0.91 to 0.99), with lower errors in climates characterized by stable seasonal patterns and comparatively higher uncertainty in regions affected by freeze-thaw dynamics or monsoonal variability. Benchmarking against 17 state-of-the-art algorithms using Dolan–Moré profiles showed strong and reliable performance across depths and metrics. A stochastic robustness analysis with 30 random seeds and varying ensemble sizes indicated that moderate-sized ensembles provide an effective balance between accuracy and stability. Sensitivity experiments with white, autocorrelated, and structured noise revealed that the 20 cm layer is most susceptible to surface-level perturbations, while deeper layers remain comparatively resilient. The model also achieved strong performance on higher-resolution datasets, with prediction errors tightly centered around zero and exhibiting very low standard deviation. The generalisation of the proposed diffusion-based model across spatial, temporal, and climatic variability highlights its potential as a lightweight and transferable alternative for hydrological forecasting in data-scarce or operationally constrained environments.

How to cite: Singh, V., Singh, A., and Gaurav, K.: Diffusion-Based Physics-Aware Modeling of Subsurface Soil Moisture, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-1275, https://doi.org/10.5194/egusphere-egu26-1275, 2026.

A.27
|
EGU26-5540
|
ECS
Xinyuan Qian, Ping-an Zhong, Bin Wang, Yu Han, Yukun Fan, Yiwen Wang, Sunyu Xu, Zixin Song, and Mengxue Ben

Accurate and reliable long-term streamflow forecasting plays a crucial role in sustainable water resource management and risk mitigation. However, forecast performance is often constrained by multiple sources of uncertainty and the limited interpretability of deep learning models. To address these challenges, this study proposes an explainable hierarchical optimisation framework for long-term streamflow forecasting based on ensemble learning. The proposed framework systematically integrates a Dempster–Shafer (DS) evidence theory-based predictor selection strategy to reduce input uncertainty, an improved loss function designed to enhance model sensitivity to extreme flow events, and a Stacking ensemble scheme that combines the complementary strengths of multiple deep learning models, thereby overcoming the limitations of individual models in complex hydrological systems. In addition, SHapley Additive exPlanations (SHAP) are employed to improve model interpretability and to quantify the contributions of different predictors.
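The Dempster–Shafer evidence theory cited above rests on Dempster's rule of combination, which fuses two mass functions and renormalizes away conflicting mass. The sketch below shows only this core rule on a toy keep/drop decision; how the study maps predictors to mass functions is not specified at this level of detail, so the hypothesis sets here are purely illustrative.

```python
from itertools import product

def dempster_combine(m1, m2):
    """Minimal sketch of Dempster's rule of combination, the core of
    DS evidence theory. Mass functions map frozenset hypotheses to
    belief masses summing to 1.
    """
    combined, conflict = {}, 0.0
    for (b, mb), (c, mc) in product(m1.items(), m2.items()):
        inter = b & c
        if inter:
            combined[inter] = combined.get(inter, 0.0) + mb * mc
        else:
            conflict += mb * mc          # mass assigned to the empty set
    if conflict >= 1.0:
        raise ValueError("total conflict: sources cannot be combined")
    return {h: m / (1.0 - conflict) for h, m in combined.items()}

# Two evidence sources both lean towards keeping a candidate predictor.
keep, drop = frozenset({"keep"}), frozenset({"drop"})
m1 = {keep: 0.7, drop: 0.3}
m2 = {keep: 0.6, drop: 0.4}
fused = dempster_combine(m1, m2)
assert fused[keep] > m1[keep]            # agreement reinforces belief
```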

The effectiveness of the proposed framework is demonstrated through long-term streamflow forecasting at Hongze Lake. The results indicate that: (1) the DS-based predictor selection method substantially enhances both forecasting accuracy and stability, with Nash–Sutcliffe efficiency (NSE) values increasing by 0.10–0.18; (2) the improved loss function significantly strengthens model robustness under extreme high-flow conditions, reducing the mean absolute percentage error (MAPE) by 63.11%, 55.33%, and 23.6% for the MLP, LSTM, and Transformer models, respectively; (3) the Stacking ensemble model consistently outperforms individual base models by reducing forecast errors (RMSE decreased by 17–25%), improving the representation of large-scale variability (MAPE reduced by 21.6–26.8%), and more accurately capturing streamflow dynamics (NSE increased by 0.12–0.20), effectively mitigating multi-source uncertainties; and (4) SHAP-based interpretability analysis reveals pronounced monthly variations in predictor importance and confirms the dominant influence of antecedent streamflow on long-term forecasts. Overall, the proposed framework markedly improves the accuracy, robustness, and transparency of long-term streamflow forecasting and shows strong potential for application in other data-driven hydrological forecasting tasks.

How to cite: Qian, X., Zhong, P., Wang, B., Han, Y., Fan, Y., Wang, Y., Xu, S., Song, Z., and Ben, M.: A Deep Ensemble Learning Framework with Interpretability for Long-Term Streamflow Forecasting under Multiple Uncertainties, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-5540, https://doi.org/10.5194/egusphere-egu26-5540, 2026.

A.28
|
EGU26-5240
Tomasz Berezowski

Vegetation mapping is a key step in wetland monitoring, management, and conservation. Remote sensing image classification offers an excellent solution for vegetation mapping due to its high temporal and spatial resolution. Despite these advantages, remote sensing classification of wetland vegetation is usually limited to a small number of target classes and lacks an explanation of input feature importance. To address this limitation, this study presents a detailed wetland vegetation classification, followed by an explainability study.

The study was conducted in the Biebrza wetlands located in NE Poland, covering approximately 220 km². These wetlands are situated around the Biebrza River, which floods yearly, producing a characteristic vegetation zonation. The training and validation data for the vegetation classification came from a vegetation survey conducted in 2015, kindly provided by the Biebrza National Park.

The input features for classification were obtained by fusing VIS-IR data from Sentinel-2, thermal data from Landsat-8, and Synthetic Aperture Radar (SAR) data from Sentinel-1. The Sentinel-2 data consisted of four images (one image per season), each with eleven bands. The Landsat-8 data also comprised four images, with one thermal band per image. The Sentinel-1 data included 24 dual-polarization (VV+VH) images (one image per month, alternating between ascending and descending orbits). All image data were acquired within the 2014–2017 period and resampled to 10-meter spatial resolution.

The "ranger" Random Forest implementation in R was used as the classifier. The classifier was trained on a stratified random 50% of the vegetation data points and validated on the remaining 50%. The built-in permutation feature importance algorithm was used to indicate the most important bands for the classification.
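The permutation feature-importance idea used here has a simple core: shuffle one feature at a time and record the drop in predictive skill. The stand-alone Python sketch below (with a toy threshold classifier) only illustrates that idea; it is not the ranger implementation, which computes importance internally on out-of-bag samples.

```python
import numpy as np

rng = np.random.default_rng(0)

def permutation_importance(model, X, y, n_repeats=10):
    """Sketch of permutation feature importance: break the link between
    one feature and the target by shuffling that column, then measure
    how much accuracy drops relative to the unshuffled baseline.
    """
    base = np.mean(model(X) == y)
    drops = []
    for j in range(X.shape[1]):
        losses = []
        for _ in range(n_repeats):
            Xp = X.copy()
            Xp[:, j] = rng.permutation(Xp[:, j])  # destroy feature j's signal
            losses.append(base - np.mean(model(Xp) == y))
        drops.append(np.mean(losses))
    return np.array(drops)

# Toy "classifier": the class is decided by feature 0 alone.
X = rng.normal(size=(200, 3))
y = (X[:, 0] > 0).astype(int)
model = lambda X: (X[:, 0] > 0).astype(int)
imp = permutation_importance(model, X, y)
assert imp[0] > imp[1] and imp[0] > imp[2]  # feature 0 dominates
```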

The classification-based vegetation map highly reflected the characteristic vegetation zonation of the Biebrza wetlands. The overall accuracy was 0.994 and the Kappa index was 0.993. The most important band for the classification was the Landsat-8 thermal image from the winter season. However, the thermal bands from the remaining seasons were relatively unimportant. The next most important bands were the Sentinel-2 VIS-IR images from the spring and fall seasons, particularly the red, red-edge, and SWIR bands. The SAR data from Sentinel-1 were the least important of all data used; the most important Sentinel-1 band (19th position) was VH from September, descending orbit.

How to cite: Berezowski, T.: Explainable machine learning for detailed wetland vegetation classification using remote sensing data, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-5240, https://doi.org/10.5194/egusphere-egu26-5240, 2026.

A.29
|
EGU26-5830
|
ECS
Jiaxin Xie, Zavud Baghirov, Markus Reichstein, and Martin Jung

Groundwater provides drinking water for billions and supports nearly half of irrigated agriculture, yet global renewable groundwater availability—quantified as groundwater recharge—remains highly uncertain. Here, we simulate global groundwater recharge using a hybrid model that seamlessly integrates machine learning with physical processes. The hybrid model substitutes machine learning for poorly represented hydrological processes while retaining established physical equations, such as water balance. By leveraging diverse Earth system observations—including streamflow-derived groundwater discharge, satellite-retrieved terrestrial water storage anomalies, and flux tower evapotranspiration—the hybrid model effectively integrates process knowledge with multi-source data constraints to improve the accuracy of global groundwater recharge simulations. Such integration may also deepen our process understanding of groundwater recharge.
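The hybrid pattern described here — a learned component embedded inside a hard physical constraint — can be sketched as follows. The function names, the fixed-fraction placeholder for the learned part, and the flux set (ET, runoff, recharge) are illustrative assumptions; the actual model in the study is not specified at this level.

```python
import numpy as np

def hybrid_water_balance(P, learned_partition, S0=0.0):
    """Sketch of hybrid modeling: a learned component predicts how
    incoming water is partitioned, while the water-balance equation
    S[t+1] = S[t] + P[t] - ET[t] - Q[t] - R[t] is kept as hard physics.
    `learned_partition(S, p)` stands in for the ML part and returns
    non-negative fluxes (ET, runoff Q, recharge R).
    """
    S, storages = S0, []
    for p in P:
        et, q, r = learned_partition(S, p)
        S = S + p - et - q - r   # mass conservation enforced exactly
        storages.append(S)
    return np.array(storages)

# Placeholder "learned" partition: fixed fractions of available water.
def toy_partition(S, p):
    avail = max(S + p, 0.0)
    return 0.3 * avail, 0.2 * avail, 0.1 * avail  # ET, runoff, recharge

P = np.array([10.0, 0.0, 5.0])
S = hybrid_water_balance(P, toy_partition)
```

Because the balance equation is applied outside the learned component, mass closure holds by construction regardless of what the ML part predicts.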

How to cite: Xie, J., Baghirov, Z., Reichstein, M., and Jung, M.: Global groundwater recharge estimation through hybrid modeling, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-5830, https://doi.org/10.5194/egusphere-egu26-5830, 2026.

A.30
|
EGU26-8707
|
ECS
Weiming Kang

Soil moisture is a fundamental hydrological variable that governs groundwater recharge and agricultural productivity. Accurate long-term forecasting is essential for water resource management, yet it remains challenging due to significant observational noise in sensor data and the error propagation inherent in traditional deep learning models. Physics-based models struggle with site-specific calibration, and Neural Ordinary Differential Equations (Neural ODEs) often fail to recover stable continuous dynamics from noisy, discretely sampled signals; there is therefore a clear need for a more robust forecasting framework.

In this work, we propose EulerNet, a pragmatic discrete-time framework designed for high-fidelity soil moisture prediction. Instead of attempting to reconstruct complex latent continuous-time vector fields, EulerNet explicitly models the fixed-step mapping required for operational forecasting. The architecture integrates an Euler-style residual update to parameterize one-step tendencies, ensuring numerical stability through its incremental integration form. To mitigate the impact of sensor noise, we incorporate a Random Synthesizer feature mixer. By employing input-independent alignment matrices rather than dynamic self-attention, the Random Synthesizer acts as an implicit regularizer, preventing the model from overfitting to spurious, noise-induced correlations.
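The two ingredients named above — an input-independent random alignment matrix and an Euler-style residual update — can be sketched in a toy form. Dimensions, initialization, and wiring below are illustrative assumptions, not the EulerNet architecture.

```python
import numpy as np

rng = np.random.default_rng(42)

class EulerStep:
    """Toy sketch: a fixed random alignment matrix mixes the input
    features (Random-Synthesizer style, no input-dependent attention),
    and the state advances by an Euler-style residual update
    x[t+1] = x[t] + dt * f(x[t]).
    """
    def __init__(self, dim, dt=0.1):
        self.dt = dt
        self.mix = rng.normal(scale=0.1, size=(dim, dim))  # input-independent mixer
        self.W = rng.normal(scale=0.1, size=(dim, dim))

    def f(self, x):
        # One-step tendency; tanh bounds it, so each update is small.
        return np.tanh(self.W @ (self.mix @ x))

    def rollout(self, x0, steps):
        xs, x = [], np.asarray(x0, dtype=float)
        for _ in range(steps):
            x = x + self.dt * self.f(x)   # incremental residual update
            xs.append(x.copy())
        return np.stack(xs)

model = EulerStep(dim=4)
traj = model.rollout(np.ones(4), steps=30)  # autoregressive rollout
assert traj.shape == (30, 4)
```

The incremental form keeps each step's change bounded by `dt`, which is one plain way to see why such updates remain numerically stable over long rollouts.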

We evaluated EulerNet using high-noise in-situ observations. In a one-month autoregressive rollout, the model achieved exceptional performance with R² = 0.7977, RMSE = 0.0039, and RMAE = 0.0083. These results demonstrate that for fixed-step environmental forecasting, a specialized discrete-time formulation can effectively bypass the complexities of continuous-time modeling while maintaining high stability and accuracy under significant noise. Our findings provide a practical and efficient alternative for modeling complex Earth system dynamics from real-world observational data.

How to cite: Kang, W.: EulerNet: A Robust Discrete-Time Framework for Long-Term Soil Moisture Forecasting Under Significant Observational Noise, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-8707, https://doi.org/10.5194/egusphere-egu26-8707, 2026.

A.31
|
EGU26-12581
|
ECS
|
Highlight
David Strahl, Urmi Ninad, Sebastian Gnann, Karoline Wiesner, and Thorsten Wagener

Hydrological and land surface models rely on strong prior assumptions about system functioning, including which processes are represented, their parametrization and how they are simplified across space and time. Model evaluation, however, is often based on measures of predictive performance that provide limited insights into whether models capture underlying processes correctly. Causal discovery methods offer a complementary perspective by learning causal interaction networks directly from time series data to reveal how system components influence each other. Here, we apply the PCMCI+ algorithm for causal discovery in combination with a causal effect estimation to hydrometeorological observations and model simulations from 671 U.S. catchments to infer monthly causal interaction networks and associated effect strengths. We show that inferred interaction strengths vary systematically across gradients of water and energy availability and reflect structural differences in how three hydrological models represent key processes of snow and evapotranspiration dynamics. Our results illustrate how causal inference can complement traditional model evaluation approaches in complex environmental systems by providing process-level insights that help bridge theory, observations, and models across disciplines.
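PCMCI+ (implemented in the tigramite package) builds on conditional independence tests between lagged variables; in its linear form these are partial-correlation (ParCorr) tests. The sketch below illustrates only that one ingredient on a synthetic chain; the full algorithm adds lag handling, iterative condition selection, and contemporaneous links.

```python
import numpy as np

def partial_corr(x, y, z):
    """Partial correlation of x and y given conditioning variables z
    (columns of a 2-D array): regress z out of both, then correlate
    the residuals -- the ParCorr test underlying linear PCMCI-style
    causal discovery.
    """
    Z = np.column_stack([z, np.ones(len(x))])
    rx = x - Z @ np.linalg.lstsq(Z, x, rcond=None)[0]
    ry = y - Z @ np.linalg.lstsq(Z, y, rcond=None)[0]
    return float(np.corrcoef(rx, ry)[0, 1])

# Chain X -> Z -> Y: X and Y correlate strongly,
# but the dependence vanishes once Z is conditioned on.
rng = np.random.default_rng(1)
n = 5000
X = rng.normal(size=n)
Zv = X + 0.1 * rng.normal(size=n)
Y = Zv + 0.1 * rng.normal(size=n)
raw = np.corrcoef(X, Y)[0, 1]
cond = partial_corr(X, Y, Zv.reshape(-1, 1))
assert abs(raw) > 0.9 and abs(cond) < 0.1
```

Applied to lagged hydrometeorological series, vanishing partial correlations like `cond` are what lets the algorithm prune indirect links and retain a sparse causal interaction network.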

How to cite: Strahl, D., Ninad, U., Gnann, S., Wiesner, K., and Wagener, T.: Causal Analysis for Model Evaluation in Large Sample Hydrology, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-12581, https://doi.org/10.5194/egusphere-egu26-12581, 2026.

A.32
|
EGU26-12259
|
ECS
Shekoofeh Haghdoost, Shujie Cheng, Oscar Baez-Villanueva, and Diego G. Miralles

Rooting depth (Zr) is a key variable controlling plant water uptake, soil–vegetation interactions, and land–atmosphere feedbacks. Despite its importance, global estimation of Zr remains challenging due to sparse in situ observations and strong spatial heterogeneity driven by climatic, edaphic, and vegetation controls. The interaction among these factors increases complexity, limiting the performance of traditional process-based models and leading to substantial uncertainty in large-scale applications. In this context, machine learning offers a data-driven alternative that can integrate heterogeneous datasets and capture nonlinear relationships and complex interactions among environmental variables, providing a flexible framework for improving large-scale estimates of rooting depth.

In this research, we investigate the environmental drivers of rooting depth at the global scale and develop a new spatially explicit Zr dataset using advanced machine learning methods. Our framework integrates multiple globally consistent datasets, including satellite-derived vegetation metrics (LAI, NDVI), land-surface temperature, and gridded climate variables (precipitation, radiation). These are complemented by soil hydraulic and physical attributes from global soil databases and detailed topographic information, providing a complete representation of environmental controls relevant to rooting depth. A Random Forest model is employed to capture the nonlinear relationships between the predictor set and observed rooting depths. Model interpretability is subsequently assessed using Shapley Additive exPlanations (SHAP), thereby quantifying the contribution of each environmental variable to model predictions.

The optimized model is subsequently applied at the global scale to generate a global Zr dataset using globally available plant, soil, and climate variables. By accounting for their combined effects, the model provides a spatially continuous representation of rooting depth across diverse regions. Model performance is evaluated using leave-one-out cross-validation (LOOCV), whereby each observation is iteratively excluded from the training dataset and used for independent validation. In addition, the resulting predictions are compared against existing global rooting depth datasets to evaluate large-scale consistency. The new Zr dataset enables improved drought monitoring capabilities through more realistic estimates of plant available water; it may enhance water resource assessments by refining infiltration and groundwater recharge estimates, and it helps reduce uncertainty in land surface and climate models by better representing soil-vegetation interactions. Overall, this work provides a robust data-driven approach for estimating Zr globally, independent of process-based assumptions, and relevant for diverse ecohydrological applications striving towards more accurate characterizations of terrestrial water and carbon cycling.
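The leave-one-out cross-validation procedure described above can be sketched generically: each observation is held out in turn, the model is fit on the rest, and the held-out errors are pooled. The 1-nearest-neighbour predictor below is only a stand-in for the study's Random Forest, and all names are illustrative.

```python
import numpy as np

def loocv_rmse(predict_fn, X, y):
    """Leave-one-out cross-validation: hold out each sample in turn,
    fit on the remainder via `predict_fn(Xtr, ytr, xte)`, and pool the
    held-out errors into a single RMSE.
    """
    errors = []
    for i in range(len(y)):
        mask = np.arange(len(y)) != i   # everything except sample i
        pred = predict_fn(X[mask], y[mask], X[i])
        errors.append(pred - y[i])
    return float(np.sqrt(np.mean(np.square(errors))))

# Simple 1-nearest-neighbour predictor as the stand-in model.
def nn_predict(Xtr, ytr, xte):
    j = np.argmin(np.linalg.norm(Xtr - xte, axis=1))
    return ytr[j]

rng = np.random.default_rng(7)
X = rng.uniform(size=(60, 2))
y = X[:, 0] + X[:, 1]                  # noiseless smooth target
rmse = loocv_rmse(nn_predict, X, y)
assert rmse < 0.5
```

Because every prediction is made for a point absent from training, LOOCV gives a nearly unbiased (if computationally expensive) estimate of out-of-sample error for small observation sets like sparse Zr measurements.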

Keywords: rooting depth, machine learning, soil vegetation interactions, global hydrology, ecohydrology, Earth system modeling

How to cite: Haghdoost, S., Cheng, S., Baez-Villanueva, O., and G. Miralles, D.: Global Rooting Depth Inferred based on Machine Learning, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-12259, https://doi.org/10.5194/egusphere-egu26-12259, 2026.

A.33
|
EGU26-21276
|
ECS
Che-You Liu and Shao-Yiu Hsu

Understanding how rainfall is transformed into streamflow is a cornerstone of hydrological science. Despite decades of progress, it remains uncertain whether physical or semi-empirical process equations formulated at the field scale can be transferred to the catchment scale without loss of realism. We assume that this scale mismatch is a key reason why conventional conceptual/process-based models often fail to achieve simulation accuracy comparable to purely data-driven deep learning models. Motivated by ensemble rainfall–runoff analysis (ERRA), which suggests that streamflow can be expressed as a convolution between precipitation and a nonlinear catchment response function, we develop an LSTM-based framework to learn catchment-scale response functions for each hydrological process directly from data while retaining physically consistent structure.

The proposed framework couples a generic bucket model architecture with an LSTM that acts as a nexus optimizer. Physical consistency is enforced through residual-style loss regulation, embedding mass-conservation constraints within the training objective. Within this setting, key processes, including canopy interception, infiltration, evapotranspiration, river routing, and groundwater recharge, emerge as extractable functions of meteorological forcing sequences rather than being prescribed a priori. We found that the learned catchment-scale response functions exhibit pronounced nonlinearity and memory effects. Our results further indicate that catchment-scale process representations effectively mix field-scale empirical relationships with precipitation spatiotemporal heterogeneity, and that the deformation from field-scale to catchment-scale response functions is strongly driven by the spatial heterogeneity of precipitation intensity. By restructuring the learning pathway to reduce recurrent dependencies, the framework supports efficient parallel training while maintaining physical consistency. The approach aims to simultaneously simulate streamflow and induce catchment-scale response functions, offering a pathway to diagnose why conventional models fail and to advance process discovery via data-driven induction.
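The convolution view referenced above has a simple linear special case: streamflow as precipitation convolved with a fixed unit hydrograph. The exponential kernel below is a hypothetical shape for illustration only; the study learns nonlinear, state-dependent response functions in place of this fixed kernel.

```python
import numpy as np

def simulate_streamflow(precip, response):
    """Streamflow as a convolution of precipitation with a catchment
    response function -- the linear core of the ERRA view. Truncated
    to the forcing length.
    """
    return np.convolve(precip, response)[: len(precip)]

# Exponential-recession unit hydrograph (hypothetical shape).
t = np.arange(10)
uh = np.exp(-t / 2.0)
uh /= uh.sum()                 # unit volume: conserve water

precip = np.zeros(20)
precip[3] = 10.0               # a single storm pulse
q = simulate_streamflow(precip, uh)
assert np.argmax(q) == 3       # peak follows the pulse
```

Normalizing the kernel to unit volume makes the convolution mass-conserving: the simulated hydrograph integrates to exactly the input rainfall, mirroring the mass-balance constraint the framework embeds in its loss.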

How to cite: Liu, C.-Y. and Hsu, S.-Y.: Deep Learning as a Nexus Optimizer: Extracting Hydrological Response functions for Rainfall-Runoff Simulation, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-21276, https://doi.org/10.5194/egusphere-egu26-21276, 2026.

A.34
|
EGU26-17878
Sadegh Kaboli, Ville Kankare, Cintia Bertacchi Uvo, Petteri Alho, Ali Torabi Haghighi, and Elina Kasvi

The timing of peak snowmelt floods in boreal environments has undergone significant changes, characterized by nonlinear and complex patterns. This timing determines when coastal areas of boreal rivers experience the greatest inundation during the spring season. It is highly sensitive to climate change and directly influences local fauna and flora. Despite its critical role in flood risk management, the prediction of spring flood timing, along with the identification of its key drivers and most influential factors, remains insufficiently studied in boreal regions.

In this study, we investigate the potential for predicting the timing of annual maximum snowmelt floods by applying a thermal definition of the spring season, along with various climatological and hydrological indices. The analysis is based on a comprehensive daily dataset with record lengths of at least 50 years, available since the early 1960s and extending to 2023 across multiple unregulated Finnish catchments. Among the most important dynamic features are daily discharge records, high-resolution gridded temperature data, and atmospheric teleconnection indices. Additionally, key static catchment characteristics, such as area, slope, and geographical position, are also incorporated into the modeling process, along with other relevant variables.

Machine learning algorithms, including Random Forest and SHAP (SHapley Additive exPlanations) values for feature importance, are applied to identify the most influential factors shaping the timing of annual maximum snowmelt floods and to assess the overall predictability of these events across multiple catchments. The study introduces a novel approach using a thermal definition of spring. The findings provide new indices and actionable thresholds that can help identify areas where adaptation measures should be prioritized.
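A thermal definition of spring onset is typically a run-length rule on daily temperature. The sketch below shows one plausible form; the threshold and run length are illustrative assumptions, not the study's calibrated values.

```python
import numpy as np

def thermal_spring_onset(temps, threshold=0.0, run_length=5):
    """Hypothetical thermal definition of spring onset: the first day
    from which daily mean temperature stays above `threshold` degC for
    `run_length` consecutive days. Returns the 0-based day index, or
    None if no such sustained run exists.
    """
    temps = np.asarray(temps, dtype=float)
    above = temps > threshold
    for start in range(len(temps) - run_length + 1):
        if above[start : start + run_length].all():
            return start
    return None

# Synthetic year: deep winter, a brief thaw, then sustained spring.
temps = [-5.0] * 80 + [1.0, -2.0] + [-1.0] * 18 + [2.0] * 265
assert thermal_spring_onset(temps) == 100   # the brief thaw does not count
```

Requiring a sustained run is what makes such definitions robust to short mid-winter thaws, which is essential when the onset date itself becomes a predictor of snowmelt flood timing.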

How to cite: Kaboli, S., Kankare, V., Bertacchi Uvo, C., Alho, P., Torabi Haghighi, A., and Kasvi, E.: Estimating the timing of the peak snowmelt floods in unregulated boreal catchments using machine learning techniques., EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-17878, https://doi.org/10.5194/egusphere-egu26-17878, 2026.

Posters virtual: Thu, 7 May, 14:00–18:00 | vPoster spot A

The posters scheduled for virtual presentation are given in a hybrid format for on-site presentation, followed by virtual discussions on Zoom. Attendees are asked to meet the authors during the scheduled presentation & discussion time for live video chats; onsite attendees are invited to visit the virtual poster sessions at the vPoster spots (equal to PICO spots). If authors uploaded their presentation files, these files are also linked from the abstracts below. The button to access the Zoom meeting appears just before the time block starts.
Discussion time: Thu, 7 May, 16:15–18:00
Display time: Thu, 7 May, 14:00–18:00

EGU26-6937 | Posters virtual | VPS10

Integrating deep learning and hydrological modelling to assess farm roadway runoff risk to inform targeted mitigation in grassland systems 

Lungile Senteni Sifundza, John G. Murnane, Karen Daly, Russell Adams, Patrick Tuohy, and Owen Fenton
Thu, 07 May, 14:21–14:24 (CEST)   vPoster spot A

Farm roadway networks are important infrastructure on grassland farms, providing access between the farmyard and grazing fields. However, during livestock movement, excreta are deposited on the roadways, especially on bends, T-junctions and corners where movement is impeded. Nutrient-enriched soiled runoff generated on these roadways can contribute significantly to water quality degradation if connected to waters (including man-made open drainage ditches). Quantifying the risk associated with farm roadway runoff delivery to waters includes mapping the roadway and drainage networks and identifying sections which carry high pollutant loads and have the potential to generate, mobilise and deliver surface runoff to the drainage channels. In this study, a deep learning (DL) approach was employed to automatically identify internal farm roadway networks and open drainage channels on 5 grassland farms. Aerial imagery and LiDAR-derived digital terrain models were used to train the DL models for identifying farm roadways and open drainage ditches, respectively. Flow direction and flow accumulation were determined using digital elevation models to map farm roadway sections that have the potential to generate and deliver runoff to the drainage network.

Across the 5 farms, a total of 16.7 km of roadway and 13.5 km of drainage channels were identified by the DL models, achieving precisions of 79 % and 64 %, and accuracies of 90 % and 96 %, respectively. Flow accumulation maps were established for each farm to assess delivery pathways and the potential of roadway runoff connectivity to waters. Flow pathways through roadway junctions and at corners were considered critical, outranking those on straight roadway sections. Breaking the runoff pathway at these locations will help prevent delivery to waters. The findings of this study indicate that mapping of open drainage channels and internal farm roadways in grassland farms can be automated using deep learning models. Integrating the automated mapping with hydrological modelling enables more precise identification of critical roadway sections, supporting targeted mitigation to reduce soiled runoff entering waters and thus enhance water quality protection in grassland farming systems.
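The flow-direction and flow-accumulation step described above is commonly computed with the D8 scheme: each cell drains to its steepest-descent neighbour, and accumulated contributing area is routed downslope. The minimal sketch below is illustrative; production GIS tools additionally handle flats, pit filling, and edge conditions.

```python
import numpy as np

def d8_flow_accumulation(dem):
    """Minimal D8 flow-accumulation sketch. Each cell sends its
    accumulated area to the steepest-descent neighbour; cells are
    processed from highest to lowest elevation so upslope
    contributions arrive before a cell passes its total on.
    """
    rows, cols = dem.shape
    acc = np.ones_like(dem, dtype=float)       # each cell contributes itself
    order = np.argsort(dem, axis=None)[::-1]   # highest elevation first
    nbrs = [(-1, -1), (-1, 0), (-1, 1), (0, -1),
            (0, 1), (1, -1), (1, 0), (1, 1)]
    for flat in order:
        r, c = divmod(flat, cols)
        best, drop = None, 0.0
        for dr, dc in nbrs:
            rr, cc = r + dr, c + dc
            if 0 <= rr < rows and 0 <= cc < cols:
                d = (dem[r, c] - dem[rr, cc]) / np.hypot(dr, dc)
                if d > drop:                   # steepest downhill gradient
                    best, drop = (rr, cc), d
        if best is not None:
            acc[best] += acc[r, c]
        # cells with no lower neighbour are pits/outlets: flow stops here
    return acc

# A tilted plane: everything drains towards the low edge.
dem = np.array([[3., 3., 3.],
                [2., 2., 2.],
                [1., 1., 1.]])
acc = d8_flow_accumulation(dem)
assert acc[2].sum() == 9.0     # all nine cells reach the bottom row
```

High-accumulation cells coinciding with roadway junctions are exactly the "critical sections" the study targets for breaking the runoff pathway.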

How to cite: Sifundza, L. S., Murnane, J. G., Daly, K., Adams, R., Tuohy, P., and Fenton, O.: Integrating deep learning and hydrological modelling to assess farm roadway runoff risk to inform targeted mitigation in grassland systems, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-6937, https://doi.org/10.5194/egusphere-egu26-6937, 2026.
