HS8.2.2 | Data-driven and hybrid groundwater modelling: methods, applications, and challenges
EDI
Convener: Hector Aguilera | Co-conveners: Carolina Guardiola-Albert, Ezra Haaf (ECS), Inga Retike (ECS), Joel Podgorski, Julian Koch
Orals | Tue, 05 May, 16:15–17:45 (CEST), Room C | Wed, 06 May, 08:30–10:15 (CEST), Room C
Posters on site | Attendance Wed, 06 May, 10:45–12:30 (CEST) | Display Wed, 06 May, 08:30–12:30 | Hall A
The field of data-driven and hybrid groundwater modelling continues to gain significant momentum within the hydrological community, reflecting growing interest in machine learning, artificial intelligence, and approaches that integrate data-driven techniques with physical process understanding. These methods are increasingly essential for addressing complex challenges in groundwater quantity and quality forecasting, uncertainty quantification, and sustainable management under changing climatic and anthropogenic pressures. Data-driven approaches — including time-series models, statistical methods, machine and deep learning techniques, and emulators — are transforming how we study, manage and forecast groundwater systems. By learning directly from observations, remote sensing and other data sources, these methods can complement, accelerate or in some cases substitute detailed process-based models. Recent advances in physics-informed methods, spatio-temporal deep learning architectures, probabilistic machine learning and foundation-model approaches are rapidly expanding possibilities for groundwater science. We welcome novel methodological developments and practical applications addressing real-world groundwater management problems. Submissions may address (but are not limited to):
• Advanced data-driven techniques for predicting groundwater quantity and quality in space and/or time (ML/DL, statistical and time-series models).
• Hybrid approaches combining ML with physically based models, including Physics-Informed Neural Networks and surrogate modelling.
• ML-based emulation of numerical models to support efficient modelling and data assimilation.
• Uncertainty quantification and sensitivity analysis using probabilistic ML, quantile regression and deep ensembles.
• Explainable and interpretable ML to improve hydrogeological understanding.
• Methods for big and heterogeneous datasets, data fusion (satellite, models, in-situ), and solutions for data scarcity, non-stationarity and irregular time steps.
• Transferability and regionalization to ungauged sites, and foundation-model approaches for temporal and spatial extrapolation.
We especially encourage submissions that link methods to management-relevant outcomes, climate-change impact assessments, adaptation strategies, and practical case studies. Join us to share research at the intersection of data science and groundwater hydrology and to advance this dynamic field through knowledge exchange and discussion.

Orals: Tue, 5 May, 16:15–17:45 | Room C

The oral presentations are given in a hybrid format supported by a Zoom meeting featuring on-site and virtual presentations. The button to access the Zoom meeting appears just before the time block starts.
Chairpersons: Carolina Guardiola-Albert, Julian Koch, Ezra Haaf
Hybrid modelling for groundwater quantity and quality
16:15–16:20
16:20–16:30 | EGU26-105 | ECS | On-site presentation
Prem Chand Muraharirao, Phanindra kbvn, Carlos Minutti-Martinez, Walter A Illman, and Chandramouli Sangamreddi

We develop a novel, physics-informed multi-task learning framework (PI-XNET) for steady-state hydraulic tomography in fractured aquifers. The model employs a SegNet-based encoder-decoder architecture with feature fusion to jointly reconstruct hydraulic conductivity (K) and the fracture network. The residuals of the governing partial differential equations (PDEs) are incorporated into the model to integrate the groundwater flow dynamics and enforce physical constraints. The unified loss combines data mismatch residuals, PDE constraints, and hard constraint loss, with each component weighted based on the task uncertainty. Through synthetic experiments, we evaluate the performance of PI-XNET and its robustness to data noise, reduced pumping datasets, and data resolution. In comparison to the standard multi-task learning network (median RMSE = 1.27, median R²_K = 0.73), PI-XNET (median RMSE = 1.11, median R²_K = 0.78) improved the conductivity reconstruction and achieved higher fracture segmentation accuracy (median ACC > 99%). Moreover, PI-XNET consistently achieved higher accuracy in hydraulic head reproducibility (median R²_h = 0.61, median L1 norm = 0.19 m, median RMSE_h = 0.14 m²). With fewer pumping test data and with data noise, the performance of PI-XNET declines modestly yet remains reliable. With coarser data resolution, head predictions remain robust (median R²_h = 0.82), whereas K and fracture mapping deteriorated with increased fracture complexity. Our results demonstrate that incorporating physics constraints within an uncertainty-weighted, multi-task framework improves the parameter estimation and fracture mapping and achieves high accuracy even with reduced pumping data. Further, we emphasize that the reliability of PI-XNET in realistic fractured geologic settings depends on data quality and resolution.
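
The uncertainty-based weighting of loss components can be illustrated with a minimal sketch. It assumes a widely used formulation in which each loss term L_i is scaled by exp(-s_i) and penalised by s_i, where s_i is a learnable log-variance. The abstract does not state which formulation PI-XNET uses, and the function name and numerical values below are hypothetical.

```python
import numpy as np

def uncertainty_weighted_loss(losses, log_vars):
    """Combine task losses as sum_i exp(-s_i) * L_i + s_i.

    losses   : per-component losses (e.g. data, PDE, constraint terms)
    log_vars : learnable log-variances s_i, one per component (assumed form)
    """
    losses = np.asarray(losses, dtype=float)
    log_vars = np.asarray(log_vars, dtype=float)
    weights = np.exp(-log_vars)          # high uncertainty -> small weight
    return float(np.sum(weights * losses + log_vars))

# Hypothetical component losses: data mismatch, PDE residual, hard constraint
component_losses = [0.80, 0.05, 0.01]
equal = uncertainty_weighted_loss(component_losses, [0.0, 0.0, 0.0])
tuned = uncertainty_weighted_loss(component_losses, np.log(component_losses))
```

For fixed losses, this expression is minimised at s_i = log L_i, so the weights adapt to the magnitude of each term instead of being tuned by hand.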

How to cite: Muraharirao, P. C., kbvn, P., Minutti-Martinez, C., Illman, W. A., and Sangamreddi, C.: Physics-informed multi-task neural networks for joint mapping of fracture network and hydraulic conductivity in fractured aquifers: PI-XNET, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-105, https://doi.org/10.5194/egusphere-egu26-105, 2026.

16:30–16:40 | EGU26-729 | On-site presentation
Uğur Boyraz and Hayri Baycan

Groundwater quality is increasingly threatened by urban, agricultural, and industrial pressures, many of which introduce persistent pollutants into aquifers. Reliable prediction of solute migration toward surface water bodies is therefore critical for sustainable water‐resources management. This study investigates groundwater contamination dynamics by integrating an analytical groundwater-flow solution with a numerical advection–dispersion model and machine learning (ML). The objective is to improve predictive capability for contaminant arrival timing in stream–aquifer systems while reducing the computational burden associated with repeated physical simulations. This work contributes to the growing field of hybrid, data-driven groundwater modelling by demonstrating how machine-learning surrogates can efficiently emulate computationally intensive contaminant-transport simulations.

A hybrid computational framework was developed in which groundwater flow was solved analytically to obtain the spatial distribution of hydraulic heads and the corresponding stream–aquifer interaction fluxes. These analytically derived velocities along the stream boundary were then used as inputs to an explicit finite-difference solution of the advection–dispersion equation (ADE) for an instantaneous point-source release. The aquifer domain was discretized into a 40×40 grid, and Darcy velocities along the (0, y) interface were multiplied by local solute concentrations to obtain spatially distributed mass fluxes. Numerical integration (trapezoidal rule) yielded the total mass discharged into the river as a function of time. The time at which this discharge reached its maximum was extracted and used as the ML target variable. To explore a wide range of hydrogeological behaviors, a synthetic dataset was generated by sampling physically meaningful parameter ranges, including streambed slope, river length, aquifer width, longitudinal and transverse dispersivities, molecular diffusion, hydraulic conductivity, and initial particle positions. A total of 1200 analytical–numerical realizations were generated and partitioned into training and verification subsets to enable unbiased ML evaluation. All realizations were simulated using a uniform grid resolution to maintain numerical consistency across varying aquifer geometries. Preprocessing involved eliminating variables that did not influence arrival timing, such as total contaminant mass. Spearman correlation analysis and physics-based reasoning indicated that the transverse-dispersivity multiplier and molecular diffusion coefficient contributed negligibly to the target variable and were removed. 
Physics-informed feature engineering was then applied to strengthen predictor–response relationships, producing composite variables such as hydraulic-gradient proxies, dimensionless spatial coordinates, transmissivity-like ratios, and domain geometry indicators. After removing outliers via the IQR method and applying a logarithmic transformation to the target variable, a CatBoostRegressor model was optimized through Bayesian hyperparameter search. Model evaluation using R², RMSE, MAE, MAPE, and PBIAS demonstrated strong predictive skill with minimal bias (e.g. R² = 0.9367). These results indicate that the analytical–numerical–ML framework offers a computationally efficient alternative to repeated contaminant-transport simulations and reliably estimates contaminant-arrival timing across a wide spectrum of hydrogeologic settings.
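
The construction of the ML target described above (boundary mass flux, trapezoidal integration along the stream, time of peak discharge) might look as follows. This is a minimal sketch, not the authors' code: the function name, array shapes, and the Gaussian pulse standing in for the ADE solver output are all assumptions.

```python
import numpy as np

def peak_discharge_time(times, y, velocity, concentration):
    """Return the time at which solute mass discharge to the stream peaks.

    times         : (nt,) simulation times
    y             : (ny,) coordinates along the stream boundary x = 0
    velocity      : (ny,) Darcy velocity into the stream at each boundary node
    concentration : (nt, ny) boundary concentrations from the ADE solver
    """
    flux = concentration * velocity            # mass flux per unit length, (nt, ny)
    dy = np.diff(y)
    # Trapezoidal rule along the boundary for every time step
    discharge = np.sum(0.5 * (flux[:, 1:] + flux[:, :-1]) * dy, axis=1)
    return times[np.argmax(discharge)], discharge

# Hypothetical stand-in for solver output: a Gaussian pulse peaking at t = 30
times = np.linspace(0.0, 100.0, 101)
y = np.linspace(0.0, 40.0, 41)
velocity = np.full(y.size, 0.5)
pulse = np.exp(-0.5 * ((times - 30.0) / 8.0) ** 2)
concentration = pulse[:, None] * np.ones(y.size)[None, :]
t_peak, discharge = peak_discharge_time(times, y, velocity, concentration)
log_target = np.log(t_peak)   # log-transformed ML target, as in the abstract
```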

How to cite: Boyraz, U. and Baycan, H.: Rapid Prediction of Contaminant Arrival Times in Stream–Aquifer Systems Using Machine Learning, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-729, https://doi.org/10.5194/egusphere-egu26-729, 2026.

16:40–16:50 | EGU26-9346 | On-site presentation
Timo Houben, Christian Siebert, and Sabine Attinger

The sustainable management of groundwater resources faces significant challenges as climate change progresses, with decreasing groundwater recharge and an associated increase in withdrawals. In Germany in particular, declining groundwater levels are being observed at several observation wells, highlighting the need for robust forecasting methods to assess both the long-term availability of water resources and the vulnerability and resilience of aquifers.

Using spectral analysis of groundwater level fluctuations, hydrogeological parameters such as transmissivity, storativity, and the characteristic response time can be derived from time series of groundwater levels and recharge in the frequency domain. The characteristic response time serves as a measure of an aquifer’s resilience to droughts, enabling the classification of groundwater monitoring wells and their respective aquifers, while the derived transmissivity and storativity can be used for transient groundwater modelling.
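
As an illustration of parameter estimation in the frequency domain, the sketch below assumes the simplest possible aquifer model, a linear reservoir dh/dt = R/S - h/t_c, for which the ratio of head to recharge power spectra is 1 / (S² (ω² + 1/t_c²)). The abstract does not specify which analytical model is used, so the model, the function name, and the synthetic noise-free spectrum are assumptions.

```python
import numpy as np

def fit_linear_reservoir(omega, spectral_ratio):
    """Estimate storativity S and response time t_c from S_hh / S_RR.

    Assumes dh/dt = R/S - h/t_c, so that
    1 / ratio = S**2 * omega**2 + S**2 / t_c**2, which is linear in omega**2.
    """
    slope, intercept = np.polyfit(omega ** 2, 1.0 / spectral_ratio, 1)
    storativity = np.sqrt(slope)
    t_c = np.sqrt(slope / intercept)
    return storativity, t_c

# Synthetic, noise-free spectrum for hypothetical values S = 0.1, t_c = 200 days
freq = np.linspace(1e-4, 5e-2, 200)           # cycles per day
omega = 2.0 * np.pi * freq                    # rad per day
S_true, tc_true = 0.1, 200.0
ratio = 1.0 / (S_true ** 2 * (omega ** 2 + 1.0 / tc_true ** 2))
S_est, tc_est = fit_linear_reservoir(omega, ratio)
```

With observed data the spectral ratio would come from noisy periodogram estimates, so a more robust fitting procedure than this straight-line fit may be preferable.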

Thus, the presented methodological workflow allows a rapid assessment of hydrogeological properties as well as the application of these parameters in numerical and analytical models for predicting groundwater levels under changing climatic conditions.

How to cite: Houben, T., Siebert, C., and Attinger, S.: Spectral Analysis of groundwater level time series: Hydrogeological parameter estimation for groundwater modelling, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-9346, https://doi.org/10.5194/egusphere-egu26-9346, 2026.

16:50–17:00 | EGU26-10303 | On-site presentation
Pascal Audigane, Etienne Lehembre, Hugo Breuillard, Thi-Bich-Hanh Dao, Vincent Nguyen, and Christel Vrain

Accurate simulation of groundwater level dynamics remains a major challenge due to the complex interplay between climatic forcing, subsurface properties, and hydrological processes. In this study, we propose a hybrid modeling approach that combines data-driven neural networks with physically based constraints to reproduce piezometric time series of the Beauce limestone aquifer, located in the Centre–Val de Loire region (France). This aquifer has been monitored for several decades and benefits from an extensive observation network. Twelve piezometers were selected to represent the diversity of groundwater responses, including systems characterized by strong inertial behavior.

The neural network is trained by minimizing a cost function measuring the mismatch between observed and simulated groundwater levels. To enhance training convergence and predictive skill, the cost function is augmented with a term derived from physical processes governing groundwater evolution. These processes are based on the inter-reservoir drainage laws implemented in the global hydrological model Gardenia (©BRGM). Gardenia conceptualizes the transfer of water from precipitation to the aquifer through three reservoirs: soil, unsaturated zone, and saturated zone. Infiltration is controlled by the square of soil saturation, effective rainfall is partitioned between runoff and percolation following an exponential law defined by a half-life parameter and a partitioning factor, and groundwater discharge to the river is described by an exponential recession law governed by a distinct half-life.
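
One way such an augmented cost function could be written is sketched below. It combines a mean-squared data mismatch with a penalty on departures from an exponential recession governed by a half-life. The reservoir update, the weight `lam`, and all names are simplifying assumptions for illustration and do not reproduce Gardenia's equations.

```python
import numpy as np

def hybrid_cost(h_obs, h_sim, recharge, half_life, lam=1.0, dt=1.0):
    """Data mismatch plus a penalty on departures from exponential recession.

    Assumed reservoir update (simplified, not Gardenia's actual code):
        h[t+1] = h[t] * 2**(-dt / half_life) + recharge[t]
    where h is the level above the drainage base.
    """
    h_obs, h_sim, recharge = (np.asarray(a, dtype=float)
                              for a in (h_obs, h_sim, recharge))
    data_term = np.mean((h_sim - h_obs) ** 2)
    decay = 2.0 ** (-dt / half_life)
    residual = h_sim[1:] - (h_sim[:-1] * decay + recharge[:-1])
    physics_term = np.mean(residual ** 2)
    return data_term + lam * physics_term

# Synthetic check: a series generated by the assumed law has zero cost
half_life = 30.0
recharge = np.array([0.0, 0.2, 0.0, 0.0, 0.1, 0.0])
h = np.empty(recharge.size)
h[0] = 5.0
for t in range(recharge.size - 1):
    h[t + 1] = h[t] * 2.0 ** (-1.0 / half_life) + recharge[t]
cost_consistent = hybrid_cost(h, h, recharge, half_life)
bump = np.array([0.0, 0.3, 0.0, 0.0, 0.0, 0.0])
cost_perturbed = hybrid_cost(h, h + bump, recharge, half_life)
```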

The proposed architecture combines a Long Short-Term Memory (LSTM) network with a Multi-Layer Perceptron (MLP), allowing the model to exploit both the temporal dependency structure of hydrological time series and the nonlinear representation capacity of feedforward neural networks. Results show that incorporating physical constraints into the learning process significantly improves both training stability and predictive performance compared to a purely data-driven approach. Finally, the hybrid model performances are compared with those of the Gardenia model, highlighting the added value of combining physical understanding with machine learning for groundwater level simulation.

How to cite: Audigane, P., Lehembre, E., Breuillard, H., Dao, T.-B.-H., Nguyen, V., and Vrain, C.: Integrating physical constraints into neural networks for piezometric time series modeling: application to the Beauce limestone aquifer, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-10303, https://doi.org/10.5194/egusphere-egu26-10303, 2026.

17:00–17:10 | EGU26-13775 | On-site presentation
Matthew Arran, Kirsty Upton, Christopher Jackson, Setareh Nagheli, and Benjamin Marchant

With changes in climate and water demand placing increasing pressure on UK groundwater resources, water companies need to be able to rapidly and reliably simulate a wide range of scenarios for precipitation, evapotranspiration, and borehole abstraction. But models derived purely from historical data are unreliable in changing conditions, while physics-based groundwater models require time and expertise to run. Here, we show that a Recursive-Neural-Network-based emulator of a physics-based model can make predictions that are both rapid and reliable, giving water companies a tool for both operational decision-making and long-term planning. We discuss the practical importance of representative training data, user-friendly interfaces, and clear uncertainty communication. Finally, we indicate the broader applicability of our work.

How to cite: Arran, M., Upton, K., Jackson, C., Nagheli, S., and Marchant, B.: Recursive neural networks for application-focussed emulation of groundwater models, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-13775, https://doi.org/10.5194/egusphere-egu26-13775, 2026.

17:10–17:20 | EGU26-19215 | ECS | On-site presentation
Shreyansh Mishra, Lisa Baulon, and Augustin Thomas

Understanding how groundwater levels respond to hydroclimatic forcings and human activities is essential for sustainable groundwater management, yet remains challenging in many regions due to incomplete pumping records, heterogeneous datasets, and non-stationary system behavior. In this study, we explore a hybrid, data-driven framework to disentangle climate-driven and anthropogenic influences on long-term groundwater head time series using monitored piezometric networks. We first apply transfer function–noise (TFN) modelling, implemented through the Pastas framework, to simulate groundwater head dynamics as a response to observed hydroclimatic forcings, including precipitation, evapotranspiration, and river stage. The resulting model residuals exhibit structured behaviours that cannot be attributed to random noise alone, suggesting the presence of missing processes or stresses not explicitly represented in the model. The results show that the Pastas model is able to optimize the parameters of the recharge response functions while separating out other drivers of the groundwater head. We then analyse residual patterns, temporal shifts, and spatial coherence across multiple wells to classify the residual signals as either unresolved natural processes or anthropogenic stresses. To this end, we deploy unsupervised anomaly detection algorithms (Isolation Forest) on these residuals to automatically classify the temporal schedule of pumping events without prior labelling. This work demonstrates how interpretable time-series models and data-driven learning can be combined to reduce uncertainty, improve process understanding, and extract management-relevant information from groundwater monitoring data under data-scarce conditions.
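
The residual-screening step can be sketched with scikit-learn's `IsolationForest`. The residual series below is synthetic, with hypothetical drawdown events injected on random days, and the `contamination` value is an assumption. In the workflow described above, the input would instead be the residuals of a calibrated TFN model.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
n = 500
residuals = rng.normal(0.0, 0.05, n)               # synthetic TFN residuals (m)
event_idx = rng.choice(n, size=20, replace=False)  # hypothetical pumping days
residuals[event_idx] -= np.linspace(1.0, 3.0, 20)  # unexplained drawdown (m)

# Unsupervised screening: no labels are used during fitting
model = IsolationForest(n_estimators=200, contamination=0.05, random_state=0)
labels = model.fit_predict(residuals.reshape(-1, 1))   # -1 = anomaly, 1 = normal
flags = labels == -1

# Share of injected events that were flagged (only possible with synthetic data)
recall = flags[event_idx].mean()
```

The flagged time steps are only candidates for pumping events. Telling them apart from unresolved natural processes still relies on the temporal and spatial coherence analysis described in the abstract.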

How to cite: Mishra, S., Baulon, L., and Thomas, A.: Deciphering the dependencies of piezometric signals on hydroclimatic and anthropogenic forcings using time-series models, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-19215, https://doi.org/10.5194/egusphere-egu26-19215, 2026.

17:20–17:30 | EGU26-15889 | ECS | On-site presentation
Subhajit Dey and Scott K Hensen

Identifying the source and release history of groundwater contaminants is a crucial task, as removal operations largely depend on these factors. Link Simulation Optimisation (LSO) is a proven method for identifying the source history of groundwater contaminants in inverse-problem settings. In a conventional LSO framework, the simulation model is encapsulated within the optimisation algorithm: the optimisation algorithm drives the search, whereas the simulation model responds. The main strength of LSO is how the optimisation algorithm and the simulation model work in tandem. However, simulation models often increase the computational burden; as a result, they are replaced with a surrogate model.

Historically, statistical surrogates such as Polynomial Response Surface, Gaussian Process Regression, and radial basis functions have been used in the groundwater source release history problem. More recently, machine-learning-based surrogates, such as Artificial Neural Networks, Deep Neural Networks, and Convolutional Neural Networks, have been extensively used for the same problem. The main drawback of these surrogates is that they do not consider physics within their training process. As a result, they are heavily dependent on the training data, and inputs outside the range of the training data often lead to poor performance. Moreover, groundwater source identification is ill-posed, but surrogates often smooth the objective landscape and provide a false sense of uniqueness. Additionally, a noise level of 1% to 2% in the training data typically results in a significant prediction error within the LSO framework.

To overcome the aforementioned drawbacks, we propose a surrogate based on a physics-informed neural network (PINN) in an LSO framework for identifying the source contamination strength in a hypothetical case scenario. The hypothetical scenario is homogeneous and governed by the Dirichlet boundary condition. The proposed PINN learns the spatio-temporal contaminant concentration C(x,y,z,t) by minimising errors in observed data while simultaneously enforcing the 3D advection–dispersion equation, boundary conditions, and initial conditions. The contaminant source is represented as a time-limited mass-loading well (active from 0 to t_on), embedded directly into the governing PDE, ensuring physically consistent transport and mass conservation. Unlike conventional practice, validation is performed during the training process, which helps to avoid overfitting and to retain the most effective features. The 3D PINN was tested on 1000 data points generated using random uniform, Sobol, and Latin hypercube sampling. Results show that with the correct implementation of the PINN, we can estimate the source strength C(x, y, z, t) with greater accuracy. In this LSO model, we utilise simulated annealing as the optimisation method.
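
The optimisation side of such an LSO loop can be sketched with a basic simulated annealing routine. Here the surrogate is replaced by a hypothetical linear response (concentration at each observation well proportional to source strength), and the cooling schedule, step size, and all names are assumptions chosen for illustration only.

```python
import math
import random

def simulated_annealing(objective, x0, lower, upper, steps=4000,
                        t0=1000.0, cooling=0.995, step_sd=2.0, seed=0):
    """Minimise a scalar objective over [lower, upper] by simulated annealing."""
    rng = random.Random(seed)
    x, fx = x0, objective(x0)
    best_x, best_f = x, fx
    temp = t0
    for _ in range(steps):
        cand = min(upper, max(lower, x + rng.gauss(0.0, step_sd)))
        f_cand = objective(cand)
        # Always accept improvements; accept worse moves with Boltzmann probability
        if f_cand <= fx or rng.random() < math.exp((fx - f_cand) / temp):
            x, fx = cand, f_cand
            if fx < best_f:
                best_x, best_f = x, fx
        temp *= cooling   # geometric cooling schedule
    return best_x, best_f

# Hypothetical surrogate: concentration per unit source strength at five wells
unit_response = [0.9, 0.7, 0.5, 0.35, 0.2]
true_strength = 42.0
observed = [true_strength * u for u in unit_response]

def misfit(strength):
    """Sum of squared differences between surrogate output and observations."""
    return sum((strength * u - c) ** 2 for u, c in zip(unit_response, observed))

estimate, final_misfit = simulated_annealing(misfit, x0=0.0, lower=0.0, upper=100.0)
```

In the framework described above, `misfit` would call the trained PINN instead of the linear stand-in, and the search would cover all unknown source parameters.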

How to cite: Dey, S. and Hensen, S. K.: Solving 3D Inverse Groundwater Source History Problems in a LSO Framework Using Physics-Informed Neural Networks and Simulated Annealing in Homogeneous Aquifers, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-15889, https://doi.org/10.5194/egusphere-egu26-15889, 2026.

17:30–17:40 | EGU26-14206 | On-site presentation
Mathias Busk Nielsen, Troels Norvin Vilhelmsen, Rasmus Bødker Madsen, and Thomas Mejer Hansen

Groundwater management in Danish municipalities relies heavily on decision support from numerical flow simulations to evaluate the impact of drinking-water abstraction on the surrounding environment and the risk of contamination. Such decision support should be as informative as possible and should therefore account for uncertainty in model input data. However, most applied groundwater models are deterministic, built on a single geological interpretation and a fixed set of hydraulic parameters. Such models provide only a single outcome drawn from a whole distribution of possible outcomes, blinding decision-makers to potential unforeseen environmental risks. Fully propagating geological and hydrological uncertainty in these models is necessary to explore all possible outcomes, but the required simulations are too computationally expensive to perform within everyday administrative workflows.

To address this challenge, we present an approach that utilizes artificial neural networks trained on simulated results from a stochastic model ensemble to emulate the computationally heavy numerical models. The approach constructs an ensemble of groundwater models of the same location with stochastic geology and hydrological layer properties. Simulations run in the model ensemble using MODFLOW present the full outcome space as a distribution instead of a single value. We perform forward particle tracking in the ensemble to delineate probabilistic catchment areas of abstraction wells. The probabilistic catchment areas are used as target data for the neural network to learn from along with a selection of input features. Applied to the Egebjerg catchment, Denmark, the neural network produces catchment probabilities with high accuracy compared to MODFLOW while reducing computation time from hours to seconds.
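
The probabilistic catchment areas that serve as training targets can be understood as per-cell frequencies across the ensemble. The sketch below assumes that particle tracking in each realisation has already been reduced to a boolean capture mask. The tiny four-member ensemble and the function name are hypothetical.

```python
import numpy as np

def catchment_probability(capture_masks):
    """Fraction of ensemble members in which each cell drains to the well.

    capture_masks : (n_models, ny, nx) boolean array, True where particle
                    tracking in that realisation ends at the abstraction well.
    """
    return np.asarray(capture_masks, dtype=float).mean(axis=0)

# Hypothetical 4-member ensemble on a 2 x 3 grid
masks = np.array([
    [[1, 1, 0], [0, 1, 0]],
    [[1, 0, 0], [0, 1, 0]],
    [[1, 1, 0], [0, 1, 1]],
    [[1, 1, 0], [0, 0, 0]],
], dtype=bool)
prob = catchment_probability(masks)
```

A cell captured in three of the four realisations gets a probability of 0.75, which is the kind of map a deterministic single-model run cannot provide.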

The achieved reduction in computation time makes the neural network suitable within a decision-support tool enabling the use of stochastic models in practice and improving the decision-making process of administrative groundwater management.

How to cite: Nielsen, M. B., Vilhelmsen, T. N., Madsen, R. B., and Hansen, T. M.: Probabilistic decision-support for groundwater management made feasible through artificial neural networks, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-14206, https://doi.org/10.5194/egusphere-egu26-14206, 2026.

17:40–17:45

Orals: Wed, 6 May, 08:30–10:15 | Room C

Chairpersons: Hector Aguilera, Inga Retike, Joel Podgorski
Data-driven techniques for groundwater quantity and quality assessment
08:30–08:35
08:35–08:45 | EGU26-2374 | On-site presentation
Wolfgang Nowak, Waqas Ahmed, and Emanuel Buccini

In deep learning, as in any other modeling endeavor, the quality and scale of the data used are important. To use the prediction of groundwater levels as an example, satellite data have gained considerable attention for monitoring groundwater storage anomalies. However, such data have a coarse resolution and uncertainties due to the disaggregation process. At the same time, the lack of sufficiently dense groundwater monitoring networks also remains a significant barrier. In many real-world applications, high-quality data are rare. Thus, input and target (calibration) data are often noisy, spatially/temporally sparse, and lack spatial resolution, which compromises the predictive power of deep learning models.

In this work, we investigate the robustness of deep learning for estimating groundwater levels at the continental scale from sparse observations. We utilize the CONUS2 dataset (https://hydroframe.org/parflow-conus2), derived from the physics-based simulator ParFlow. Inspired by a recent study (HydroStartML [1]), we utilize this dataset to train a deep learning model that estimates the water table depth (WTD) from easily accessible and spatially distributed covariates. These covariates include elevation, slope, and hydrographic features such as hydraulic conductivity and net recharge.

As a deep-learning model, we implement a U-Net architecture to map the relationship between these covariate maps and WTD. Beyond a baseline estimation, where we use the entire CONUS2 data set for training, we conduct a rigorous ablation to evaluate the model's robustness under simulated data scarcity, reflecting real-world observational constraints. To simulate data scarcity, we apply a masking protocol, where we systematically occlude a wide range of data fractions from the target data, thus forcing the U-Net model to reconstruct the WTD field from limited information. Finally, we assess model performance using standard metrics, such as the Nash-Sutcliffe Efficiency and the Root Mean Squared Error. Our results demonstrate a strong predictive capability, even in data-sparse scenarios, thereby validating the approach. However, a spatial analysis of the error distribution reveals a distinct topographical dichotomy: while the network achieves high precision and stability in low-relief plains, it exhibits systematic errors in complex mountainous terrain, where the prediction task is more challenging due to the larger spatial variability of covariates and the target variable. Overall, our findings suggest that, while U-Net architectures are surprisingly robust for groundwater mapping, distinct physical scenarios may require adaptations to the architecture.
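
The masking protocol and the evaluation metrics can be sketched independently of the U-Net itself. The example below uses a synthetic field in place of CONUS2 and a noisy copy of it as a stand-in for the network output. The function names and the 10% keep fraction are assumptions.

```python
import numpy as np

def mask_targets(target, keep_fraction, rng):
    """Return a boolean mask keeping a random fraction of target cells."""
    n_keep = int(round(keep_fraction * target.size))
    flat = np.zeros(target.size, dtype=bool)
    flat[rng.choice(target.size, size=n_keep, replace=False)] = True
    return flat.reshape(target.shape)

def nse(obs, sim):
    """Nash-Sutcliffe Efficiency: 1 - SSE / sum of squared deviations of obs."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    return 1.0 - np.sum((sim - obs) ** 2) / np.sum((obs - obs.mean()) ** 2)

def rmse(obs, sim):
    """Root Mean Squared Error."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    return float(np.sqrt(np.mean((sim - obs) ** 2)))

rng = np.random.default_rng(0)
wtd = rng.gamma(2.0, 5.0, size=(64, 64))            # synthetic WTD field (m)
mask = mask_targets(wtd, keep_fraction=0.1, rng=rng)
# A training loss would use only wtd[mask]; evaluation uses the full field
prediction = wtd + rng.normal(0.0, 1.0, wtd.shape)  # stand-in for U-Net output
score = nse(wtd, prediction)
error = rmse(wtd, prediction)
```

Repeating the experiment for a range of `keep_fraction` values gives the kind of robustness curve the ablation is designed to produce.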

 

[1] L. Pawusch, S. Scheurer, W. Nowak, R.M. Maxwell: “HydroStartML: A combined machine learning and physics-based approach to reduce hydrological model spin-up time”, Advances in Water Resources, 206, 2025. https://doi.org/10.1016/j.advwatres.2025.105124

How to cite: Nowak, W., Ahmed, W., and Buccini, E.: How far can we stretch big-data ideas with limited data? Machine-learned groundwater level predictions at a continental scale with smaller and smaller data sets., EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-2374, https://doi.org/10.5194/egusphere-egu26-2374, 2026.

08:45–08:55 | EGU26-9072 | On-site presentation
Maria Wetzel, Stefan Kunz, Fabienne Doll, Bastian Habbel, Tanja Liesch, and Stefan Broda

Groundwater systems exhibit substantially different response dynamics depending on site-specific and hydrogeological characteristics: While shallow aquifers often respond rapidly to meteorological forcing, deeper groundwater systems typically show delayed and strongly damped dynamics. In particular, monitoring wells with large depths to groundwater and long response times to external drivers remain challenging to model reliably using data-driven approaches. However, these systems are hydrogeologically highly relevant, as their substantial storage capacity and persistence across wet and dry periods strongly influence long-term water availability and the attenuation of climatic extremes.

To assess the potential of data-driven models for capturing contrasting groundwater dynamics, groundwater level time series from approximately 400 monitoring wells in the federal state of Brandenburg (Germany) are selected. All wells provide continuous observations since 1980, are distributed across three major aquifer complexes at different depths, and thus represent a wide spectrum of response behaviours. Recurrent neural networks (Gated Recurrent Units - GRU) are applied to predict groundwater levels based on meteorological inputs (precipitation and air temperature). Two key aspects are systematically investigated: (1) the length of the input sequence and (2) the optional integration of aggregated meteorological predictors. This design evaluates whether extended look-back periods or the incorporation of site-specific smoothed climate signals improves the predictability of damped groundwater systems.
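
The two design choices, look-back length and aggregated meteorological predictors, translate into how the model inputs are assembled. The sketch below builds trailing-mean predictors and stacked input windows with NumPy. The weekly time step, the window lengths, and the toy series are assumptions not taken from the abstract.

```python
import numpy as np

def rolling_mean(x, window):
    """Trailing mean over `window` steps; first window-1 values are NaN."""
    x = np.asarray(x, dtype=float)
    out = np.full(x.size, np.nan)
    csum = np.cumsum(np.insert(x, 0, 0.0))
    out[window - 1:] = (csum[window:] - csum[:-window]) / window
    return out

def make_sequences(features, target, lookback):
    """Stack (lookback, n_features) input windows with the target at window end."""
    X = np.stack([features[i - lookback + 1:i + 1]
                  for i in range(lookback - 1, len(features))])
    y = target[lookback - 1:]
    return X, y

weeks = 10
precip = np.arange(weeks, dtype=float)           # toy weekly precipitation
air_temp = np.ones(weeks)                        # toy weekly air temperature
gwl = np.linspace(30.0, 29.0, weeks)             # toy groundwater levels
precip_smooth = rolling_mean(precip, window=4)   # hypothetical aggregation window
features = np.column_stack([precip, air_temp, precip_smooth])[3:]  # drop NaN warm-up
X, y = make_sequences(features, gwl[3:], lookback=3)
```

For slow-responding wells, `lookback` would be set to cover two to three years of inputs, in line with the results reported below.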

The results indicate that input sequence lengths of two to three years substantially improve model performance for slow-responding groundwater systems, whereas shorter sequences are sufficient for more dynamic systems. Incorporating site-specific aggregated meteorological inputs further enhanced the representation of characteristic response times and led to a considerable increase in predictive skill for slow-responding aquifers. Although some strongly damped systems remain difficult to predict even with optimised model configurations, the overall results demonstrate a clear potential to better capture slow-responding groundwater dynamics and improve predictive performance.

How to cite: Wetzel, M., Kunz, S., Doll, F., Habbel, B., Liesch, T., and Broda, S.: Data-driven modelling of groundwater level time series: challenges posed by contrasting response dynamics, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-9072, https://doi.org/10.5194/egusphere-egu26-9072, 2026.

08:55–09:05 | EGU26-11041 | ECS | On-site presentation
Asmae Ez-zahy, Nicolas Massei, Abderrahim Jardani, Lisa Baulon, Hugo Breuillard, Augustin Thomas, and Sivarama Krishna Reddy Chidepudi

Groundwater levels integrate the combined effects of climate variability, surface–subsurface interactions, and anthropogenic activities. Capturing their temporal dynamics across diverse hydrogeological settings and varying degrees of human influence remains a major scientific challenge, particularly in regions where physical descriptors and anthropogenic forcing data are scarce or uncertain. This study investigates whether a single deep learning model can generalize groundwater level simulations across a large number of hydrogeologically contrasted monitoring stations in metropolitan France. The proposed framework relies on long-term time series of groundwater levels and meteorological forcings (precipitation and temperature), collected from the French groundwater monitoring network and the Météo-France SAFRAN reanalysis. Climate-driven groundwater dynamics are first learned from meteorological inputs only, and the architecture of the deep learning model is subsequently extended to account for anthropogenic influences by incorporating groundwater pumping data where available, despite their sparse and uneven spatial coverage. This strategy enables the integration of human-induced forcing while maintaining consistency with climate-driven groundwater behavior under heterogeneous spatio-temporal water abstraction data availability. The results show the ability of the proposed framework to reproduce temporal groundwater dynamics across a wide range of hydrogeological contexts and degrees of anthropogenic influence. They also highlight the relevance of the approach for developing scenarios of regional-scale groundwater evolution under changes in climate conditions and water uses.

How to cite: Ez-zahy, A., Massei, N., Jardani, A., Baulon, L., Breuillard, H., Thomas, A., and Chidepudi, S. K. R.: A New Global Deep Learning Framework to Generalize Groundwater Simulation Across Hydrogeological Diversity and Anthropogenic Influence, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-11041, https://doi.org/10.5194/egusphere-egu26-11041, 2026.

09:05–09:15
|
EGU26-11009
|
Highlight
|
On-site presentation
Mariana Gomez, Hector Aguilera, Julian Koch, Augustin Thomas, Georgina Arno, Montse Colomer, Peter van der Keur, and Stefan Broda

In order to manage groundwater effectively at the pan-European scale, it is essential to treat groundwater as a shared, transboundary resource and to assess it consistently from in-situ observations. In the course of the Geological Service for Europe (GSEU) project, we compiled groundwater-level records from participating national and regional surveys. These records were subsequently harmonized into the European Groundwater Monitoring (EUGM) database. Groundwater level time series encompass diverse data sampling intervals, record lengths, continuity, and quality, frequently exhibiting non-uniform spatial coverage. The data undergoes a quality assurance process comprising the detection of stagnation periods, outlier screening, and the imputation of missing values. After the quality assurance process, the EUGM release contains 12,797 groundwater-level time series, of which 2,654 are designated as near-real-time (NRT) monitoring points, with expected monthly updates across 11 European countries.
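The three quality-assurance steps named above (stagnation detection, outlier screening, imputation of missing values) can be illustrated with a minimal Python sketch. It is not the EUGM pipeline: the function names, the run length of five readings, the MAD-based score threshold of 3.5, and the maximum gap of three time steps are all assumptions chosen for the example.

```python
from statistics import median


def flag_stagnation(levels, min_run=5, tol=0.0):
    """Mark readings that sit inside a run of >= min_run (near-)identical values."""
    flags = [False] * len(levels)
    start = 0
    for i in range(1, len(levels) + 1):
        run_ends = (
            i == len(levels)
            or levels[i] is None
            or levels[start] is None
            or abs(levels[i] - levels[start]) > tol
        )
        if run_ends:
            if levels[start] is not None and i - start >= min_run:
                for j in range(start, i):
                    flags[j] = True
            start = i
    return flags


def flag_outliers(levels, threshold=3.5):
    """Mark readings whose MAD-based robust score exceeds the threshold."""
    valid = [v for v in levels if v is not None]
    med = median(valid)
    mad = median(abs(v - med) for v in valid)
    if mad == 0:
        return [False] * len(levels)  # no spread to judge against
    return [
        v is not None and abs(0.6745 * (v - med) / mad) > threshold
        for v in levels
    ]


def interpolate_gaps(levels, max_gap=3):
    """Linearly fill interior gaps of at most max_gap missing readings."""
    filled = list(levels)
    n = len(filled)
    i = 0
    while i < n:
        if filled[i] is None:
            j = i
            while j < n and filled[j] is None:
                j += 1
            gap = j - i
            if i > 0 and j < n and gap <= max_gap:
                left, right = filled[i - 1], filled[j]
                for k in range(i, j):
                    frac = (k - i + 1) / (gap + 1)
                    filled[k] = left + frac * (right - left)
            i = j
        else:
            i += 1
    return filled
```

Missing readings are represented as `None`. Leading, trailing, and long gaps are deliberately left unfilled in this sketch, so that imputation never extrapolates.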

Focusing on NRT stations, we plan to develop an operational forecasting system for monthly groundwater levels using a single LSTM model trained at sites with a minimum of 20 years of observations. Meteorological predictors include precipitation, air temperature, relative humidity, and standardized precipitation indices (SPI) from ERA5-Land. The modelling period varies by site according to record length, with the earliest possible start being 1950 to align with ERA5-Land availability. Results are intended for integration into the European Geological Data Infrastructure (EGDI) platform with the objective of enabling Europe-wide access, comparison, and operational use. The EUGM thereby provides a consistent observational base for cross-border assessment, modelling, and forecasting of groundwater dynamics.

How to cite: Gomez, M., Aguilera, H., Koch, J., Thomas, A., Arno, G., Colomer, M., van der Keur, P., and Broda, S.: An Operational Groundwater-Level Forecasting System for Europe, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-11009, https://doi.org/10.5194/egusphere-egu26-11009, 2026.

09:15–09:25
|
EGU26-20010
|
On-site presentation
Sai Jagadeesh Gaddam and Sindhura Chittimireddy

Groundwater assessment in India makes extensive use of publicly available datasets, including long-term monitoring well records, hydrogeological maps, and derived spatial products. However, accessing and interpreting these datasets typically involves multiple software tools, manual data extraction, and specialist expertise in GIS and hydrogeology, limiting their usability for broader user groups.

GWFlowAI is a location-based Artificial Intelligence (AI) framework designed to enable exploratory groundwater analysis through a single user input step: the entry of an address or geographic location. The system performs automated geocoding and spatial buffering to identify relevant administrative units and hydrogeological features. Public groundwater datasets are retrieved and harmonized using standardized coordinate reference systems and metadata-aware preprocessing pipelines.

The framework follows an agent-based architecture, in which specialized Artificial Intelligence agents are responsible for tasks such as data retrieval, geospatial processing, time-series analysis, and result interpretation. Time-series analysis methods are used for groundwater level trend detection, including statistical smoothing and change-point identification, while spatial analysis methods such as interpolation and zonal statistics are applied to characterize regional groundwater conditions. AI agents assist with workflow orchestration, analytical query interpretation, and generation of human-readable summaries, while core numerical and geospatial computations remain explicit and reproducible.

GWFlowAI is intended to support multiple user groups, including researchers, consultants, planners, students, and policy practitioners, enabling consistent access to the same public datasets at varying analytical depths. To the authors’ knowledge, this represents an early effort in India to provide a single-entry, integrated Artificial Intelligence (AI) workflow for groundwater data exploration based entirely on public datasets. The paper presents the system architecture, analytical methods, and representative outputs, and discusses limitations related to data resolution, uncertainty, and spatial coverage.

How to cite: Gaddam, S. J. and Chittimireddy, S.: GWFlowAI: A One-Step, Location-Based Artificial Intelligence (AI) Framework for Exploring Public Groundwater Datasets in India, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-20010, https://doi.org/10.5194/egusphere-egu26-20010, 2026.

09:25–09:35
|
EGU26-9355
|
On-site presentation
Jonas Sundell and Ezra Haaf

The valley of the River Göta Älv in southwest Sweden is highly prone to landslides due to the presence of underlying clay deposits. Landslides occur when driving forces and moments exceed resisting forces and moments, and this balance can be altered by elevated pore-water pressures, which reduce effective stress. Pore pressure varies with climatic drivers (precipitation and evapotranspiration) and with changes in external loading, such as river stage fluctuations.

Here we apply the impulse–response modelling framework Pastas to calibrate, validate, and simulate pore-pressure time series from piezometers installed along the Göta Älv valley. Since 2019, pore pressure has been monitored at multiple locations and depths. Model calibration used daily precipitation, temperature, potential evapotranspiration (PET), and river level time series. In total, 127 pore-pressure series were modelled using multiple combinations of impulse response functions and evapotranspiration formulations. Based on calibration and validation performance, 58 series were deemed suitable for climate-driven simulations. The most common causes of unsatisfactory model performance were (i) failure to reproduce rapid responses, (ii) threshold-like behaviour leading to underestimation of extreme high levels, (iii) short records limiting representation of inter-annual variability, (iv) shifted dynamics from the calibration to the validation period, and (v) potential outliers related to initialization of pressure sensors, measurement errors, or gas intrusion in instruments.

The acceptable models were forced using the CMIP5 ensemble of EURO-CORDEX regional climate model (RCM) simulations for 1970–2100 (daily precipitation and temperature). Extreme high pore-pressure levels were quantified as 100-year return levels using a generalized extreme value (GEV) distribution under RCP8.5. For most series, the projected median change in the 100-year return level by 2100 is <0.1 m relative to 1970–2010, while a small subset shows increases up to ~0.3 m (excluding outliers). Considering the 95th percentile of the projected change in the 100-year return level, most series remain <0.4 m, with a small subset reaching up to ~1.3 m. These point-scale changes in extreme pore pressure may increase landslide susceptibility. The results enable slope-scale landslide probability assessment by upscaling piezometer-scale return levels to a three-dimensional slope geometry.
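For readers unfamiliar with return levels, the following Python sketch evaluates the standard GEV quantile formula for a given return period. It only illustrates the definition: the function name and the parameter values in the usage note are hypothetical, and the fitting of the GEV parameters to simulated annual maxima is not shown.

```python
import math


def gev_return_level(mu, sigma, xi, return_period):
    """Level exceeded on average once per `return_period` blocks (e.g. years).

    mu, sigma, xi are the GEV location, scale (> 0) and shape parameters.
    """
    if sigma <= 0 or return_period <= 1:
        raise ValueError("need sigma > 0 and return_period > 1")
    y = -math.log(1.0 - 1.0 / return_period)
    if abs(xi) < 1e-9:
        # Gumbel limit of the GEV as the shape parameter tends to zero
        return mu - sigma * math.log(y)
    return mu - (sigma / xi) * (1.0 - y ** (-xi))
```

With hypothetical parameters, `gev_return_level(2.1, 0.3, 0.1, 100) - gev_return_level(2.0, 0.3, 0.1, 100)` gives the change in the 100-year level between two periods that differ only in location. Note that `scipy.stats.genextreme` uses the opposite sign convention for the shape parameter (`c = -xi`).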

The presentation will highlight (i) the use of data-driven impulse–response modelling for pore-pressure time series (to our knowledge not previously applied in this context), (ii) key challenges in obtaining robust calibrations and validations, and (iii) scenario-based projections and extreme-value analysis under a changed climate.

How to cite: Sundell, J. and Haaf, E.: Modelling extreme pore-water pressures in clay under climate change for landslide risk assessment: the Göta Älv river valley, Sweden, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-9355, https://doi.org/10.5194/egusphere-egu26-9355, 2026.

09:35–09:45
|
EGU26-15057
|
On-site presentation
Hayat Ghachoui, Abdelhalim Tabit, Ahmed Algouti, and Said Moujane

Groundwater stands as a vital buffer against the growing impacts of climate change, especially in arid and semi-arid regions where surface water is ephemeral and rainfall patterns are becoming increasingly erratic. Understanding how recharge zones respond to climatic variability is crucial for ensuring long-term water security. This study provides a basin-scale assessment of groundwater discharge potential by integrating field measurements, geospatial predictors and supervised machine-learning techniques. A dataset of 239 boreholes with measured discharge (L s⁻¹) from 2015–2025 was compiled to identify high-potential sites. Sixteen conditioning factors representing topography, hydrology, climate, vegetation, land use and structural characteristics were generated from remote-sensing products, DEM-derived indices and thematic datasets. After evaluating multicollinearity through correlation analysis and variance inflation factors, four single classifiers (Random Forest, Extreme Gradient Boosting (XGBoost), Light Gradient Boosting Machine (LightGBM) and k-Nearest Neighbours (KNN)) were developed, along with several hybrid ensembles and a four-model stacking configuration.

All models reveal a consistent spatial pattern. Favourable zones trace a continuous corridor along the main Drâa Valley, from the upstream sectors around Aguim and Taznakht through Ouarzazate and Agdz to the Zagora plains, with frequent extensions across adjacent piedmonts and alluvial surfaces. Around half of the basin is classified as favourable (class 1), underscoring the central geomorphological role of this valley system in concentrating infiltration and sustaining groundwater discharge. Among the single models, LightGBM shows the strongest performance (accuracy = 0.941; ROC_AUC = 0.985; LogLoss = 0.166; Brier score = 0.046). The four-model ensemble achieves an accuracy of 0.943 and an MCC of 0.885, with very low probability errors in independent validation. Elevation, soil moisture, drainage density, precipitation and NDWI are consistently identified as the most influential predictors. Overall, the proposed framework offers a robust decision support tool for guiding drilling, managed aquifer recharge and the protection of key groundwater corridors in one of Morocco’s most water stressed regions.
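The scores quoted above (accuracy, MCC, LogLoss, Brier score) follow standard definitions for binary classifiers. The sketch below computes them from true labels and predicted probabilities in plain Python. The function name, the 0.5 decision threshold, and the clipping constant are assumptions for illustration, and ROC AUC is omitted for brevity.

```python
import math


def classification_scores(y_true, p_pred, threshold=0.5, eps=1e-15):
    """Accuracy, MCC, Brier score and log loss for binary labels (0/1)."""
    tp = tn = fp = fn = 0
    brier = 0.0
    logloss = 0.0
    for y, p in zip(y_true, p_pred):
        pred = 1 if p >= threshold else 0
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 0 and y == 0:
            tn += 1
        elif pred == 1 and y == 0:
            fp += 1
        else:
            fn += 1
        brier += (p - y) ** 2
        p_clip = min(max(p, eps), 1.0 - eps)  # avoid log(0)
        logloss -= y * math.log(p_clip) + (1 - y) * math.log(1.0 - p_clip)
    n = len(y_true)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": (tp + tn) / n,
        "mcc": mcc,
        "brier": brier / n,
        "logloss": logloss / n,
    }
```

Accuracy and MCC depend on the chosen threshold, whereas the Brier score and log loss judge the probabilities directly, which is why the two groups of scores are usually reported together.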

How to cite: Ghachoui, H., Tabit, A., Algouti, A., and Moujane, S.: Mapping groundwater discharge potential from Earth observation data using machine learning: Evidence from an arid basin, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-15057, https://doi.org/10.5194/egusphere-egu26-15057, 2026.

09:45–09:55
|
EGU26-16963
|
ECS
|
On-site presentation
Akhilesh S. Nair, Markus Giese, and Lena M. Tallaksen

Detecting and attributing groundwater droughts requires models that both capture complex temporal memory and provide interpretable representations of drivers. We present a framework that models weekly Standardised Groundwater Index (SGI) at multiple wells in Norway and Sweden using a Long Short-Term Memory (LSTM) network in combination with SHapley Additive exPlanations (SHAP) to attribute drought drivers. The LSTM ingests weekly climate and meteorological predictors (e.g., precipitation, temperature) together with large-scale teleconnection indices (e.g., NAO) to learn nonlinear, lagged responses that govern groundwater anomalies. We configure a relatively deep LSTM network to represent long-range dependencies controlling weekly to seasonal anomalies and apply light regularisation to preserve natural SGI variability and avoid suppression of seasonal peaks. SHAP is used post-hoc to quantify feature importance and the timing and sign of impacts on predicted SGI at both aggregated and event-specific scales. This allows identifying which predictors and lag times drive rapid groundwater decline or recovery, how teleconnection phases modulate drought risk, and the spatial heterogeneity of dominant drivers across wells. The primary objective of the LSTM–SHAP framework is to deliver local, well-specific attribution across the study region, complemented by spatial maps that identify the dominant controlling features (e.g., summer precipitation or winter snow). The results demonstrate that the integrated LSTM–SHAP approach produces accurate weekly SGI estimates for monitoring purposes while providing attribution of drought drivers. This capability supports early warning and enhances understanding of hydroclimatic influences on groundwater droughts.
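As background on the target variable, the SGI is commonly obtained through a non-parametric normal-scores transform applied separately to each period of the year. The sketch below shows one way to do this in Python. The abstract does not state how the weekly SGI was computed, so the function name, the grouping by a period label such as week of year, and the plotting position (rank − 0.5)/n are assumptions.

```python
from statistics import NormalDist


def standardised_groundwater_index(levels, periods):
    """Normal-scores transform of groundwater levels within each period.

    levels:  list of groundwater levels
    periods: matching list of period labels (e.g. week of year, 1..52)
    """
    sgi = [None] * len(levels)
    groups = {}
    for idx, period in enumerate(periods):
        groups.setdefault(period, []).append(idx)
    normal = NormalDist()  # standard normal distribution
    for indices in groups.values():
        ranked = sorted(indices, key=lambda i: levels[i])
        n = len(ranked)
        for rank, i in enumerate(ranked, start=1):
            # Ties keep their original order here, a simplification.
            sgi[i] = normal.inv_cdf((rank - 0.5) / n)
    return sgi
```

Because ranks are taken within each period, the seasonal cycle is removed and a negative SGI indicates a level that is low for that time of year.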

How to cite: Nair, A. S., Giese, M., and Tallaksen, L. M.: Groundwater drought attribution in Norway and Sweden using interpretable LSTM models of the Standardised Groundwater Index, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-16963, https://doi.org/10.5194/egusphere-egu26-16963, 2026.

09:55–10:05
|
EGU26-20586
|
ECS
|
On-site presentation
Manuel Rodríguez del Rosario, Víctor Gómez-Escalonilla, África de la Hera-Portillo, and Pedro Martínez-Santos

Advances in machine learning offer new opportunities to enhance the assessment and regional-scale mapping of groundwater nitrate contamination, a long-standing and widespread environmental problem. This study illustrates this potential in the Duero River Basin (Spain), where nitrate concentrations in aquifers have increased steadily over recent decades due to intensive agricultural and livestock farming. Machine learning techniques are applied to predict groundwater nitrate pollution using monitoring data combined with spatially derived environmental and anthropogenic predictors, framing the problem as a binary classification task based on a threshold concentration of 37.5 mg/L. Several tree-based ensemble algorithms were evaluated, with Random Forest selected due to its superior predictive performance and robustness. Model reliability was ensured through a repeated nested cross-validation strategy, resulting in an ensemble of 50 models and the generation of out-of-fold probability estimates. Model performance was evaluated using metrics tailored to imbalanced datasets and focused on the minority class, including the F1-score and the Area Under the Precision–Recall Curve. A temporal analysis based on different hydrological years was conducted to assess the persistence and spatial variability of nitrate pollution risk over time. Spatial validity and model reliability were further evaluated by comparing predicted risk patterns with officially designated nitrate vulnerable zones (NVZs). This comparison revealed a high degree of agreement, while also identifying areas outside current NVZ boundaries exhibiting similar contamination characteristics, suggesting the presence of potentially unrecognised nitrate pollution risks. Model interpretability was explored using SHAP values, which highlighted precipitation, diffuse agricultural pressures, distance to surface water bodies, NDVI, and soil properties as the most influential predictors of nitrate contamination.
Overall, the results demonstrate the value of interpretable machine learning approaches for improving the assessment, understanding, and management of groundwater nitrate pollution at the basin scale.

How to cite: Rodríguez del Rosario, M., Gómez-Escalonilla, V., de la Hera-Portillo, Á., and Martínez-Santos, P.: Reliable and interpretable machine learning for groundwater nitrate pollution mapping: the Duero River Basin (Spain), EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-20586, https://doi.org/10.5194/egusphere-egu26-20586, 2026.

10:05–10:15
|
EGU26-5889
|
On-site presentation
Denitza Voutchkova, Lars Troldborg, Lærke Thorling, Christian F. Damgaard, and Peter B. Sørensen

Accurate national-scale modelling of surface water concentrations of trace elements requires accounting for both anthropogenic and geogenic inputs. In Denmark, groundwater concentrations of geogenic elements show pronounced spatial variability, making the natural groundwater component a potentially important contributor to the variability of surface water quality. However, quantifying groundwater–surface water interactions and groundwater-derived geogenic element concentrations at large spatial scales remains challenging due to limited data availability, model resolution constraints, and conceptual uncertainty.

This study aims to estimate the potential input of groundwater concentrations of selected geogenic elements (As, Ba, Cd, Cr, Cu, Ni, Pb, and Zn) to surface waters across Denmark. The resulting concentration estimates are intended as inputs to a nationwide surface water model. Therefore, the target spatial unit is the ID15 catchment, the smallest unit used in Danish national water management, representing topographic catchments with an average area of 15 km² (n = 3351).

To enable national-scale application, the complex three-dimensional groundwater–surface water system was simplified into a hierarchical structure with three levels: (1) well-screens, (2) groundwater bodies, and (3) ID15 catchments. Groundwater chemistry observations were available at the well-screen level. Well-screens were assigned to groundwater bodies with hydraulic contact to streams and lakes and thus feeding water into individual ID15 catchments.

We applied a hierarchical mixed-effects modelling framework to estimate typical (latent) concentrations of geogenic elements in groundwater bodies at depths relevant for groundwater–surface water contact. The model quantified the influence of hydrogeochemical and geological factors while accounting for spatial grouping within groundwater bodies and repeated measurements at individual well-screens. Each element was modelled separately, and concentrations were log-transformed prior to analysis.

The expected latent log-concentrations were described using a linear predictor with fixed and random effects. Fixed effects included redox class, pH class, geology, and depth, with interaction terms between redox and pH. Random effects were specified for groundwater bodies and well-screens. Measurements reported below detection limits were treated as left-censored observations and incorporated directly into the likelihood using cumulative log-normal probabilities.
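The treatment of values below the detection limit can be made concrete with a small Python sketch of a left-censored likelihood on the log scale. This is not the authors' RTMB model: the function name is hypothetical, the linear predictor is passed in as precomputed means, and a single residual standard deviation stands in for the full fixed- and random-effects structure.

```python
import math
from statistics import NormalDist


def censored_lognormal_nll(log_values, censored, means, sigma):
    """Negative log-likelihood for log-concentrations with left-censoring.

    log_values: log of the measured concentration, or log of the detection
                limit where the measurement is censored
    censored:   True where the value is reported as below the detection limit
    means:      linear-predictor value (expected log-concentration) per sample
    sigma:      residual standard deviation on the log scale
    """
    nll = 0.0
    for y, is_censored, mean in zip(log_values, censored, means):
        dist = NormalDist(mean, sigma)
        if is_censored:
            # Probability that the true log-concentration lies below the limit
            contribution = dist.cdf(y)
        else:
            # Density on the log scale; the Jacobian term of the log-normal
            # density is constant in the parameters and is left out.
            contribution = dist.pdf(y)
        nll -= math.log(max(contribution, 1e-300))
    return nll
```

A censored sample therefore still informs the fit: a predicted mean far above the detection limit is penalised, without any substitution value such as half the limit being required.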

Model parameter values including standard errors were estimated by minimising the joint negative log-likelihood using the R software package RTMB (R Template Model Builder), which is a high-performance statistical modelling tool. Model selection was based on Akaike’s Information Criterion. The model predicted latent groundwater concentrations at the depth assumed representative for groundwater–surface water contact: 3 m for groundwater bodies and 1 m for shallow near-surface groundwater not part of a groundwater body. Predicted log-concentrations were then aggregated to derive typical groundwater concentration inputs for each ID15 catchment.

We present a hierarchical modelling framework for estimating depth-dependent geogenic element concentrations at the groundwater body and ID15 catchment scales, enabling national-scale integration with surface water models, while interpreting and contextualising key model parameters and discussing limitations and future directions.

How to cite: Voutchkova, D., Troldborg, L., Thorling, L., Damgaard, C. F., and Sørensen, P. B.: National-scale modelling of spatially heterogeneous groundwater concentrations of selected geogenic elements to predict local-scale concentration-inputs to surface waters in Denmark, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-5889, https://doi.org/10.5194/egusphere-egu26-5889, 2026.

Posters on site: Wed, 6 May, 10:45–12:30 | Hall A

The posters scheduled for on-site presentation are only visible in the poster hall in Vienna. If authors uploaded their presentation files, these files are linked from the abstracts below.
Display time: Wed, 6 May, 08:30–12:30
Chairpersons: Carolina Guardiola-Albert, Ezra Haaf, Joel Podgorski
Posters for HS8.2.2: Data-driven and hybrid groundwater modelling: methods, applications, and challenges
A.103
|
EGU26-20548
Hector Aguilera, Víctor Gómez-Escalonilla, Eva García Tricás, Olga García Menéndez, África de la Hera-Portillo, Manuel Rodríguez del Rosario, and Pedro Martínez-Santos

Accurate, high-resolution estimation of groundwater storage changes (GWSC) is critical for sustainable water management, particularly in semi-arid basins facing increasing climatic and anthropogenic pressures. Traditional process-based hydrogeological models often fall short due to computational constraints, coarse resolution, and poor performance in data-scarce regions. This study presents an innovative data-driven surrogate modelling framework that overcomes these limitations by fusing large-scale model outputs with local observations to generate reliable, high-resolution GWSC estimates.

We demonstrate the framework in the Duero Basin as a pilot site, an 80,000 km² basin in central Spain. The methodology involves a multi-step hybrid data conditioning process. First, total water storage (TWS) outputs from the Terrestrial Systems Modelling Platform (TSMP, 11 km) are corrected and downscaled using in-situ groundwater level (GWL) observations via spatiotemporal kriging. This generates a corrected GWSC target variable with an explicit pixel-level uncertainty flag (low, moderate, high). This conditioned dataset then trains a state-of-the-art Spatiotemporal Transformer (STT) deep learning model, designed to capture complex spatiotemporal dependencies. The STT uses 48 months of historical data to forecast GWSC 12 months ahead, incorporating static (e.g., geology, land use, socio-economic) and dynamic (precipitation, potential evapotranspiration, temperature) features. An uncertainty-aware training scheme uses the uncertainty flags both as an input feature and to weight the loss function. A key architectural innovation is the implementation of a "late fusion" concatenation strategy, which enhances spatial awareness. A learned coordinate embedding, generated by a small Multi-Layer Perceptron (MLP) from geographic coordinates, is concatenated to the transformer's final layer outputs before prediction. This allows the model to learn and correct for persistent, location-specific biases (e.g., systematic differences between southeastern and northwestern aquifer dynamics) without disrupting the core temporal attention mechanisms, thereby stabilizing training and improving regional accuracy. The STT’s performance is benchmarked against an XGBoost model and combined into an optimal linear ensemble.
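The idea of weighting the loss by the uncertainty flag can be sketched in a few lines of Python. The weight values and the function name are hypothetical, since the abstract does not give the weighting scheme, and a weighted mean squared error stands in for whatever loss the STT is actually trained with.

```python
# Hypothetical weights: pixels with lower target uncertainty count more.
FLAG_WEIGHTS = {"low": 1.0, "moderate": 0.5, "high": 0.25}


def uncertainty_weighted_mse(y_true, y_pred, flags, weights=FLAG_WEIGHTS):
    """Mean squared error in which each sample is weighted by its flag."""
    weighted_sum = 0.0
    weight_total = 0.0
    for target, prediction, flag in zip(y_true, y_pred, flags):
        w = weights[flag]
        weighted_sum += w * (target - prediction) ** 2
        weight_total += w
    return weighted_sum / weight_total
```

Normalising by the sum of weights keeps the loss on the same scale as an unweighted MSE, so learning-rate settings remain comparable between the two.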

Results show the ensemble model achieves robust performance, with training, validation and test R² values of 0.82, 0.46 and 0.44, respectively, outperforming individual models. Spatial analysis reveals that predictive skill is highest in areas where data conditioning yielded low uncertainty. Feature importance analysis ranks precipitation, evapotranspiration, and water demands as the most influential predictors. The framework successfully generates spatially explicit maps of GWSC and associated uncertainty across the basin.

This study concludes that integrating process-model outputs with local observations through geostatistical conditioning provides a viable pathway for creating reliable training data for deep learning surrogates. The proposed STT-based framework offers a scalable, computationally efficient alternative to traditional models for operational groundwater monitoring and forecasting. Its modular design ensures transferability to other basins, marking a significant step towards improving groundwater resource management in data-scarce and hydrogeologically complex regions worldwide.

How to cite: Aguilera, H., Gómez-Escalonilla, V., García Tricás, E., García Menéndez, O., de la Hera-Portillo, Á., Rodríguez del Rosario, M., and Martínez-Santos, P.: A Deep Learning surrogate for groundwater storage change prediction at regional scale (Duero River Basin, Spain), EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-20548, https://doi.org/10.5194/egusphere-egu26-20548, 2026.

A.104
|
EGU26-18571
|
Virtual presentation
Leticia Baena-Ruiz, David Pulido-Velazquez, Antonio-Juan Collados-Lara, Juan de Dios Gómez-Gómez, Héctor Aguilera, Miguel Mejías, and Juan Grima

Groundwater resources are essential to ensure future water security, especially in semi-arid areas such as the Mediterranean region. Groundwater level forecasting allows the availability of the resource to be predicted under different scenarios (including potential future climate change scenarios), although sometimes there is not enough monitoring data to develop distributed models. Approaches such as lumped models and/or artificial intelligence algorithms have been shown to provide satisfactory results using a reduced amount of data.

In this work, we analyse the impact of several sources of uncertainty on local groundwater level forecasts generated with lumped models and artificial neural networks (ANN). Climate uncertainty is constrained for specific warming scenarios by removing the projections from inferior models, identified through multi-criteria analyses of their ability to reproduce historical climate statistics. A stochastic weather generator was used to generate multiple series of exogenous variables, which allows a stochastic forecast to be performed. The structural uncertainty related to the propagation of the hydrological impact of the ensemble of climatic series is analysed by simulating with different lumped and ANN models.

The lumped models were calibrated through an automatic procedure. We also applied a sensitivity analysis in order to adjust the range of some hydrogeological parameters. Multiple configurations of ANN (approaches, number of neurons and delays) and exogenous variables were tested to select the best experiments by considering the mean value of MSE.

We analyse the climatic and structural uncertainty for short-term forecasting using both modelling approaches. We also analyse the long-term uncertainty by simulating with lumped models. The generation of stochastic predictions will be explored by applying the Monte Carlo method to simulations from multiple selected models with good performance indicators.

The methodology was applied to the Campo de Montiel aquifer in central Spain, an area where groundwater and surface water are closely interconnected. The area hosts the Lagunas de Ruidera wetland, a recognized Natural Park and Ramsar site, but is also subject to intensive groundwater extraction due to agricultural demand. This aquifer is essential as a strategic water reserve during drought periods in a semi-arid climatic context.

The results have also been compared with those obtained with MODFLOW, showing the differences between distributed and lumped approaches (sensitivity of the results to the spatial resolution of the methods).

Funding: This research was partially funded by the project SIGLO-PRO (PID2021-128021OB-I00/AEI/ https://doi.org/10.13039/501100011033/FEDER,UE), from the Spanish Ministry of Science, Innovation and Universities.

How to cite: Baena-Ruiz, L., Pulido-Velazquez, D., Collados-Lara, A.-J., Gómez-Gómez, J. D. D., Aguilera, H., Mejías, M., and Grima, J.: Short-term and long-term uncertainty analysis in groundwater level forecasting, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-18571, https://doi.org/10.5194/egusphere-egu26-18571, 2026.

A.105
|
EGU26-522
|
ECS
Sharon Lee Vizcarra Mondragon, Anna Jurado Elices, Estanislao Pujades Garnes, and Nafiseh Salehi Siavashani

Understanding the interactions between climate variability and groundwater quality remains a major challenge in arid and semi-arid regions, where aquifers are increasingly affected by altered recharge regimes and more frequent droughts. The aims of this study are to (i) investigate the linkages between climate and groundwater quality and (ii) propose a data-driven machine learning framework to predict hydrochemical parameters in a pilot catchment of the Upper Guadiana Basin (Spain).

Daily climatic variables (maximum and minimum temperature and precipitation) from the Spanish Meteorological Agency were compiled together with hydrochemical data collected between 2001 and 2021, including electrical conductivity, pH, dissolved oxygen, and major ions (HCO₃⁻, Cl⁻, NO₃⁻, SO₄²⁻, Na⁺, Mg²⁺, Ca²⁺) measured at ten groundwater sampling points operated by the Guadiana River Basin Authority.

The proposed methodology analyzes the variability, correlations, and long-term behaviour of the hydrochemical dataset in order to identify which parameters respond most clearly to climate conditions. Subsequently, a climate-driven predictive component is constructed using multivariate regression models based on ensemble methods, such as Random Forest. The climate predictors obtained from this step allow the estimation of each hydrochemical variable. This workflow allows both datasets to be integrated in a coherent way despite their different temporal resolutions, while keeping the influence of climate on groundwater quality interpretable.

The resulting data-driven framework will support the prediction of groundwater quality parameters and the assessment of aquifer sensitivity under contrasting climate scenarios. Beyond its local application, the methodology offers a transferable and efficient approach for groundwater management in regions facing increasing climate stress and contributes to climate change impact assessments and practical decision-support tools.

Keywords: groundwater quality, climate variability, machine learning, upper Guadiana basin.

How to cite: Vizcarra Mondragon, S. L., Jurado Elices, A., Pujades Garnes, E., and Salehi Siavashani, N.: Machine Learning framework for groundwater quality prediction in the upper Guadiana basin under climate variability, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-522, https://doi.org/10.5194/egusphere-egu26-522, 2026.

A.106
|
EGU26-1806
|
ECS
Waqas Ahmed, Ahsan Qasam Khan, and Wolfgang Nowak

High-fidelity groundwater (GW) models are powerful tools for simulating complex subsurface processes and predicting groundwater levels with high accuracy—provided that high-quality input data is available. However, in many real-world applications, such high-quality data is rare. Input data are often noisy, sparse, and lack spatial resolution, which compromises the predictive power of these models. This presents a fundamental challenge: while the high-fidelity model is available, its application is limited by the low quality of the data typically encountered in operational settings. While physics-based simulations can help overcome the issue of data scarcity by generating synthetic training datasets, they do not address the issue of poor data quality—particularly the lack of spatial resolution in key inputs such as hydraulic conductivity (K), net recharge (R = N–ET), and pumping rates. These inputs should ideally be spatially distributed, but in practice, they are often poorly resolved or only available as point measurements. This raises a critical question: Should we deliberately degrade high-quality synthetic data during training to match the expected quality of application data? Or can we develop a surrogate model that is inherently robust to the data quality gap?

We propose the latter: a novel approach that trains a deep learning model to be aware of and compensate for residuals that occur due to a lack of input fidelity. The presented method tightly integrates the U-Net deep learning architecture, the physics-based MODFLOW model, and Gaussian process regression models into a hybrid training and prediction pipeline for building a residual-aware surrogate model. We tested this modelling approach on a study area in Germany, where we generated multi-fidelity training datasets with the MODFLOW-2005 simulator by varying the fidelity of the input permeability field. The presented hybrid approach is suitable for surrogating models where multi-fidelity models are available, but inference is only required for low-fidelity inputs.

How to cite: Ahmed, W., Khan, A. Q., and Nowak, W.: Making surrogates robust against model misspecification: A residual-aware combination of Gaussian processes and U-Net architectures, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-1806, https://doi.org/10.5194/egusphere-egu26-1806, 2026.

A.107
|
EGU26-2803
Sheng-Wei Wang and Wen-Chi Chen

Accurate prediction of groundwater level variations remains challenging in intensively exploited aquifers, particularly where recharge processes operate at daily scales while pumping activities are recorded at coarser temporal resolutions. Conventional data-driven models often struggle to reconcile these mismatched time scales and provide limited physical interpretability. This study aims to develop an interpretable, multi-scale machine learning framework that explicitly separates recharge-driven dynamics from pumping-induced impacts, thereby facilitating both predictive performance and enhanced hydrological insight. A two-stage, multi-scale modeling framework is proposed for a catchment-scale groundwater monitoring network consisting of nine monitoring wells. Daily groundwater levels and rainfall data were used alongside monthly electricity consumption records from surrounding pumping wells, disaggregated by pumping purpose. In Stage A, a monthly-scale model was constructed to capture long-term groundwater trends driven by aggregated rainfall and pumping intensity. Monthly groundwater levels were modeled using gradient boosting with rainfall sums, purpose-specific pumping electricity consumption, and optional autoregressive terms. Out-of-fold (OOF) predictions were generated using a five-fold time-series cross-validation scheme, and monthly predictions were subsequently upsampled to daily resolution. In Stage B, daily-scale residuals were defined as the difference between observed groundwater levels and Stage A monthly predictions. A residual learning model was then developed to represent short-term recharge responses using daily autoregressive information (with a 7-day lag), cumulative rainfall indices (7- and 14-day sums), and antecedent dry-day counts. To enhance robustness against extreme fluctuations, a pseudo-Huber loss function was adopted within an XGBoost regression framework. 
A small nested time-series grid search was employed to tune key hyperparameters, thereby balancing model stability and the risk of overfitting. Model performance was evaluated using OOF predictions across all wells. Interpretability was assessed through SHAP value analysis, rainfall event-aligned composite response analysis, and lag-to-peak diagnostics. Additional scenario-based comparisons were conducted to contrast observed responses, no-pumping counterfactual predictions, and simulations that included pumping. The proposed multi-scale framework achieved stable and physically consistent groundwater level predictions across the monitoring network. Stage B residual modeling substantially improved daily-scale performance relative to autoregressive-only baselines, particularly during recharge events. SHAP analysis confirmed that short-term rainfall accumulation and antecedent wetness were the dominant drivers of residual groundwater responses, while autoregressive terms captured local memory effects. Event-aligned composite analyses revealed heterogeneous lag-to-peak responses among wells, reflecting spatial variability in hydrogeological connectivity and the influence of pumping. While incorporating pumping information improved monthly trend representation in Stage A, scenario comparisons indicated that pumping effects on event-scale dynamics were well-separated from recharge-driven responses. The pseudo-Huber loss function provided marginal but consistent gains in robustness, particularly for wells exhibiting heavy-tailed residual behavior, without compromising interpretability. This study demonstrates that a multi-scale, residual-based machine learning framework can effectively reconcile disparate temporal resolutions in groundwater datasets while preserving hydrological interpretability. 
By explicitly decoupling long-term pumping impacts from short-term recharge dynamics, the proposed approach provides a transparent and extensible foundation for groundwater management applications. The framework is well-suited for exploratory analysis and international knowledge exchange, with future work focusing on refined representations of pumping and extended scenario-based assessments.
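
As a rough illustration of the Stage B predictors and loss described above, the following sketch computes trailing rainfall sums, an antecedent dry-day count, and the pseudo-Huber loss with NumPy. The function names, the 0 mm dry-day threshold, and delta = 1.0 are assumptions made for this example; the abstract does not state them.

```python
import numpy as np

def trailing_sum(x, window):
    """Sum of the current and previous (window - 1) values (shorter at the start)."""
    x = np.asarray(x, dtype=float)
    c = np.concatenate(([0.0], np.cumsum(x)))
    idx = np.arange(1, len(x) + 1)
    start = np.maximum(idx - window, 0)
    return c[idx] - c[start]

def antecedent_dry_days(rain, threshold=0.0):
    """Number of consecutive dry days immediately before each day.
    The threshold (mm/day) is an assumed definition of a dry day."""
    rain = np.asarray(rain, dtype=float)
    out = np.zeros(len(rain), dtype=int)
    run = 0
    for i, r in enumerate(rain):
        out[i] = run
        run = run + 1 if r <= threshold else 0
    return out

def pseudo_huber(residual, delta=1.0):
    """Pseudo-Huber loss: quadratic near zero, close to linear for large residuals."""
    residual = np.asarray(residual, dtype=float)
    return delta**2 * (np.sqrt(1.0 + (residual / delta) ** 2) - 1.0)

# Illustrative daily rainfall series (mm/day), not data from the study.
rain = np.array([0, 0, 5, 0, 0, 0, 12, 3, 0, 0], dtype=float)
p7 = trailing_sum(rain, 7)
p14 = trailing_sum(rain, 14)
dry = antecedent_dry_days(rain)
```

In XGBoost, a pseudo-Huber objective is available as `reg:pseudohubererror`; whether the authors used that built-in objective or a custom one is not stated.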

How to cite: Wang, S.-W. and Chen, W.-C.: A multi-scale machine learning framework for interpretable groundwater level prediction, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-2803, https://doi.org/10.5194/egusphere-egu26-2803, 2026.

A.108
|
EGU26-6184
Gyu Hyun Han, Kyoung-Ho Kim, and Sung-Wook Jeen

Seawater intrusion is a groundwater salinization process caused by seawater influx into coastal aquifers, and its severity has increased due to sea-level rise associated with climate change and excessive groundwater extraction. Previous studies on seawater intrusion detection have primarily relied on chemical indicators or fixed threshold values; however, these approaches have limitations in accounting for interactions among multivariate water quality data and variations in hydrogeochemical characteristics. This study aimed to develop a model for detecting seawater intrusion-affected samples using machine learning and deep learning techniques. In this study, data from the National Groundwater Monitoring Network in Korea were collected to define the background characteristics of domestic freshwater groundwater. The dataset consisted of 16 variables, including 13 original water quality parameters (electrical conductivity (EC), Na, Mg, K, Ca, Cl, SO4, HCO3, pH, Fe, Mn, NO3 and dissolved oxygen (DO)) and 3 derived variables reflecting the geochemical characteristics of seawater intrusion (Na/Cl ratio, Ca/Mg ratio, and Base Exchange Index; BEX). These data were used to train a Variational Autoencoder (VAE), a deep learning-based generative model, which compressed the data into a 4-dimensional latent space. To quantify the degree of differentiation from freshwater according to seawater mixing ratios, synthetic data were generated by coupling PHREEQC with R to incorporate key geochemical reaction mechanisms associated with seawater intrusion, including cation exchange reactions during seawater-freshwater mixing. Anomaly detection techniques were then applied to evaluate detection performance. The results demonstrated that samples could be distinguished from the freshwater distribution even at low seawater mixing ratios, suggesting the potential for determining minimum detectable contamination levels for seawater intrusion monitoring. 
This study presents a novel approach for seawater intrusion detection based on machine learning and deep learning, and is expected to contribute to early detection of seawater intrusion and sustainable management of coastal groundwater resources.
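
The three derived variables can be sketched as follows. This is a minimal example under stated assumptions: concentrations are converted from mg/L to meq/L, the ratios are taken on an equivalent basis, and BEX follows the commonly cited Stuyfzand form (Na + K + Mg − 1.0716·Cl, in meq/L). The abstract does not specify the units or the exact BEX definition used, and the seawater composition below is an approximate textbook value, not data from the study.

```python
# Molar masses (g/mol) and ionic charges for converting mg/L to meq/L.
MOLAR_MASS = {"Na": 22.990, "K": 39.098, "Mg": 24.305, "Ca": 40.078, "Cl": 35.453}
CHARGE = {"Na": 1, "K": 1, "Mg": 2, "Ca": 2, "Cl": 1}

def to_meq(sample_mg_l):
    """Convert a dict of concentrations in mg/L to meq/L."""
    return {ion: sample_mg_l[ion] * CHARGE[ion] / MOLAR_MASS[ion] for ion in MOLAR_MASS}

def derived_indicators(sample_mg_l):
    """Na/Cl and Ca/Mg ratios plus BEX, all on a meq/L basis (assumed convention)."""
    meq = to_meq(sample_mg_l)
    return {
        "Na/Cl": meq["Na"] / meq["Cl"],
        "Ca/Mg": meq["Ca"] / meq["Mg"],
        # Assumed BEX definition: close to zero for water with seawater ion ratios.
        "BEX": meq["Na"] + meq["K"] + meq["Mg"] - 1.0716 * meq["Cl"],
    }

# Approximate seawater composition in mg/L (illustrative only).
seawater = {"Na": 10770.0, "K": 399.0, "Mg": 1290.0, "Ca": 412.0, "Cl": 19350.0}
ind = derived_indicators(seawater)
```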

How to cite: Han, G. H., Kim, K.-H., and Jeen, S.-W.: Deep Learning-Based Detection of Seawater Intrusion Using Multivariate Hydrogeochemical Data, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-6184, https://doi.org/10.5194/egusphere-egu26-6184, 2026.

A.109
|
EGU26-6926
|
ECS
Niladri Chowdhury, Patrick Morrissey, and Laurence Gill

The research was carried out at the Manorhamilton karst spring in County Leitrim, northwest Ireland, a representative site of the region’s Carboniferous limestone terrain, notable for its complex subsurface flow networks and rapid hydrological dynamics. The study sought to simulate spring discharge using five years of hydrological observations and to evaluate ten distinct modeling methods. These included a physically based numerical pipe network model built in InfoWorks ICM 2025.1; three neural network (NN) models, namely Long Short-Term Memory (LSTM), Gated Recurrent Unit (GRU), and Nonlinear Autoregressive with Exogenous Inputs (NARX); and six hybrid numerical–NN configurations. Two hybridization strategies were tested: Residual Error Correction (REC) and Sequential Combination (SC). Results revealed that all NN models surpassed the standalone numerical model in reproducing karst spring discharge time series. Among the hybrids, the LSTM+PN+SC model achieved the highest accuracy, stability, and generalization across various flow regimes. These outcomes underscore the advantages of integrating physical process knowledge with deep learning approaches for modelling intricate karst hydrological systems. The study also outlines the strengths and limitations of applying different NN architectures and hybrid methods for groundwater management and prediction in comparable Irish karst settings.

How to cite: Chowdhury, N., Morrissey, P., and Gill, L.: A comparative analysis of numerical, neural network, and hybrid modelling techniques for simulating karst spring discharge based on long-term hydrological records, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-6926, https://doi.org/10.5194/egusphere-egu26-6926, 2026.

A.110
|
EGU26-7056
|
ECS
Qidong Fang, Mostaquimur Rahman, Thorsten Wagener, and Francesca Pianosi

In hydrology, deep learning (DL) models have already achieved remarkable breakthroughs in predicting streamflow. These DL models are fed with meteorological time series and static catchment attributes across large samples of catchments, and predict streamflow remarkably well in both gauged and ungauged situations. In recent years, some studies have transferred the idea of constructing multi-basin/station DL models, particularly Long Short-Term Memory (LSTM) neural networks, to large-sample groundwater level modelling to explore their potential for temporal and spatial extrapolation. To the best of the authors’ knowledge, existing multi-station LSTM applications are limited to three, covering 76 climate-sensitive stations in Northern France, 108 nationwide stations in Germany, and 1,800 coastal stations across nine countries/regions. Notably, spatial generalisation was investigated solely in the German study, which suggests that the model utilised static features primarily as ‘unique identifiers’ to memorise local behaviour rather than deriving the generalisable hydrological insights required for spatial extrapolation. Given the limited number of studies and the potentially biased datasets, the generalisation ability of multi-station DL models for groundwater level modelling is still under exploration.

A newly released large-sample groundwater dataset by the Environment Agency of England, comprising more than 200,000 daily and 200 million sub-daily sampling observations for over 3,400 wells, offers a unique opportunity to test the generalisation ability of multi-station DL models in time and space, and whether these models can yield process-relevant insights on groundwater dynamic mechanisms. In this study, we want to investigate the following questions:

  • 1) How well can multi-station DL models simulate the groundwater variability across England?
  • 2) Which input features does the DL model use to make its predictions (especially in places where it does well)?

How to cite: Fang, Q., Rahman, M., Wagener, T., and Pianosi, F.: Exploring the Generalisation Ability of Deep Learning Models for Large-Sample Groundwater Level Predictions across Space and Time, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-7056, https://doi.org/10.5194/egusphere-egu26-7056, 2026.

A.111
|
EGU26-9218
|
ECS
Elnaz Bayat Khajeh, Saeed Samadianfard, and Orhan Gündüz

Accurate estimation of groundwater levels (GWLs) is important for the sustainable management of water resources, especially in arid or semi-arid regions such as the Tabriz plain aquifer in northwestern Iran, where demand for freshwater is increasing and climate variability puts additional stress on the aquifer. This study therefore introduces a novel framework that combines time-series modeling with deep learning and clustering to improve GWL estimation in heterogeneous aquifers.

All monitoring wells within the aquifer were classified into five clusters using the k-means method to deal with spatial heterogeneity. The basis for clustering included two standards: (i) characteristics of groundwater behavior and (ii) characteristics of hydro-environmental variables associated with each cluster. The results of the clustering were used to develop models of GWLs for each cluster, thereby minimizing spatial variability and increasing the predictive capability of the models. The initial base model is the Long Short-Term Memory (LSTM) model, which is combined with a Double Moving Average (DMA) technique to improve model performance. Therefore, a DMA-LSTM hybrid model is developed to combine temporal smoothing with deep learning methodology.
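
For readers unfamiliar with the Double Moving Average, the sketch below shows the smoothing step on its own: a moving average of a moving average, from which a level and trend estimate can be formed. How the authors couple DMA to the LSTM is not described in the abstract, so the window length, function names, and the level/trend formulas (the classical double-moving-average form) are assumptions for illustration only.

```python
import numpy as np

def moving_average(x, window):
    """Trailing moving average; returns len(x) - window + 1 values."""
    kernel = np.ones(window) / window
    return np.convolve(np.asarray(x, dtype=float), kernel, mode="valid")

def double_moving_average(x, window):
    """Return the first and second moving averages aligned on the same time steps."""
    ma1 = moving_average(x, window)
    ma2 = moving_average(ma1, window)
    # ma2[k] ends at the same time step as ma1[k + window - 1].
    return ma1[window - 1:], ma2

# Illustrative series with a pure linear trend (not data from the study).
window = 3
x = np.arange(20, dtype=float)
ma1, ma2 = double_moving_average(x, window)
level = 2.0 * ma1 - ma2                         # lag-corrected level estimate
trend = 2.0 / (window - 1) * (ma1 - ma2)        # per-step trend estimate
```

For a purely linear series the level estimate reproduces the current value and the trend estimate equals the slope, which is the property that makes the double average useful for trending series such as declining groundwater levels.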

Model inputs include precipitation, temperature, normalized difference vegetation index, groundwater extraction, and lagged GWL data for the 2011-2024 period. Combining climatic, vegetation, anthropogenic, and lagged groundwater data enabled the model to represent natural aquifer recharge processes and the impacts of human activities simultaneously. Monthly GWLs were the target output for all five clusters.

The model evaluation across all clusters indicates that both the LSTM and DMA-LSTM models predict GWLs with high accuracy, as indicated by coefficient of determination (R²) values greater than 0.97 for Clusters 1 and 2. Combining DMA with LSTM improved prediction accuracy, with a smaller Root Mean Square Error (RMSE) for all clusters; for example, RMSE decreased from 0.0396 to 0.0303 in Cluster 1 and from 0.0988 to 0.0659 in Cluster 5. The clusters with higher variability (Clusters 3 and 5) showed RMSE reductions greater than 30% (from 0.1249 to 0.0853 in Cluster 3 and from 0.0988 to 0.0659 in Cluster 5), indicating the advantage of combining DMA with deep learning for GWL prediction in more variable clusters.
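
The relative reductions implied by the reported RMSE pairs can be checked with a few lines of arithmetic. The numbers are taken from the paragraph above; the helper name is chosen for this example.

```python
# (LSTM RMSE, DMA-LSTM RMSE) per cluster, as reported in the abstract.
rmse = {1: (0.0396, 0.0303), 3: (0.1249, 0.0853), 5: (0.0988, 0.0659)}

def relative_reduction(before, after):
    """Percentage reduction of `after` relative to `before`."""
    return 100.0 * (before - after) / before

reductions = {cluster: relative_reduction(*pair) for cluster, pair in rmse.items()}

for cluster, value in reductions.items():
    print(f"Cluster {cluster}: {value:.1f}% lower RMSE")
```

Clusters 3 and 5 come out above 30%, consistent with the statement in the text, while Cluster 1 shows a smaller relative gain.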

Accordingly, the results revealed that combining LSTM and DMA improves the predictive performance of the LSTM while preserving the capabilities of deep learning models. The developed model is a strong and efficient method for groundwater monitoring and management, as it can be applied to regions experiencing similar climatic, hydrological, and anthropogenic pressures.

Keywords: Groundwater level prediction, Hybrid deep learning, LSTM, DMA, K-means clustering

How to cite: Bayat Khajeh, E., Samadianfard, S., and Gündüz, O.: Improving Groundwater Level Prediction Using a Cluster-Based Hybrid LSTM Approach, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-9218, https://doi.org/10.5194/egusphere-egu26-9218, 2026.

A.112
|
EGU26-9876
|
ECS
Yan Zhu, Zhi Dou, Chaoqi Wang, Meng Chen, Yun Yang, and Jinguo Wang

Groundwater contamination source identification (GCSI) is critical for water resources management but depends on the accurate characterization of aquifer parameters, especially hydraulic conductivity (K). A novel multimodal direct forward machine learning (MDFML) model was developed to simultaneously predict GCSI parameters and reconstruct K-fields. This model utilizes constrained residual fusion to integrate temporal concentration and spatial head data, and improve complementarity. Tested on synthetic Gaussian and non-Gaussian aquifers, MDFML consistently outperformed single-modal models. In Gaussian fields, MDFML improved source parameter prediction by 2.20% (R²) and K-field reconstruction by 7.50% (SSIM, structural similarity index) compared to single-modal baselines. In non-Gaussian fields, structured dispersion patterns achieved higher K-field reconstruction (SSIM=0.951, +6.70% vs. 0.892 for Gaussian), but nonlinearity lowered source prediction accuracy (R²=0.900, -2.75% vs. 0.925 for Gaussian). These results demonstrate the robustness and reliability of MDFML under complex hydrogeological conditions and provide an efficient solution for accurate GCSI and sustainable groundwater remediation.

How to cite: Zhu, Y., Dou, Z., Wang, C., Chen, M., Yang, Y., and Wang, J.: Simultaneous Identification of a Contamination Source and Hydraulic Conductivity Based on a Multimodal Direct Forward Machine Learning Model, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-9876, https://doi.org/10.5194/egusphere-egu26-9876, 2026.

A.113
|
EGU26-9892
|
ECS
Sami Miaari, Daniel Klotz, and Stefan Kollet

Global change poses significant challenges to water resource management. Water Table Depth (WTD) is a critical variable linking subsurface dynamics with land surface processes. Local observations support WTD monitoring; however, observations are sparse and unevenly distributed. Transfer learning applications are a solution for WTD monitoring; however, prediction quality assessment is challenging in locations without observations. Consequently, a generalized, scalable, and transferable WTD monitoring framework with a prediction quality assessment tool is essential for such locations.

This contribution explores a method to estimate prediction quality for transfer learning applications from observed grid cells to target locations without observations. Specifically, we implement an ensemble approach with 100 Long Short-Term Memory (LSTM) networks to predict WTD. By leveraging the ensemble spread, we develop spread-skill relationships (which measure the ability of the ensemble uncertainty to predict accuracy) to assess prediction quality in target locations.

We use the Terrestrial Systems Modelling Platform Ground to Atmosphere (TSMP-G2A) dataset from 2001 to 2020, and spatially split it into training and transfer sets. Each ensemble LSTM member was trained on a spatial subset of the training set from 2001 to 2015, randomly sampled based on geographic location. We also evaluated the local prediction performance on the training grid cells over the testing period from 2016 to 2020. The transfer learning performance was assessed on the transfer set, with the same testing period but different locations from the training set. The spread-skill relationships were explored between ensemble spread and performance metrics on the transfer set.

Results indicate good generalization and transfer abilities. Additionally, expanding the spatial size of each ensemble member’s training subset from 100 to 400 grid cells leads to a global improvement in transfer prediction accuracy by 11.52% and 17.42% in RMSE and Pearson correlation, respectively. The spread-skill relationships show a strong correlation between ensemble spread (ensemble variance and interquartile range) and both RMSE and absolute mean bias, indicating that prediction quality can potentially be estimated for certain performance metrics without the need for observations. In contrast, the ensemble spread exhibits a weak relationship with Pearson correlation.
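
A spread-skill check of this kind can be sketched with synthetic data. The example below is not the TSMP-G2A workflow: the array shapes, the noise model, and the function name are assumptions, and the synthetic ensemble is constructed so that spread and error are related by design. It only shows how per-cell spread (variance, interquartile range) and per-cell RMSE of the ensemble mean can be computed and correlated.

```python
import numpy as np

def spread_skill(ensemble, obs):
    """ensemble: (members, cells, times); obs: (cells, times).
    Returns per-cell RMSE of the ensemble mean, mean variance, and mean IQR."""
    mean_pred = ensemble.mean(axis=0)
    rmse = np.sqrt(((mean_pred - obs) ** 2).mean(axis=1))
    variance = ensemble.var(axis=0, ddof=1).mean(axis=1)
    q75, q25 = np.percentile(ensemble, [75, 25], axis=0)
    iqr = (q75 - q25).mean(axis=1)
    return rmse, variance, iqr

# Synthetic setup: each cell has its own (hidden) uncertainty level sigma.
rng = np.random.default_rng(0)
n_members, n_cells, n_times = 100, 200, 50
sigma = rng.uniform(0.1, 1.0, size=n_cells)
truth = rng.normal(size=(n_cells, n_times))
ensemble = truth + sigma[None, :, None] * rng.normal(size=(n_members, n_cells, n_times))
obs = truth + sigma[:, None] * rng.normal(size=(n_cells, n_times))

rmse, variance, iqr = spread_skill(ensemble, obs)
r_var = np.corrcoef(variance, rmse)[0, 1]   # spread-skill correlation (variance)
r_iqr = np.corrcoef(iqr, rmse)[0, 1]        # spread-skill correlation (IQR)
```

Where such a correlation is strong on cells with observations, the spread alone can serve as a proxy for expected error at cells without them, which is the idea the abstract describes.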

The study highlights the potential of transfer learning to improve hydrological modeling and provides an assessment tool for prediction quality, particularly in regions lacking observations. These findings demonstrate the feasibility of scalable data-driven groundwater prediction and suggest that future research could extend this framework to evaluate its transferability and performance at the global scale.

How to cite: Miaari, S., Klotz, D., and Kollet, S.: Application of Ensemble LSTM Transfer Learning for Water Table Depth Prediction and Uncertainty Assessment in Data-Scarce Regions, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-9892, https://doi.org/10.5194/egusphere-egu26-9892, 2026.

A.114
|
EGU26-10493
|
ECS
Simon Paasch, Timo Houben, Thomas Ohnemus, and Hannes Mollenhauer

As climate change alters precipitation patterns, regional water governance and the agricultural sector face a critical challenge: transitioning from reactive crisis management to proactive water allocation. In recent years, regional authorities in Germany have increasingly been forced to issue water extraction bans to protect groundwater and surface water resources. However, such restrictions rely on water budget assessments derived from point-scale water level observations. While large-scale integrated hydrological models provide essential insights into long-term trends, their coarse spatial resolution often fails to accurately predict point-scale groundwater levels, creating a resolution gap for local decision-makers who require site-specific information for regulatory and operational purposes.

We present a proof of concept for an operational groundwater level forecasting framework designed to bridge this gap between large-scale modeling and local application. This approach focuses on the integration of existing, openly available data, combining historical local groundwater observations with large-scale recharge data from the integrated ParFlow hydrologic model. By applying a hybrid methodology that couples Fourier-based time-series analysis with a simplified 2D groundwater table model, we demonstrate how large-scale model outputs can be downscaled into point-scale information.
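
The abstract does not detail the Fourier-based analysis. One common and simple variant is harmonic regression, in which a mean level plus sine and cosine terms at fixed periods are fitted to a gauge record by least squares. The sketch below assumes a single annual period and uses a synthetic series; the function name and all values are illustrative and not taken from the study.

```python
import numpy as np

def fit_harmonics(t_days, y, periods=(365.25,)):
    """Least-squares fit of a constant plus cos/sin pairs at the given periods (days)."""
    cols = [np.ones_like(t_days, dtype=float)]
    for period in periods:
        omega = 2.0 * np.pi / period
        cols += [np.cos(omega * t_days), np.sin(omega * t_days)]
    design = np.column_stack(cols)
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef, design @ coef

# Synthetic daily groundwater level (m) with an annual cycle and noise.
t = np.arange(0, 5 * 365, dtype=float)
rng = np.random.default_rng(1)
y = 100.0 + 0.8 * np.sin(2.0 * np.pi * t / 365.25) + 0.05 * rng.normal(size=t.size)

coef, fitted = fit_harmonics(t, y)
amplitude = np.hypot(coef[1], coef[2])   # amplitude of the annual component
```

The fitted seasonal component can then be extrapolated forward in time, with the remaining signal left to a process model or to the recharge forcing.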

Currently, the operational pipeline has been implemented for selected groundwater gauges in Saxony, featuring automated data ingestion and processing. We showcase the potential of this framework to provide agricultural stakeholders and water authorities with the lead time needed for informed management decisions. Future developments will focus on expanding the gauge network, implementing a GIS-based interface for spatial visualization, and potentially integrating thresholds for groundwater extraction bans to increase regulatory predictability. By utilizing established scientific methods and data, this work provides a blueprint for transferring hydrological outputs into actionable information for stakeholders in regional water management and agriculture.

How to cite: Paasch, S., Houben, T., Ohnemus, T., and Mollenhauer, H.: From Large-scale Hydrological Models to Local Action: A Framework for Operational Groundwater Level Forecasting at the Point Scale, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-10493, https://doi.org/10.5194/egusphere-egu26-10493, 2026.

A.115
|
EGU26-10663
|
ECS
Aatish Anshuman and Manish Panigrahi

Scarcity of groundwater level (GWL) data poses a significant challenge to effective groundwater resource modeling, particularly in urban and peri-urban regions where anthropogenic influences further complicate hydrological processes. In this study, a machine learning–based framework is developed to predict groundwater levels for Bhubaneswar city, India, using Long Short-Term Memory (LSTM) neural networks. Given the data-driven nature of machine learning models and the limited availability of long-term observations, a regionalized modeling approach is adopted by coalescing GWL measurements from 31 closely located monitoring wells. To enable the model to capture well-specific variability, each well is characterized using static indicators derived from hydrological and socio-environmental datasets. Multiple combinations of predictor variables are evaluated to identify those most effective in representing groundwater level dynamics. The optimal model, trained on aggregated regional data, demonstrates strong predictive performance during testing, with a correlation coefficient (R) of 0.89 and a Nash–Sutcliffe Efficiency (NSE) of 0.79. The proposed regionalized LSTM framework shows promise for reliable groundwater level prediction at individual wells in data-scarce urban settings, offering a practical tool for groundwater assessment and management.
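
The two reported metrics measure different things: the correlation coefficient is insensitive to bias, whereas the Nash–Sutcliffe Efficiency penalises it. A minimal sketch with made-up numbers (not data from the study) illustrates the difference.

```python
import numpy as np

def nse(obs, sim):
    """Nash-Sutcliffe Efficiency: 1 is perfect, 0 matches the observed mean."""
    obs, sim = np.asarray(obs, dtype=float), np.asarray(sim, dtype=float)
    return 1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2)

def pearson_r(obs, sim):
    """Pearson correlation coefficient between observed and simulated values."""
    return float(np.corrcoef(obs, sim)[0, 1])

# Illustrative groundwater depths (m) and a simulation with a constant 0.5 m bias.
obs = np.array([2.1, 2.4, 2.0, 1.7, 1.9])
sim_biased = obs + 0.5

r_biased = pearson_r(obs, sim_biased)     # correlation ignores the offset
nse_biased = nse(obs, sim_biased)         # NSE is strongly penalised by it
nse_mean = nse(obs, np.full_like(obs, obs.mean()))
```

Reporting both, as the abstract does, therefore gives a fuller picture than either metric alone.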

How to cite: Anshuman, A. and Panigrahi, M.: Groundwater Level Prediction in Urban Areas under Data Scarcity Using a Regionalized LSTM Framework, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-10663, https://doi.org/10.5194/egusphere-egu26-10663, 2026.

A.116
|
EGU26-10721
|
ECS
Nimisha Anna George, Magdalena Scheck Wenderoth, Mauro Cacace, Denise Degen, Ritske S Huismans, and Elco Luijendijk

The subsurface of North-Central Europe has been shaped by repeated glaciations, which have altered pore pressure, temperature, and groundwater flow over geological timescales. Understanding these coupled thermo-hydraulic responses is essential for groundwater management, geothermal energy utilization, and subsurface storage applications such as CO₂ sequestration.

In this study, we investigate the thermo-hydraulic response of a multi-layered subsurface model of North-Central Europe over the period from the Last Glacial Maximum to the present day, while considering the thermal and hydraulic feedback of the glacier dynamics on the distribution of pore pressure and temperature in time and space. Independent hydraulic and thermal simulations are constructed, followed by controlled experiments in which porosity and permeability are systematically varied within selected stratigraphic units while all other parameters are held constant. The resulting transient pore pressure and temperature fields are analyzed to assess the relative roles of conductive and advective heat transport and to identify formation-specific controls on pressure dissipation and thermal redistribution.

Based on the generated simulation ensemble, we further explore the development of physics-preserving, interpretable AI-assisted surrogate models and parameter estimation. These surrogates aim to efficiently reproduce key thermo-hydraulic responses while retaining physical consistency, thereby enabling rapid sensitivity analysis and uncertainty quantification in large-scale subsurface systems.

How to cite: George, N. A., Wenderoth, M. S., Cacace, M., Degen, D., Huismans, R. S., and Luijendijk, E.: Thermo-hydraulic sensitivity of the North-Central European subsurface under glacial forcing, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-10721, https://doi.org/10.5194/egusphere-egu26-10721, 2026.

A.117
|
EGU26-13119
Steffen Birk

Groundwater monitoring networks are often particularly well established in areas where groundwater resources are exploited. Groundwater withdrawals in such areas potentially affect the observed groundwater levels. Models that are aimed at simulating groundwater levels therefore need to account for temporally varying groundwater abstraction rates. Unfortunately, such information is rarely available, particularly in agricultural areas where multiple users extract groundwater for irrigation, depending on the seasonally varying crop water requirements. The objective of this work is to develop and test an approach for incorporating unreported irrigation withdrawals into data-driven time-series models of groundwater levels. To this end, irrigation is integrated into a lumped-parameter model coupling a root-zone model to a linear storage representing the groundwater body. While the irrigation demand is determined from the root-zone model, the groundwater abstraction needed to meet this demand is considered in the linear storage. The model is tested for the case of the shallow Seewinkel aquifer (Austria), which is almost exclusively used for irrigation. The model calibration yields a time-varying rate of groundwater abstraction. When compared with available irrigation estimates, the average groundwater abstraction obtained from the model is reasonable. The model suggests that the depletion of groundwater levels resulting from the groundwater abstraction for irrigation varies depending on the hydrological conditions. For example, an observation well where the depletion amounted to around one metre in wet years exhibited a depletion of around two metres in the years following the 2003 European drought.
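
The coupling described above can be illustrated with a deliberately simple toy model. It is not the author's model: the bucket structure, the rule that irrigation refills the root zone to a critical storage, the linear stress function for evapotranspiration, and every parameter value are assumptions made for this sketch. It only shows how an irrigation demand derived from a root-zone balance can be subtracted from a linear groundwater storage.

```python
import numpy as np

def simulate(precip, pet, irrigate, s_max=100.0, s_crit=40.0, k=0.02, sy=0.2):
    """Daily root-zone bucket (mm) feeding a linear groundwater storage (m).
    All parameters are hypothetical. Returns groundwater level and abstraction."""
    s, h = 0.5 * s_max, 0.0
    levels, pumped = [], []
    for p, e in zip(precip, pet):
        s += p
        # Irrigation demand: water needed to bring storage back to s_crit.
        demand = max(s_crit - s, 0.0) if irrigate else 0.0
        s += demand
        # Actual ET is reduced linearly below s_crit and cannot exceed storage.
        et = min(e * min(s / s_crit, 1.0), s)
        s -= et
        # Storage above capacity percolates as recharge.
        recharge = max(s - s_max, 0.0)
        s -= recharge
        # Linear storage: recharge in, abstraction out, outflow proportional to h.
        h += (recharge - demand) / 1000.0 / sy - k * h
        levels.append(h)
        pumped.append(demand)
    return np.array(levels), np.array(pumped)

# Two synthetic years with wet winters and dry summers (mm/day).
doy = np.arange(730) % 365
summer = (doy >= 120) & (doy < 270)
precip = np.where(summer, 0.0, 2.0)
pet = np.where(summer, 4.0, 0.5)

h_nat, _ = simulate(precip, pet, irrigate=False)
h_irr, pumped = simulate(precip, pet, irrigate=True)
depletion = h_nat - h_irr   # level difference attributable to abstraction
```

Comparing the irrigated and non-irrigated runs gives a time-varying depletion, which is the kind of quantity the abstract reports for wet and dry years.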

How to cite: Birk, S.: Integration of unreported irrigation withdrawals in time-series models of groundwater levels (Seewinkel, Austria), EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-13119, https://doi.org/10.5194/egusphere-egu26-13119, 2026.

A.118
|
EGU26-13153
|
ECS
Florian Lam, Simon Damm, Asja Fischer, and Thomas Heinze

Quantifying groundwater recharge remains a central challenge in hydrology, particularly in the context of climate variability and water resource management. Lysimeter measurements provide direct estimates of recharge but are spatially sparse and costly to maintain. In this study, we train and evaluate long short-term memory (LSTM) networks on multiple decades of high-resolution lysimeter data to predict seepage fluxes based on precipitation, temperature, and related meteorological features. LSTM architectures are well-suited to capture the delayed and nonlinear nature of recharge processes, where precipitation may influence measurable seepage weeks or months later. The selection of meteorological features is guided by well-established empirical relations. We use feature importance to investigate the relevance of meteorological input parameters to the model prediction and to guide the design of a compact neural network using the fewest possible input features to simplify future data acquisition. We envision the replacement of lysimeters by trained neural networks as soft sensors.

Our results highlight key limitations, particularly the need for sufficiently large datasets and the degradation of model performance in the presence of data gaps. Nevertheless, machine learning shows promise for extrapolating recharge dynamics in data-sparse regions if trained appropriately. This work contributes to the growing discourse on integrating physical understanding with data-driven methods to support groundwater assessments.

How to cite: Lam, F., Damm, S., Fischer, A., and Heinze, T.: Using LSTM and Metrological Time Series to forecast Lysimeter leachate, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-13153, https://doi.org/10.5194/egusphere-egu26-13153, 2026.

A.119
|
EGU26-19031
|
ECS
Zhen Chen, Guang-Yi Chen, Meng-Sin Shih, and Li-Chiu Chang

Under a changing climate, shifts in the spatiotemporal patterns of rainfall can markedly influence catchment-scale hydrological processes and alter interactions between surface water and groundwater systems. Groundwater level is a key indicator of basin hydrological status, jointly controlled by rainfall infiltration, river recharge, pumping, and aquifer properties. Yet, at longer horizons, the nonlinear coupled relationships among rainfall, river discharge, and groundwater levels remain challenging to model and forecast, especially at the regional (multi-site) scale. The Zhuoshui River Basin, a critical water-supply and agricultural region in Taiwan, provides a representative setting to investigate these surface water–groundwater interactions.
Here we develop a long-horizon groundwater-level forecasting model at the 10-day (dekadal) scale based on the Transformer architecture for approximately 22 monitoring wells across the basin. The model is trained and optimised using historical hydrometeorological time series, with inputs including rainfall, river discharge, groundwater levels, and other key hydrological drivers. By leveraging the Transformer's attention mechanism, the proposed approach captures long-range dependencies in multivariate sequences and enables attribution analyses of dominant drivers influencing groundwater responses across lead times.
The model achieves strong predictive skill over the multi-site system (test RMSE = 0.23 m; R² = 0.95), demonstrating its capability to reproduce basin-wide groundwater dynamics at the dekadal scale. Attention weight analyses reveal how rainfall and river-flow signals propagate into groundwater variability across different time lags and spatial locations, deepening understanding of surface water–groundwater coupling mechanisms in the basin.
The developed forecasting framework provides actionable information for integrated water resources management under changing climatic and anthropogenic pressures, including early warnings for groundwater depletion risks, optimized conjunctive use strategies, and informed agricultural irrigation planning. By explicitly modeling multivariate hydrological interactions through attention mechanisms, this approach advances both scientific understanding and operational capabilities for regional groundwater management. The methodology is transferable to other groundwater-dependent regions facing similar forecasting and management challenges.
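
The attention weights referred to above come from scaled dot-product attention, the core operation of the Transformer. The sketch below implements a single attention head in NumPy on random inputs; the sequence length, embedding size, and random projection matrices are placeholders and do not reflect the authors' model configuration.

```python
import numpy as np

def softmax(z, axis=-1):
    """Numerically stable softmax along the given axis."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(q, k, v):
    """Single-head attention. Row i of `weights` shows how strongly
    time step i attends to every other time step."""
    d_k = q.shape[-1]
    scores = q @ k.T / np.sqrt(d_k)
    weights = softmax(scores, axis=-1)
    return weights @ v, weights

rng = np.random.default_rng(0)
n_steps, d_model = 12, 8                     # e.g. 12 dekads of embedded inputs
x = rng.normal(size=(n_steps, d_model))      # stand-in for embedded rainfall, flow, levels
wq, wk, wv = (rng.normal(size=(d_model, d_model)) for _ in range(3))
out, weights = scaled_dot_product_attention(x @ wq, x @ wk, x @ wv)
```

Each row of `weights` sums to one, so it can be read as a distribution over input time steps; inspecting these rows across lead times is one way to examine which lags a trained model relies on.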

Keywords: climate change; surface water–groundwater interactions; groundwater-level forecasting; Transformer

How to cite: Chen, Z., Chen, G.-Y., Shih, M.-S., and Chang, L.-C.: Attention-Based Insights into Surface–Groundwater Coupling: Transformer Forecasting of 10-Day Groundwater Levels in the Zhuoshui River Basin, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-19031, https://doi.org/10.5194/egusphere-egu26-19031, 2026.

A.120
|
EGU26-19137
Kehan Miao, Yong Huang, Huiyang Qiu, Chao Zhuang, Le Zhang, Liming Guo, Xiaolan Hou, Ze Yang, and Thomas Hermans

Accurate prediction of contaminant transport in fractured geological systems remains a formidable challenge due to the complex spatial distribution and connectivity of fracture networks, which often induce abrupt plume front shifts and preferential pathways. Conventional predictive workflows typically rely on a two-step inversion-simulation paradigm. However, these approaches often face persistent challenges in fractured media, including computational intensity, structural underrepresentation, and inherent non-uniqueness where disparate geological configurations yield similar hydraulic responses (Ringel et al., 2024).

In this study, we propose an inversion-free georesponse mapping framework that bypasses explicit structural reconstruction by learning a direct statistical mapping from hydraulic test (HT) fingerprints to transport outcomes (Hermans et al., 2016). The framework is implemented via a dual-head multitask mixture density network (MDN). This architecture jointly predicts the contaminant plume front, represented by a signed distance field (SDF), and the latent structural features of the fracture network, encoded by a convolutional variational autoencoder (CVAE). By integrating these tasks, the shared encoder is forced to extract a geologically consistent representation of the subsurface from sparse pumping test data.
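To make the mixture density idea concrete, the sketch below evaluates the negative log-likelihood of a scalar target under a Gaussian mixture, which is the generic training objective of an MDN head. It is a simplified illustration, not the authors' implementation: the scalar target, the mixture parameters, and the function names are assumptions, and the actual framework predicts a signed distance field and CVAE latent features.

```python
import math

def gaussian_pdf(y, mu, sigma):
    """Density of a univariate normal distribution at y."""
    z = (y - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def mixture_nll(y, weights, means, sigmas):
    """Negative log-likelihood of scalar y under a Gaussian mixture.

    weights must be non-negative and sum to one; in an MDN they would come
    from a softmax output, and sigmas from a positivity-enforcing activation.
    """
    density = sum(w * gaussian_pdf(y, m, s)
                  for w, m, s in zip(weights, means, sigmas))
    return -math.log(density)

# Hypothetical MDN output for one hydraulic-test input: two candidate
# plume-front positions (m), reflecting non-unique fracture configurations
weights = [0.7, 0.3]
means = [12.0, 20.0]
sigmas = [1.5, 2.0]

loss_near_mode = mixture_nll(12.0, weights, means, sigmas)
loss_far_away = mixture_nll(30.0, weights, means, sigmas)
```

Because the output is a full mixture rather than a single value, the network can assign probability to several distinct outcomes for the same hydraulic response, which is one way to express the non-uniqueness discussed above.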

We evaluated the framework’s performance using two stochastic fracture networks. Results demonstrate that the proposed multitask MDN yields statistically reliable probabilistic forecasts and, in comparison with hydraulic tomography inversion, successfully identifies secondary plume branches controlled by individual fractures (Figure 1). This study highlights the potential of georesponse-driven deep learning as a robust and computationally efficient alternative for risk assessment and remediation management in highly heterogeneous fractured aquifers.

Figure 1. Performance of the multitask MDN for contaminant plume front prediction in two test cases. (A) Reference discrete fracture network (DFN) geometries for Test 1 and Test 2. (B) Ensembles of predicted contaminant plume fronts. The blue lines represent the prior training ensemble (5000 realizations), while the red lines represent the posterior prediction ensemble (200 realizations) generated by the dual-head multitask MDN. Yellow dots indicate the true plume front for each test case. Green markers represent the plume fronts obtained via forward solute transport simulation using the hydraulic conductivity fields inverted through hydraulic tomography. (C) Contaminant spatial arrival probability maps derived from the MDN posterior distribution.

 

References:

Hermans, T., Oware, E., & Caers, J. (2016). Direct prediction of spatially and temporally varying physical properties from time-lapse electrical resistance data. Water Resources Research, 52, 7262-7283. https://doi.org/10.1002/2016WR019126
Ringel, L. M., Illman, W. A., & Bayer, P. (2024). Recent developments, challenges, and future research directions in tomographic characterization of fractured aquifers. Journal of Hydrology, 631, 130709. https://doi.org/10.1016/j.jhydrol.2024.130709

How to cite: Miao, K., Huang, Y., Qiu, H., Zhuang, C., Zhang, L., Guo, L., Hou, X., Yang, Z., and Hermans, T.: Inversion-Free Prediction of Contaminant Plume Fronts in Fractured Media from Hydraulic Tests: A Georesponse-Driven, Dual-Head Multitask Mixture Density Network, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-19137, https://doi.org/10.5194/egusphere-egu26-19137, 2026.

A.121
|
EGU26-19011
|
Virtual presentation
Antonio-Juan Collados-Lara, David Pulido-Velazquez, Leticia Baena-Ruiz, and Miguel Mejías

The combination of physically based models and artificial intelligence techniques enhances the simulation of piezometric levels by integrating the hydrological consistency of the former with the ability of the latter to capture nonlinear patterns and reduce uncertainties, ultimately providing more robust predictions. In this work, we coupled a MODFLOW groundwater flow model with nonlinear autoregressive neural networks with exogenous inputs (NARX) to simulate piezometric levels in the Campo de Montiel groundwater body (GWB).
The Campo de Montiel GWB, located in the Upper Guadiana Basin (south‑eastern Spain), represents a critical area where groundwater‑dependent ecosystems coexist in tension with intensive groundwater abstraction, mainly for irrigation. This aquifer plays a key role in the regional hydrological system and constitutes an essential water reservoir in this semi‑arid environment.
A numerical MODFLOW model developed by the river basin authority was used to simulate groundwater flow and river–aquifer interactions across the eight groundwater bodies of the Upper Guadiana Basin, providing hydraulic head maps and flow budgets. In a subsequent step, NARX neural networks were trained to reproduce piezometric levels in the Campo de Montiel groundwater body, using the MODFLOW-simulated heads as exogenous inputs.
This hybrid modelling approach improved the accuracy of piezometric level simulations compared to the standalone flow model. For the pilot piezometer, the MODFLOW model yielded an RMSE of 8.12 m, whereas the hybrid approach reduced the RMSE to 5.08 m.
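The coupling described above can be pictured through the input vectors a NARX model consumes: past observed heads together with current and past simulated heads from the flow model. The sketch below only assembles those lagged inputs and leaves out the neural network itself. The lag orders, the sample values, and the function name are hypothetical and are not taken from the study.

```python
def build_narx_samples(observed, simulated, n_lags_y=2, n_lags_x=1):
    """Return (inputs, targets) for one-step-ahead NARX training.

    Each input is [y(t-1), ..., y(t-n_lags_y), x(t), x(t-1), ..., x(t-n_lags_x)]
    where y is the observed head and x the flow-model simulated head
    (exogenous input). The matching target is y(t).
    """
    if len(observed) != len(simulated):
        raise ValueError("observed and simulated series must have equal length")
    start = max(n_lags_y, n_lags_x)
    inputs, targets = [], []
    for t in range(start, len(observed)):
        past_obs = [observed[t - k] for k in range(1, n_lags_y + 1)]
        exogenous = [simulated[t - k] for k in range(0, n_lags_x + 1)]
        inputs.append(past_obs + exogenous)
        targets.append(observed[t])
    return inputs, targets

# Hypothetical piezometric heads (m a.s.l.) at one piezometer
observed_head = [640.2, 640.0, 639.7, 639.9, 640.3, 640.1]
modflow_head = [642.0, 641.7, 641.5, 641.8, 642.1, 641.9]

X, y = build_narx_samples(observed_head, modflow_head)
```

Any nonlinear regressor can then be trained on `X` and `y`. In this arrangement the network effectively learns a correction to the simulated heads, which is one plausible reading of how a hybrid model can reduce the RMSE relative to the standalone flow model.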

Funding: This research was partially funded by the project SIGLO-PRO (PID2021-128021OB-I00 / AEI / https://doi.org/10.13039/501100011033 / FEDER, UE), from the Spanish Ministry of Science, Innovation and Universities.

How to cite: Collados-Lara, A.-J., Pulido-Velazquez, D., Baena-Ruiz, L., and Mejías, M.: A Hybrid Physically Based–AI Framework for Improving Groundwater Level Simulations, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-19011, https://doi.org/10.5194/egusphere-egu26-19011, 2026.
