HS2.2.7 | Learning from model differences and other good modelling practices – where are we today and where to tomorrow?
EDI
Convener: Diana Spieler (ECS) | Co-conveners: Rosanna Lane (ECS), Anneli Guthke, Zhenyu Wang (ECS), Helen Baron, Wouter Knoben (ECS)
Orals | Fri, 08 May, 14:00–15:45 (CEST) | Room 2.31
Posters on site | Attendance Fri, 08 May, 16:15–18:00 (CEST) | Display Fri, 08 May, 14:00–18:00 | Hall A
Posters virtual | Wed, 06 May, 14:24–15:45 (CEST) | vPoster spot A, Wed, 06 May, 16:15–18:00 (CEST) | vPoster Discussion
Many papers have advised on careful consideration of the approaches and methods we choose for our hydrological modelling studies as they potentially affect our modelling results and conclusions. However, there is no common and consistently updated guidance on what good modelling practice is and how it has evolved in recent years. While many useful practices such as model benchmarking, controlled model comparisons, developing scripted workflows, carefully selecting calibration periods and methods, or testing the impact of subjective modelling decisions along the modelling chain exist, none of these can be considered common practice yet.

This session therefore intends to provide a platform for a visible and ongoing discussion on what ought to be the current standard(s) for an appropriate modelling protocol that considers uncertainty in all its facets and promotes transparency in the quest for robust and reliable results. We aim to bring together, highlight and foster work that develops, applies, or evaluates procedures for a trustworthy modelling workflow or that investigates good modelling practices for particular aspects of the modelling chain. We invite research that aims to improve the scientific basis of modelling and puts good modelling practice in focus again. This might include (but is not limited to) contributions addressing the following key questions:

1. The theoretical side of model application, centered around the question: “is my model any good?” (e.g., benchmarking, robust calibration/evaluation and controlled model comparison);
2. The practical side of model application, centered around the question: “how do I ensure my modelling work is efficient, reproducible and transparent?” (e.g., novel modelling protocols or workflows, examples of adopting the FAIR principles);
3. The social side of model application, centered around the question: “how do I communicate my model’s strengths and weaknesses?” (e.g., investigation of subjective choices along the modelling chain and communication of model outputs and uncertainties);
4. The future of model application, centered around our main question: “where are we today, and where do we want to be tomorrow?” (i.e., overviews of the current state of modelling, and visions for the future).

Orals: Fri, 8 May, 14:00–15:45 | Room 2.31

The oral presentations are given in a hybrid format supported by a Zoom meeting featuring on-site and virtual presentations. The button to access the Zoom meeting appears just before the time block starts.
Chairpersons: Diana Spieler, Rosanna Lane, Anneli Guthke
14:00–14:05
14:05–14:15 | EGU26-19380 | solicited | On-site presentation
Fabrizio Fenicia, Thiago do Nascimento, and Pasquale Perrini

Over the past decades, numerous guidelines have been proposed to improve hydrological modelling practice, with much emphasis placed on model calibration, evaluation, and intercomparison. This includes our own work and that of many others. While these efforts have advanced methodological rigor, they often implicitly assume that the set of candidate models is already well defined. In practice, however, the modelling space is vast, particularly for distributed models where process representations, parameterizations, and spatial variability can differ substantially. Selecting suitable model structures therefore remains a fundamental and often underexplored challenge.

In this contribution, we argue that hydrological modelling should not be the starting point of analysis, but rather the outcome of a structured chain of reasoning. This chain begins with the data: understanding data characteristics, limitations, and information content, and interpreting them in the context of dominant hydrological processes. Such data-driven reflection naturally leads to explicit and testable model hypotheses, which then form a meaningful basis for model selection and comparison. Central to this workflow is the perceptual model, which acts as a conceptual bridge between data interpretation and formal model structures.

Using examples from recent work in distributed hydrological modelling, we illustrate how this process-oriented approach can guide the choice of model complexity and structure, reduce arbitrariness in modelling decisions, and improve the interpretability of results. The contribution emphasizes that good modelling practice requires not only robust calibration and comparison strategies, but also a transparent and data-informed pathway that precedes model application itself.

How to cite: Fenicia, F., do Nascimento, T., and Perrini, P.: Good modelling practice begins before modelling. Data, perception, and model hypotheses, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-19380, https://doi.org/10.5194/egusphere-egu26-19380, 2026.

14:15–14:25 | EGU26-15141 | On-site presentation
Thomas Wöhling and Alexander Bartusch

The setup of hydrological models in accordance with good modelling practice guidelines involves several steps of model development that have to be performed sequentially as well as iteratively. The purpose of the model is decisive for the choices that need to be made in the modelling process. Regardless of whether the task is to develop a predictive or explanatory model, the modelling process typically involves a stage where one or several model candidates are selected, a stage of model calibration, uncertainty quantification, and a phase of model evaluation and diagnostics. Ideally, the process ends with a model that serves its purpose.
Explanatory models could be used to learn about (dominant) hydrological processes and thus require a certain level of process realism in the governing equations that represent the system under study. In practice, modellers form one or several hypotheses about “how the system works” and test these hypotheses by setting up corresponding model structures whose parameters are trained on data. Model choice is then a matter of model-data (mis)fit. 
However, errors in the data and model inputs, uncertainty in parameter values, misspecified or missing processes, scaling issues, among others, can and often do lead to parameter compensation in the model calibration stage. Model ensembles typically cover only a fraction of the model space, i.e. the “population” of plausible model structures. Particularly when the “true” model (or a “realistic” one) is not included, model choice boils down to model flexibility or fidelity rather than plausibility. A further complicating factor is that misspecification in combination with confidence in the data can distort uncertainty estimates of parameters and predictions, potentially leading to over-confident and biased distributions. The relative contributions of different error sources to the total uncertainty are then also affected. Unfortunately, most of this goes unnoticed, even when following good modelling practice guidelines.
In this contribution we illustrate some of these potential pitfalls and bad choices in hydrological modelling with both a synthetic test case where the “true model” exists and with an ensemble of candidate hydrological models and field data from the Forellenbach catchment in the Bavarian Forest National Park. We highlight and demonstrate the impact of misspecified priors, biased data and uncertain model forcings on model choice and briefly discuss model fidelity vs. plausibility. Bayesian analysis is applied for model diagnosis and to disentangle error sources and their relative contribution to total uncertainty. Most of these issues, either separately or combined, have been described in modelling studies before. We would like to raise awareness and encourage further discussion in the hydrological community on suitable and practical solutions to identify and treat major uncertainty sources in hydrological modelling.

How to cite: Wöhling, T. and Bartusch, A.: Disentangling pitfalls and (bad) choices in the hydrological modelling process and their impact on model performance, uncertainty and model choice, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-15141, https://doi.org/10.5194/egusphere-egu26-15141, 2026.

14:25–14:35 | EGU26-1127 | ECS | On-site presentation
Franziska Clerc-Schwarzenbach, Paul C. Astagneau, Eduardo Muñoz Castro, Ilja van Meerveld, Jan Seibert, and Vazken Andréassian

In bucket-type hydrological models (also known as conceptual models), the paths of the water in a catchment are represented in a simplified way. In general, there is one way for water to enter (via precipitation) and two ways for it to leave (via streamflow and evaporation). However, because the simple water balance equation P = Q + E on which this concept is based is often not fulfilled, many bucket-type models include ‘sweep parameters’: parameters that represent an additional way for water to enter or leave the catchment. Sweep parameters can take the form of correction factors that are used to align inputs and outputs, but they can also be more sophisticated, such as representations of groundwater inflows or outflows. The X2 parameter (Intercatchment Groundwater Flow parameter) in the GR4J model is a well-known example of a sweep parameter.

Compared to a model in which the water balance is enforced, a model that includes a sweep parameter is usually more successful in simulating streamflow volumes: Too much or too little water can be compensated thanks to the sweep parameter, while otherwise the only options for compensation are via evaporation or large simulated storage volumes.
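As an illustration only, the compensation described above can be sketched with a toy single-bucket model in Python. Everything here is an assumption made for the sketch: the function names, the linear-reservoir outflow, the storage-limited evaporation, and the choice of a multiplicative precipitation correction as the sweep parameter. It is not GR4J and not the models used in the study.

```python
def run_bucket(precip, pet, k=0.1, sweep=1.0, storage=50.0):
    """Toy single-bucket model (hypothetical, for illustration only).

    sweep multiplies precipitation; sweep == 1.0 keeps the water
    balance P = Q + E + dS closed with respect to the raw input.
    Returns total streamflow, total evaporation and storage change.
    """
    initial_storage = storage
    q_sum = 0.0
    e_sum = 0.0
    for p, e_pot in zip(precip, pet):
        storage += sweep * p            # corrected input enters the bucket
        evap = min(e_pot, storage)      # evaporation limited by available water
        storage -= evap
        q = k * storage                 # linear-reservoir outflow
        storage -= q
        q_sum += q
        e_sum += evap
    return q_sum, e_sum, storage - initial_storage


def balance_residual(precip, q_sum, e_sum, d_storage):
    """Water that the raw balance P - Q - E - dS cannot account for."""
    return sum(precip) - q_sum - e_sum - d_storage


precip = [10.0, 0.0, 5.0, 20.0, 0.0]
pet = [2.0, 3.0, 2.0, 1.0, 3.0]

closed = balance_residual(precip, *run_bucket(precip, pet, sweep=1.0))
swept = balance_residual(precip, *run_bucket(precip, pet, sweep=0.8))
print(f"residual without sweep: {closed:.3f}")
print(f"residual with sweep=0.8: {swept:.3f}")
```

With the sweep factor set to 1 the residual is zero up to rounding; with any other value the residual equals the share of precipitation that the parameter adds or removes, which is the quantity a transparent report of the sweeping would state.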

Because including a sweep parameter improves model performance, sweep parameters are often seen as ‘cheat parameters’. This accusation is understandable, since sweep parameters can also compensate for incorrect input data. Still, there are many reasons why the use of sweep parameters should not be frowned upon. Many catchments are not closed systems along their topographic borders and sweep parameters are one way of representing this knowledge. In addition, we should avoid compensating for incorrect input data or additional water gains or losses via evaporation, a flux that is generally not included in model calibration – and that could be considered cheating as well. If a mismatch in the basic water balance can be represented via a sweep parameter, this is arguably a reasonable and transparent way to do so.

To investigate the effects of sweep parameters, we tested the model performance and model robustness towards variations in precipitation input data for hydrological models with and without a sweep parameter. Using a large-sample approach for more than 500 catchments in France, we could not find any evidence that model robustness is affected by the use of a sweep parameter. Furthermore, we clearly illustrate that models benefit from using a sweep parameter. Based on these results, we argue that it is justifiable to decide to sweep, but also stress that the way and effect of the sweeping should be communicated transparently and interpreted with caution.

How to cite: Clerc-Schwarzenbach, F., Astagneau, P. C., Muñoz Castro, E., van Meerveld, I., Seibert, J., and Andréassian, V.: To sweep or not to sweep? Investigating controversial parameters in hydrological models, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-1127, https://doi.org/10.5194/egusphere-egu26-1127, 2026.

14:35–14:45 | EGU26-12204 | On-site presentation
Christina Anna Orieschnig, Jean-Philippe Venot, Gilles Belaud, and Sylvain Massuel

Ensuring transparency and a clear communication of uncertainties poses a ubiquitous challenge in hydrological modelling, both in the application of existing models in new domains and in the development of new models. This challenge becomes especially pressing in interdisciplinary research contexts, where expectations of models and understanding of their roles often differ. While an increasing body of work on good modelling practices in hydrology has been developed over the past years, there is no common standard yet that could help modellers address these challenges. In particular, one aspect that is rarely explicitly described in modelling studies is the effect of preliminary perspectives of different members of the modelling team and subjective choices along the modelling chain on the model’s outputs and uncertainties. 

This study takes the example of an agro-hydrological model developed in the Cambodian Mekong Delta (Southeast Asia) to explore this social side of model development and application in an international, interdisciplinary development context. The model in question was developed to explore how regional hydrological dynamics, and particularly water availability for agriculture, would change following hydro-infrastructure rehabilitation projects funded by international development agencies. The implementation of these projects can be seen against the background of the shifting hydrological dynamics in the Mekong basin, driven by climate change, hydropower construction, and land use changes. In its final version, the model allows for a relative assessment of the effects of water availability for irrigation on the agricultural productivity in the study area, taking into account different configurations of the artificial channels to be rehabilitated as well as the annual monsoon inundations and the hydrological dynamics of the Mekong’s deltaic distributaries.

In our case study, we strive to highlight the impact of the expectations and goals of different members of the modelling team, originating from different disciplines, as well as the subjective modelling choices and simplifications made collectively throughout the modelling process, on the final results. In particular, we also examine the process by which simplifications were implemented and the perceptual model was negotiated, against the background of the data scarcity that is characteristic of many hydrological studies carried out in the Global South. Furthermore, we reflect on how existing guidelines for good modelling practices (such as FAIR principles) have helped with the communication of uncertainties and limitations of the model, and how future guidelines could evolve to better take into account and transparently represent social dynamics within interdisciplinary modelling teams, for instance through the use of positionality statements.

How to cite: Orieschnig, C. A., Venot, J.-P., Belaud, G., and Massuel, S.: Transparent Modelling in Interdisciplinary Research: An Illustration from an Agro-Hydrological Study in the Cambodian Mekong Delta, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-12204, https://doi.org/10.5194/egusphere-egu26-12204, 2026.

14:45–14:55 | EGU26-11368 | ECS | On-site presentation
Mikhail Smilovic

Hydrological models often differ substantially in their internal structure, process representations, and parameterisations, even when calibrated against similar observations. Understanding how these structural differences manifest under both environmental forcing and management-driven forcing remains a central challenge for model intercomparison. Here, we explore a transformation-based diagnostic framework grounded in mass conservation and seasonal cyclic behaviour.

Rather than interpreting models in terms of static system states, we focus on admissible mass-conserving transformations defined by the balance among inputs, outputs, and storage changes. This relation defines an admissible envelope of possible transformations, which can be interpreted as a generalised configuration space. Within this space, seasonal cycles trace characteristic trajectories shaped by climatic variability and by model-specific representations of regulation, storage, and decision rules.

To facilitate comparison, we introduce the concept of “prints” and “scans” of these trajectories: visual representations that can be overlaid across models to reveal similarities, divergences, and systematic structural differences. This extension of the "water circles" allows model behaviour to be compared in terms of geometry and organisation of admissible transformations, rather than differences in isolated states or aggregated performance metrics.

Intended as a complementary and exploratory diagnostic, the framework provides a conservation-anchored reference to understand how environmental and anthropogenic forcings are encoded across hydrological models, offering insight into structural differences that traditional intercomparison approaches may obscure.

How to cite: Smilovic, M.: Comparing Hydrological Models in Configuration and Trajectory Space, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-11368, https://doi.org/10.5194/egusphere-egu26-11368, 2026.

14:55–15:05 | EGU26-129 | ECS | On-site presentation
Hanwu Zheng, Doerthe Tetzlaff, Christian Birkel, Songjun Wu, and Chris Soulsby

Due to the inherent complexity of hydrological systems, single observation data types rarely contain sufficient information for model calibration, particularly for distributed models with large parameter dimensions. Previous studies have demonstrated the ability of multi-objective calibration to constrain equifinality and promote good modelling practices. However, most of these efforts have focused mainly on predictions of discharge and ET, while lacking evaluations of other key components of ecohydrological systems (e.g., subsurface storages and fluxes, and ET partitioning), potentially leading to biased hydrological inferences. Although incorporating multiple objectives provides additional constraints on the modelling, degradations in model performance often occur due to trade-offs among observations. Evaluations of these trade-offs are necessary for robust modelling inferences. Therefore, we applied multiple calibration schemes combining discharge, isotopes, and spatial and temporal patterns of remotely sensed ET in an ET-dominated catchment (the Mid-Spree, 2800 km²) of the river Spree, NE Germany, to constrain a spatially distributed tracer-aided model (STARR) over a 20-year period. Since the Spree is a major water source supplying Berlin’s drinking water, agricultural irrigation and industrial needs, ensuring trustworthy hydrological modelling, realistic process representation and careful consideration of calibration strategies is of vital importance.

Our findings show that, compared to discharge-only calibrations, the additional incorporation of isotopes or of temporal or spatial patterns of ET produced distinct process insights. These multi-variable calibrations revealed contrasting trade-offs, with slightly degraded discharge performance but clear improvements in the additional calibrated variables (i.e., isotope or ET patterns). Temporal patterns of ET contained similar information to discharge, and provided limited additional insights into catchment functioning. In contrast, incorporating isotopes and spatial patterns of ET in addition to discharge reduced simulated discharge volumes generated in the Mid-Spree region (>70% of discharge at the outlet of the catchment originated from the upper Spree), accompanied by slower lateral flow rates in the subsurface layer, reflecting a slow water celerity of the catchment. Isotope-aided calibrations also inferred large subsurface water storage to reproduce the damped isotope variations observed in the field, and higher ET peaks (compared to other calibration schemes) during summer. Conversely, calibrations constrained by the spatial patterns of ET indicated lower subsurface water storage, compared to calibrations constrained by other variables. This study demonstrated the implications and trade-offs of using multiple observational targets in model calibration in large-scale, heavily anthropogenically influenced catchments, helping to identify more reliable parameterizations and to improve process-based understanding in distributed hydrological modelling.

How to cite: Zheng, H., Tetzlaff, D., Birkel, C., Wu, S., and Soulsby, C.: How calibration targets shape good modelling practices: trade-offs among ET patterns, discharge, and isotopes in a large ET-dominated lowland catchment, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-129, https://doi.org/10.5194/egusphere-egu26-129, 2026.

15:05–15:15 | EGU26-15888 | ECS | On-site presentation
Daniel Kovacek and Steven Weijs

The flow duration curve (FDC) has long been used in water resources research and practice. We compared three approaches to FDC estimation in ungauged basins, ranging in model complexity and richness of input data. FDCs were estimated by 1) assuming daily runoff is log-normally distributed and predicting distribution parameters from catchment descriptors, 2) ensemble averaging of nearest and most physically similar gauged neighbours, and 3) neural network rainfall-runoff modelling. When evaluated on a hydrologically diverse sample of 712 catchments around British Columbia, Canada, we found the more complex neural network model provided little performance advantage over a simpler nearest-neighbour ensemble approach, and inter-model ensembles yielded equal or better performance than individual components. Models were evaluated by four performance measures to highlight different notions of dissimilarity expressed by conventional residual-error-based metrics versus an information measure, the Kullback-Leibler divergence.
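To make the information measure mentioned above concrete, the following Python sketch computes a Kullback-Leibler divergence between binned flow distributions of an observed and a simulated series. The abstract does not describe the binning, the treatment of empty bins, or the logarithm base, so the bin edges, the small floor `eps`, the use of base 2, and both function names are assumptions made only for this example.

```python
import math


def flow_histogram(flows, edges):
    """Probability mass of flows per bin; values outside the edges are ignored."""
    counts = [0] * (len(edges) - 1)
    last = len(counts) - 1
    for q in flows:
        for i in range(len(counts)):
            in_bin = edges[i] <= q < edges[i + 1]
            on_top_edge = i == last and q == edges[-1]
            if in_bin or on_top_edge:
                counts[i] += 1
                break
    total = sum(counts)
    return [c / total for c in counts]


def kl_divergence(p, q, eps=1e-9):
    """D_KL(P || Q) in bits; eps floors empty bins of Q (an assumed choice)."""
    q_floor = [max(qi, eps) for qi in q]
    norm = sum(q_floor)  # renormalise after flooring
    return sum(
        pi * math.log2(pi / (qi / norm))
        for pi, qi in zip(p, q_floor)
        if pi > 0
    )


# Hypothetical daily flows and log-spaced bin edges, for illustration only
observed = [1.0, 2.0, 2.5, 5.0, 10.0, 20.0]
simulated = [1.5, 1.8, 3.0, 4.0, 6.0, 9.0]
edges = [0.5, 2.0, 8.0, 32.0]

p_obs = flow_histogram(observed, edges)
p_sim = flow_histogram(simulated, edges)
print(kl_divergence(p_obs, p_sim))
```

Unlike a residual-based metric, this measure ignores the timing of flows and penalises a simulated distribution most heavily where it assigns almost no probability to flow ranges that were actually observed.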

How to cite: Kovacek, D. and Weijs, S.: Flow duration curve prediction in ungauged basins: a model intercomparison study, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-15888, https://doi.org/10.5194/egusphere-egu26-15888, 2026.

15:15–15:25 | EGU26-11010 | ECS | On-site presentation
Julie Collignan, Michael Schirmer, Frederiek Sperna Weiland, Joost Buitink, Julianna Regenauer, Jules Beersma, Massimiliano Zappa, Vazken Andreassian, and Tobias Wechsler

The Rheinblick2027 project investigates the impacts of climate change on the discharge of the Rhine River and its major tributaries as assessed by different working groups in the riparian countries of the Rhine. Initiated by the International Commission for the Hydrology of the Rhine Basin (CHR), the project builds on its predecessor Rheinblick2010 (formerly denoted as Rheinblick2050). The project’s main objectives are to compare model differences, develop hydrological scenarios through 2150, and assess the effects of climate change on key hydrological signatures over the Rhine catchment, such as annual water balance and high- and low-flow situations.

The project started with a model intercomparison involving four hydrological models: wflow_sbm (Deltares, NL), LARSIM-ME (BfG, DE), PREVAH (WSL, CH), and GRSD (INRAE, FR). A first round of simulations was conducted using the KNMI’23 scenarios, one of the few CMIP6-based downscaled climate scenarios available, providing a valuable opportunity to test and establish common workflows towards a well-defined hydrological simulation protocol.

Initial projections reconfirm increasing winter discharge and decreasing summer discharge, when water demand is highest. This first round of simulations represents a crucial step in preparing for the second round, scheduled for early 2026, which will use a set of EURO-CORDEX CMIP6 scenarios as processed in the "DWD-Reference Ensemble" provided by the German weather service. In parallel, the Rheinblick2027 team has begun investigating the impacts of climate change on six selected key topics, including groundwater recharge and extreme value statistics.

Beyond its core goals, Rheinblick2027 aims to strengthen stakeholder engagement and foster collaboration among modelling groups by regularly organising outreach activities and interactive platforms that bring together stakeholders from the Rhine catchment and scientists from various institutions. These interactions are essential to ensure that the final products align with stakeholder needs and provide actionable climate services for water resources management.

How to cite: Collignan, J., Schirmer, M., Sperna Weiland, F., Buitink, J., Regenauer, J., Beersma, J., Zappa, M., Andreassian, V., and Wechsler, T.: Rheinblick2027: a multi-model approach to generate large-scale hydrological scenarios for the Rhine River and assess climate change impacts on key hydrological signatures, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-11010, https://doi.org/10.5194/egusphere-egu26-11010, 2026.

15:25–15:35 | EGU26-11880 | ECS | On-site presentation
Camille Brun, Alban de Lavenne, Claire Delus, Hajar El Khalfi, Didier François, Thibault Hallouin, Frédéric Hendrickx, Shu-Chen Hsu, Céline Monteil, and Jean-Pierre Vergnes

Droughts are a growing concern for water managers, as climate change is expected to intensify both their severity and frequency. Accurately forecasting these events is crucial to mitigate their impacts. Semi-distributed hydrological modelling, by dividing the catchment into interconnected hydrological units, provides flow estimation at gauged and ungauged locations and explicitly accounts for some physical and climatic spatial variability across the catchment.

In this study, four semi-distributed models (GRSD, MORDOR-TS, PRESAGES, RAMEAU) are implemented in the Meuse River catchment at Chooz, an area characterized by contrasting geological, topographic, and meteorological conditions. The models differ in their structural assumptions, notably regarding groundwater exchanges beyond topographic boundaries and the potential use of piezometric data during calibration.

Following a joint calibration exercise, the four models provide consistent results on streamflow across the entire catchment and comparable performance at the outlet at Chooz in comparison to their lumped-model counterparts. Similar biases are observed among the models, which may reflect common limitations in their assumptions or uncertainties in flow measurements and meteorological data. The case study of the 2022 low-flow event highlights variability in simulated low flows, linked in particular to the choice of model and to the climate data used for calibration. Gauging measurements taken during low-flow periods would help strengthen these results. Future work should focus on improving the understanding and the representation of groundwater flows in semi-distributed hydrological models.

How to cite: Brun, C., de Lavenne, A., Delus, C., El Khalfi, H., François, D., Hallouin, T., Hendrickx, F., Hsu, S.-C., Monteil, C., and Vergnes, J.-P.: Towards semi-distributed modelling for low-flow simulation: comparing four hydrological models in the French Meuse catchment, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-11880, https://doi.org/10.5194/egusphere-egu26-11880, 2026.

15:35–15:45 | EGU26-8788 | ECS | On-site presentation
Motasem Abualqumboz, David Tarboton, and Keith Jennings

Hydrological modelling practice increasingly demands transparent, reproducible, and flexible workflows that enable systematic evaluation of model structure, process representation, and coupling strategies. This study presents a refactoring and componentization of a conceptual hydrologic model and its internal routines using the Basic Model Interface (BMI) as a practical mechanism for improving modelling practice through modularity, interoperability, and reproducibility. As a case study, an existing R implementation of the Hydrologiska Byråns Vattenbalansavdelning (HBV) model was reimplemented in Python using object-oriented design and exposed through BMI, a standardized interface widely adopted in Earth system modelling.

BMI components were developed at two complementary levels of granularity: (1) a component representing the complete HBV model, and (2) individual components representing the Snow, Soil, Response, and Routing routines. This dual-level design enables transparent reconstruction of the full model from its constituent processes while supporting controlled experimentation with alternative structural configurations, such as the inclusion or exclusion of internal routing. The BMI-enabled components were integrated within the Next Generation National Water Model (NextGen) framework, facilitating consistent execution, standardized variable exchange, and reproducible multi-model simulations. Applications include both standalone HBV simulations and multi-model mosaic formulations in which HBV components are coupled with other hydrologic and land-surface models.
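The idea of exposing a single routine behind a standard interface can be sketched as follows. This is a deliberately reduced illustration, not the authors' implementation: the class name, variable names, parameter defaults and the simplified degree-day snow logic are all hypothetical, and only a handful of BMI-style method names are shown. A complete component would implement the full BMI function set (for example by subclassing the abstract base class in the `bmipy` package), read its settings from a configuration file path, and exchange values as arrays rather than plain floats.

```python
class SnowBmiSketch:
    """Toy degree-day snow routine behind a few BMI-style method names."""

    _inputs = ("precipitation", "air_temperature")
    _outputs = ("snowpack", "liquid_water_to_soil")

    def initialize(self, config=None):
        # Simplification: accepts a dict, whereas BMI specifies a file path
        cfg = {"tt": 0.0, "cfmax": 3.0, "dt": 1.0}
        cfg.update(config or {})
        self._tt = cfg["tt"]          # threshold temperature (deg C)
        self._cfmax = cfg["cfmax"]    # degree-day factor (mm per deg C per step)
        self._dt = cfg["dt"]          # time step length
        self._time = 0.0
        self._values = {name: 0.0 for name in self._inputs + self._outputs}

    def get_component_name(self):
        return "Snow routine sketch"

    def get_input_var_names(self):
        return self._inputs

    def get_output_var_names(self):
        return self._outputs

    def get_current_time(self):
        return self._time

    def set_value(self, name, value):
        if name not in self._inputs:
            raise KeyError(f"{name} is not an input variable")
        self._values[name] = float(value)

    def get_value(self, name):
        return self._values[name]

    def update(self):
        precip = self._values["precipitation"]
        temp = self._values["air_temperature"]
        pack = self._values["snowpack"]
        if temp < self._tt:
            pack += precip            # precipitation accumulates as snow
            liquid = 0.0
        else:
            melt = min(pack, self._cfmax * (temp - self._tt) * self._dt)
            pack -= melt
            liquid = precip + melt    # rain plus melt is passed downstream
        self._values["snowpack"] = pack
        self._values["liquid_water_to_soil"] = liquid
        self._time += self._dt

    def finalize(self):
        self._values.clear()


snow = SnowBmiSketch()
snow.initialize()
for precip, temp in [(10.0, -2.0), (0.0, 2.0)]:
    snow.set_value("precipitation", precip)
    snow.set_value("air_temperature", temp)
    snow.update()
    print(snow.get_current_time(), snow.get_value("snowpack"),
          snow.get_value("liquid_water_to_soil"))
```

Because a driver only touches the component through `set_value`, `update` and `get_value`, a soil or routing component with the same method names could be chained after it, or swapped for an alternative formulation, without changing the driver loop.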

The results demonstrate how interface-driven model design can improve hydrological modelling practice by enabling systematic model comparison, structural sensitivity analysis, and reusable workflows across modelling environments. More broadly, this work provides a transferable roadmap for converting Python-based hydrologic models into BMI-compliant components, supporting community efforts toward more transparent, interoperable, and reproducible hydrological modelling.

How to cite: Abualqumboz, M., Tarboton, D., and Jennings, K.: Improving Hydrological Modelling Practice through Componentization: A BMI-Based HBV Implementation within the NextGen Framework, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-8788, https://doi.org/10.5194/egusphere-egu26-8788, 2026.

Posters on site: Fri, 8 May, 16:15–18:00 | Hall A

The posters scheduled for on-site presentation are only visible in the poster hall in Vienna. If authors uploaded their presentation files, these files are linked from the abstracts below.
Display time: Fri, 8 May, 14:00–18:00
Chairpersons: Wouter Knoben, Zhenyu Wang, Helen Baron
A.1 | EGU26-5395
Christopher Skinner, Erica Thompson, Jessica Enright, Rolf Hut, Sam Illingworth, and Elizabeth Lewis

Good modelling practice is founded on both understanding the limitations and assumptions of models and the clear communication of these. This includes appreciation of the impacts of modelling choices and how things might be different if other choices were made. Often the audience for this communication are non-modellers who will need to make decisions based on the information provided by the modeller. There is a need for creative approaches and tools that can translate technical and abstract concepts into something meaningful.

Games, including tabletop roleplay games (TTRPGs), immerse players within imaginary worlds. Although they might resemble the real world, they have differences that enable smooth gameplay, support player immersion, and allow narratives to advance. For example, they might have boundary limits to the explorable world, or approximations of time to focus on the most interesting elements. In this sense, numerical modellers and games developers have a shared experience when simulating ‘realities’.

Thompson (2022) introduced the concept of ‘model lands’ – strange worlds that are created by our models, which share some characteristics of the real world but also many differences. We argue that model lands and game worlds are functionally the same but differ in what they are useful for. We present the Adventures in Model Land framework, an open-source resource for numerical modellers that uses world-building methods from TTRPGs to bring model lands to life in an explorable way. Originally proposed as a fun activity, the methods are being developed into a toolkit to help modellers communicate the details, assumptions, limitations, and uses of their models with non-modellers.

How to cite: Skinner, C., Thompson, E., Enright, J., Hut, R., Illingworth, S., and Lewis, E.: Adventures in Model Land: Using tabletop roleplay games to explore the abstract worlds of numerical models, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-5395, https://doi.org/10.5194/egusphere-egu26-5395, 2026.

A.2
|
EGU26-8123
|
ECS
Philipp Schultze, Darri Eythorsson, Martyn Clark, and Julian Klaus

Large Language Models (LLMs) are being developed and marketed at a rapid pace, and practitioners and scientists across many fields are exploring applications that deliver on the promises made by the leading LLM providers. Given the advent of this new technology, the fields of hydrology and hydrologic modeling are starting to investigate its potential applications. The idea of an AI assistant that is skilled in hydrological reasoning is exciting and timely. Despite the growing application of LLMs across the earth sciences, it remains unclear if and how they can provide meaningful guidance on hydrological modelling.

In this study, we investigate whether LLMs provide robust a priori suggestions for conceptual model structure, based on the implicit hydrological understanding captured in their training data. We addressed this aim for 14 diverse catchments and a separate set of 26 hydrologically similar catchments in the contiguous United States using Google’s Gemini 2.5 Flash model. We translated the conceptual hydrological modeling framework FUSE (Framework for Understanding Structural Errors) into five different structured text-based prompts, differing in symbolic abstraction. Next, we tasked the LLM with recommending suitable hydrological model components for each catchment based on its geographic location. These recommendations were then evaluated against an exhaustive set of all 78 plausible FUSE configurations.

We assessed the streamflow simulations resulting from the LLM's recommendations in terms of KGE performance, regional consistency, and model fidelity in representing hydrological signatures. Our preliminary results indicate that LLMs can be prompted to adhere to strict modeling frameworks and to provide model component recommendations that respect the given restrictions, resulting in executable model setups. Furthermore, the structure of the prompt profoundly impacts efficacy, highlighting a need for future research on prompt design. However, the model commonly did not recommend the top-performing structures and demonstrated inconsistency by recommending different model components across repeated identical prompts. This research represents a first step toward establishing benchmarks for "hydrologic understanding" in LLMs and assessing their viability in future modeling applications.

How to cite: Schultze, P., Eythorsson, D., Clark, M., and Klaus, J.: Does AI Understand Hydrology? - Investigating AI recommended conceptual hydrological model setups, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-8123, https://doi.org/10.5194/egusphere-egu26-8123, 2026.

A.3
|
EGU26-11658
|
ECS
Grith Martinsen, Jonas Wied Pedersen, Maggie Henry Madsen, Cecilie Thrysøe, Lucas Dalgaard Jensen, Emma Dybro Thomassen, Michael Butts, Raphaél Payet-Burin, and Sanita Dhaubanjar

Denmark’s national river flood forecasting system employs several different hydrological models to predict river discharge. These predictions are used by duty meteorologists at the Danish Meteorological Institute (DMI) to issue flood warnings. In this study we explore two of these models: a Long Short-Term Memory network (DK-LSTM) and a conceptual hydrological model built with the software HYPE (DK-HYPE). From operational experience, we suspect that both models have structural deficiencies related to a lack of topographically driven processes. We therefore apply a dual-model approach to explore the potential of processing and feeding in a more detailed terrain description for forecasting high river flows in Denmark.

The LSTM model is trained primarily on the CAMELS data set for Denmark. CAMELS data sets are widely used and are becoming a standard, recognized data set for training and running data-driven models. The current CAMELS data sets contain simple statistical descriptions of terrain features, like catchment-averaged mean, min and max values of elevation above mean sea level and terrain slope. The HYPE model is based on the concept of hydrological response units (HRUs), but the default implementation in HYPE only delineates HRUs based on soil and land use information. Experience from the development of the two national models (DK-LSTM and DK-HYPE) indicates that catchments with distinct topographical characteristics can exhibit markedly different hydrological responses that are not captured by simple catchment averages of DEM properties.

We compute detailed raster-based representations of terrain indices like HAND, rDune and TWI across Denmark. We then test multiple ways of processing the indices and summarizing the distribution of values within each sub-catchment into catchment attributes that the LSTM model can use as inputs. The relative importance of the various terrain indices in DK-LSTM for high-flow predictions is then evaluated, and this information is used to redesign HRU delineation in DK-HYPE. This enables the DK-HYPE setup to calibrate hydrological processes with terrain information. Our findings show which terrain indices, and therefore which topographic properties, provide the most benefit for predictive performance in high river discharge events relevant for flood warning applications.

How to cite: Martinsen, G., Pedersen, J. W., Madsen, M. H., Thrysøe, C., Jensen, L. D., Thomassen, E. D., Butts, M., Payet-Burin, R., and Dhaubanjar, S.: A data-driven approach to evaluate the importance of terrain features for hydrological modelling in flood warnings, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-11658, https://doi.org/10.5194/egusphere-egu26-11658, 2026.

A.4
|
EGU26-15343
|
ECS
Paul Coderre, Wouter Knoben, Cyril Thébault, Nicolas Vásquez, Martyn Clark, and Alain Pietroniro

Hydrological model evaluation is often performed with aggregated metrics such as the widely used Nash-Sutcliffe Efficiency (NSE). The NSE is a skill score that can be interpreted as using the mean observed flow as a benchmark against which to compare model performance. However, this results in strong spatial patterns of scores that conflate model skill with flow variability, depending on how appropriate the benchmark model is for the catchment at hand. These patterns make it difficult to compare NSE scores across catchments, which complicates model evaluation and comparison. This work addresses this limitation by using alternative formulations of the NSE that replace the mean observed flow term with various other benchmark simulations (called “benchmark efficiencies”, BME). BME values were calculated for an ensemble of 20 simple benchmarks, using hydrological model simulations from 960 basins in North America as a test case. The benchmarks vary from simple statistics calculated directly from the streamflow series to extremely simple models that try to capture the main outcomes of catchment behavior.

Results show that alternative benchmarks produce spatial patterns of model performance that differ from those of the NSE, due to differences in how well the individual benchmarks capture flow variability in different regions. Benchmarks that effectively capture flow variability in a given catchment result in a low BME score and are a more challenging test of model performance. As such, selecting the lowest BME score in each catchment can reduce the spatial patterns in model scores by ensuring that the model is always being compared to the benchmark that best captures the flow variability of the catchment. The highest NSE scores were all found in catchments with strongly seasonal flow regimes, but the highest BME scores came from a more even distribution of flow regimes. Indeed, several catchments with strongly seasonal flow regimes had NSE scores above 0.5 with negative corresponding BME scores. This indicates that failing to use appropriate benchmarks for BME calculations in catchments with a strongly seasonal flow regime can mask the fact that the model cannot beat simple benchmarks and may provide an overly optimistic assessment of model performance. By selecting the most appropriate benchmark in each basin from a larger benchmark ensemble, the resulting spatial overview of model performance found through the BME approach is less conflated with flow variability. This results in BME values that are more strongly focused on the added value of using the model over alternative ways to predict the variable of interest. This strongly affects the conclusions one might draw about where a model is fit-for-purpose, and where improvements in model performance may be most readily achieved.
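
The benchmark-efficiency idea described in this abstract can be illustrated with a minimal Python sketch. It assumes that the BME keeps the NSE form and only swaps the mean observed flow for a benchmark series; the function names, the two benchmarks, and the toy flow values are hypothetical and are not taken from the authors' code or data.

```python
from statistics import mean

def benchmark_efficiency(obs, sim, bench):
    """Assumed NSE-style skill score: 1 - SSE(model) / SSE(benchmark)."""
    sse_model = sum((o - s) ** 2 for o, s in zip(obs, sim))
    sse_bench = sum((o - b) ** 2 for o, b in zip(obs, bench))
    return 1.0 - sse_model / sse_bench

def nse(obs, sim):
    """The NSE is the special case where the benchmark is the mean observed flow."""
    return benchmark_efficiency(obs, sim, [mean(obs)] * len(obs))

def lowest_bme(obs, sim, benchmarks):
    """Return (name, score) of the benchmark that yields the lowest BME."""
    scores = {name: benchmark_efficiency(obs, sim, b) for name, b in benchmarks.items()}
    name = min(scores, key=scores.get)
    return name, scores[name]

# Hypothetical, strongly seasonal toy series (arbitrary units).
obs = [1.0, 5.0, 9.0, 5.0, 1.0, 5.0, 9.0, 5.0]
sim = [2.0, 5.0, 8.0, 5.0, 2.0, 5.0, 8.0, 5.0]
benchmarks = {
    "mean_flow": [mean(obs)] * len(obs),
    "seasonal_cycle": [1.5, 5.0, 8.5, 5.0, 1.5, 5.0, 8.5, 5.0],
}

print(nse(obs, sim))                      # high score against the mean-flow benchmark
print(lowest_bme(obs, sim, benchmarks))   # negative score against the seasonal benchmark
```

In this toy case the simulation scores well against the mean flow but fails to beat a simple seasonal benchmark, which is the masking effect the abstract describes for strongly seasonal regimes.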

How to cite: Coderre, P., Knoben, W., Thébault, C., Vásquez, N., Clark, M., and Pietroniro, A.: The impact of benchmark selection on spatial patterns of model evaluation metrics, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-15343, https://doi.org/10.5194/egusphere-egu26-15343, 2026.

A.5
|
EGU26-14520
|
ECS
Neharika Bhattarai, Martyn Clark, Manabendra Saharia, Darri Eythorsson, Nicolas Vasquez, and Cyril Thébault

Hydrological simulations in the Indian subcontinent are affected by substantial uncertainty arising from the selection of model structure and parameterization. Nevertheless, most studies in the Indian subcontinent have relied on a single model structure to simulate streamflow, without evaluating how such a choice might impact simulations in ungauged basins. In this study, we comprehensively evaluate 277 gauged basins spanning the diverse hydro-climatic regions of the Indian subcontinent using the Framework for Understanding Structural Errors (FUSE). FUSE allows testing different model structures within a controlled experimental framework, enabling the systematic evaluation of the impact of different model structures on streamflow simulations. For each gauged basin, we calibrate 78 FUSE structures and evaluate their performance with respect to basic benchmarking models using the HydroBM Python package and simulations from the Noah-Multiparameterization Land Surface Model (Noah-MP LSM).

For regionalization, we generate ensembles of 500 parameter sets for each selected decision structure using Latin Hypercube Sampling and propagate these through FUSE to characterize predictive uncertainty in ungauged basin simulations. These simulations are subsequently used to train surrogate emulators of the performance response surface, enabling efficient transfer of parameters from gauged to ungauged basins. Results indicate strong variability in predictive skill, uncertainty, and regionalization performance across model structures, highlighting that structures identified as optimal in calibrated basins do not necessarily generalize under parameter transfer. These findings underscore the need to consider structural benchmarking, uncertainty reliability, and regionalization performance when developing hydrological modelling frameworks for data-scarce regions such as the Indian subcontinent.
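
As an illustration of the sampling step mentioned above, the following is a minimal, standard-library-only sketch of Latin Hypercube Sampling. The parameter ranges, the seed, and the function name are hypothetical placeholders rather than the FUSE parameters or settings used in the study.

```python
import random

def latin_hypercube(n_samples, bounds, seed=0):
    """Draw n_samples points so that every parameter has exactly one value
    in each of n_samples equal-width strata of its range."""
    rng = random.Random(seed)
    columns = []
    for low, high in bounds:
        # One uniformly random point inside each stratum [i/n, (i+1)/n).
        unit = [(i + rng.random()) / n_samples for i in range(n_samples)]
        # Shuffle independently per parameter to decouple the dimensions.
        rng.shuffle(unit)
        columns.append([low + u * (high - low) for u in unit])
    return [tuple(col[i] for col in columns) for i in range(n_samples)]

# Hypothetical ranges for three parameters (not actual FUSE parameter bounds).
bounds = [(0.0, 1.0), (50.0, 500.0), (0.01, 10.0)]
samples = latin_hypercube(500, bounds)
print(len(samples), samples[0])
```

Each of the 500 parameter sets would then be passed to one model run. Compared with plain random sampling, the stratification guarantees that the full range of every parameter is covered.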

How to cite: Bhattarai, N., Clark, M., Saharia, M., Eythorsson, D., Vasquez, N., and Thébault, C.: Evaluating Hydrological Model Structures and Parameter Transfer across the Indian Subcontinent, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-14520, https://doi.org/10.5194/egusphere-egu26-14520, 2026.

A.6
|
EGU26-2064
|
ECS
Shuaihong Zang, Xiuguang Wu, and Jinbin Mu

Flood simulation in small and medium-sized catchments across China is constrained by limited hydrometeorological data and pronounced hydroclimatic heterogeneity. The wflow_sbm model offers promising potential, as seamless parameter fields can be estimated from global datasets via pedotransfer functions (PTFs), enabling explicit representation of the spatial and temporal variability of catchment characteristics. However, the extent to which parameter sensitivity differs between humid and semi-humid regions with distinct runoff-generation mechanisms remains insufficiently understood. Moreover, it is unclear whether parameters derived by PTFs are directly applicable to small and medium-sized catchments or require regional adjustment.

In this study, the wflow_sbm model is applied to two representative Chinese catchments: the humid Tunxi basin and the semi-humid Chenhe basin. Distributed parameters are derived from global datasets using the HydroMT model setup and preprocessing framework. We (1) systematically analyze the sensitivity of three key parameters governing soil water dynamics (KsatHorFrac, InfiltCapSoil and SoilThickness) in humid and semi-humid basins, (2) assess the applicability of seamless parameter maps derived by PTFs and evaluate the necessity of regional adjustment, and (3) benchmark the performance of wflow_sbm against the Xin’anjiang (XAJ) model, which is well established in China, including multi-site validation to assess spatial robustness.

Results reveal clear regional differences in parameter sensitivity: KsatHorFrac and InfiltCapSoil dominate runoff responses in the humid Tunxi basin, whereas KsatHorFrac and SoilThickness exert the strongest control in the semi-humid Chenhe basin. The PTF-derived SoilThickness (~2 m) in Chenhe leads to systematic underestimation of flood volume and peaks. Reducing it to ~0.2 m substantially improves model performance and is consistent with vadose-zone depth estimates from the XAJ model, highlighting SoilThickness as a key control in semi-humid basins. The results also show that the wflow_sbm model achieves performance comparable to XAJ in both catchments, with an average NSE of 0.85 in Tunxi and generally NSE >0.7 in Chenhe. The good performance at the internal stations in Tunxi (average NSE > 0.70) further demonstrates that the parameter maps derived by PTFs are applicable and reliable for small and medium-sized basins.

Overall, wflow_sbm is applicable for flood simulation in small and medium-sized catchments in humid and semi-humid regions and is particularly advantageous in data-scarce basins. However, its application in semi-humid regions requires appropriate adjustment of SoilThickness, which can be guided by parameter ranges inferred from the XAJ model.

How to cite: Zang, S., Wu, X., and Mu, J.: Evaluating the Applicability of wflow_sbm Model with Seamless Parameter Maps for Streamflow Simulation in Small and Medium-Sized River Basins, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-2064, https://doi.org/10.5194/egusphere-egu26-2064, 2026.

A.8
|
EGU26-2613
|
ECS
Oumar Jaffar, Abdessamad Hadri, El Mahdi El Khalki, Khaoula Ait Naceur, Mohamed Elmehdi Saidi, Yves Tramblay, and Abdelghani Chehbouni

The hydrological performance of precipitation products, which are widely used as forcing inputs in hydrological models, has been extensively evaluated in recent years. While factors such as similarity to observed precipitation, rain gauge data incorporation, spatial resolution, geographic location, climate classes, and catchment characteristics are commonly cited to explain differences in model outcomes, the correlation between precipitation products and observed streamflow (CC(P-product,Q)) has received little attention. This study investigates the role of this often-overlooked element in influencing the hydrological performance of precipitation datasets. Through four complementary experiments (using six precipitation inputs, eight structurally different hydrological models, and a large-sample hydrology approach), we demonstrate that CC(P-product,Q) is an important explanatory factor of model performance differences. Specifically, our results show that (i) CC(P-product,Q) is significantly related to model performance, (ii) it can be used as a proxy to identify hydrologically best-performing products, and (iii) these best-performing products are not necessarily the ones with the most physically realistic representation of precipitation over a given study area. Our findings highlight the need to investigate CC(P-product,Q) alongside traditional evaluation metrics (e.g., KGE) to gain a more comprehensive evaluation of precipitation products in hydrological modeling. This study also emphasizes the importance of not overlooking the evaluation of precipitation products against ground precipitation observations, whenever such data are available, to avoid achieving high model performance at the cost of accurate precipitation representation.
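
A minimal sketch of the CC(P-product,Q) screening described above is given below. It assumes the correlation is a plain Pearson coefficient between a product's catchment-average precipitation series and observed streamflow on matching time steps; the product names and values are invented for illustration, and the abstract does not specify lagging or aggregation choices.

```python
from math import sqrt

def pearson_cc(x, y):
    """Pearson correlation coefficient between two equally long series."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def rank_products(products, streamflow):
    """Sort product names by CC(P-product, Q), highest first."""
    cc = {name: pearson_cc(p, streamflow) for name, p in products.items()}
    return sorted(cc.items(), key=lambda item: item[1], reverse=True)

# Hypothetical series (same time steps, arbitrary units).
streamflow = [1.0, 2.0, 3.0, 4.0, 5.0]
products = {
    "product_a": [2.0, 4.0, 6.0, 8.0, 10.0],
    "product_b": [5.0, 3.0, 4.0, 1.0, 2.0],
}

for name, value in rank_products(products, streamflow):
    print(name, round(value, 2))
```

As the abstract cautions, a high CC(P-product,Q) does not by itself imply a physically realistic precipitation field, so such a ranking would complement rather than replace a comparison against ground observations.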

How to cite: Jaffar, O., Hadri, A., El Khalki, E. M., Ait Naceur, K., Saidi, M. E., Tramblay, Y., and Chehbouni, A.: Revisiting the Hydrological Evaluation of Precipitation Products: Don’t Forget to Check Their Correlation with Streamflow, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-2613, https://doi.org/10.5194/egusphere-egu26-2613, 2026.

A.9
|
EGU26-3497
Sheikh Muhammad Asad, Zhenyu Wang, and Andreas Hartmann

Hydrological models are an important tool for understanding the complex interactions of a catchment's water balance and for supporting water resource management. The widespread practice of calibrating these models is based on streamflow, which causes problems such as inaccurate representation of other important fluxes and equifinality, where different parameter sets yield similar modelling results. These problems reduce model interpretability and robustness, and propagate uncertainty into processes like regionalization.

Using 935 German catchments from the CAMELS-DE dataset, supported by groundwater level records and hydrogeological descriptors, we compared univariate (streamflow-only) and multivariate (streamflow + groundwater) calibration strategies. Several groundwater representation approaches and objective functions were tested. Correlation-based evaluation of groundwater storage outperformed bias-insensitive KGE, yielding higher median streamflow KGE values during calibration (0.75 vs. 0.71) and validation (0.64 vs. 0.60), confirming groundwater levels as reliable indicators of groundwater storage. Hydrogeological characteristics also showed a strong influence on model performance.

When the resulting parameter sets were used in the PASS regionalization framework, both models were found to reduce equifinality. The multivariate-based regionalization performed significantly better, even with parameters that showed greater variability during the calibration phase. We also found that lowland catchments showed lower model efficiency during the local calibration phase. During regionalization, the slow-draining porous catchments showed greater variability in the nonlinear groundwater parameter βGW for the multivariate model than for the univariate model. Overall, the approach underscores the importance of additional constraints for improving the physical interpretability of the model and reducing the uncertainty and equifinality of the resulting parameter sets.

How to cite: Asad, S. M., Wang, Z., and Hartmann, A.: Multivariate calibration and regionalization of a conceptual hydrological model using streamflow and groundwater level, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-3497, https://doi.org/10.5194/egusphere-egu26-3497, 2026.

A.10
|
EGU26-19091
Simon Stisen, Lars Troldborg, Maria Ondracek, and Raphael Schneider

A central element in water management in Denmark is the National Hydrological Model for Denmark (DK-model). The DK-model is a multi-purpose, distributed, integrated hydrological model coupling 3D groundwater flow with root zone processes, overland flow and river routing, combined with major human impacts such as groundwater abstraction.

The model is applied for a range of national-scale analyses and provides publicly available data for historic periods, in real time and for future projections. Applications include assessment of available water resources, effects of abstractions, nitrate transport and climate change impact assessments.

Model development and calibration are an ongoing process that seeks to improve performance across a range of model objectives and to meet the requirements of end users.

Recently, the calibration and parameterization of the DK-model has moved towards more spatially distributed parametrization schemes and new calibration targets regarding spatial patterns of evapotranspiration, drain fraction maps and irrigation volumes. This in combination with the large-scale distributed nature and high computational demand of the model system requires a pragmatic optimization approach that allows for both multi-objective and efficient optimization.

This is approached through the Pareto Archived Dynamically Dimensioned Search (PADDS) algorithm allowing a robust global parameter search effective even at a few hundred model runs. In addition, the PADDS approach enables a systematic analysis of tradeoffs between different objectives with minimal a-priori weighting of objective function groups.

In this study we specifically analyse the value of multiple objective functions by comparing optimizations based solely on conventional groundwater head observations and streamflow targets versus a more complex objective function scheme including seasonal groundwater fluctuations, evapotranspiration patterns, drain fractions and irrigation.

This analysis illustrates tradeoffs and equifinalities that are relevant for screening behavioral parameter sets for application of a multi-purpose model. In addition, a scheme for selecting an ensemble of parameter sets is illustrated.
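
The trade-off analysis described above rests on the notion of non-dominated (Pareto-optimal) parameter sets. The sketch below shows only that non-dominance filter, assuming every objective is to be minimised; it is not an implementation of PADDS, and the objective values are invented.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and strictly better
    in at least one (all objectives are minimised)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(points):
    """Keep the points that are not dominated by any other point."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

# Hypothetical (head error, streamflow error) pairs for five parameter sets.
candidates = [(0.20, 0.90), (0.30, 0.50), (0.50, 0.40), (0.60, 0.60), (0.90, 0.10)]
front = pareto_front(candidates)
print(front)
```

Here the set (0.60, 0.60) is dropped because another candidate is better in both objectives, while the remaining four represent different compromises between the two targets.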

How to cite: Stisen, S., Troldborg, L., Ondracek, M., and Schneider, R.: Multi-objective optimization to explore trade-offs in a multi-purpose national scale integrated hydrological model, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-19091, https://doi.org/10.5194/egusphere-egu26-19091, 2026.

A.11
|
EGU26-77
|
ECS
Darine Saad, Carolina Acuña Alonso, and Xana Álvarez Bermudez

Calibration and validation are fundamental steps in hydrological modeling, ensuring the accuracy and reliability of model predictions for effective management of freshwater resources. Traditional calibration approaches rely solely on streamflow, although other hydrological variables (soil moisture, evapotranspiration) have also proved useful. One of the most widely used hydrological models is the Soil and Water Assessment Tool (SWAT), which simulates spatial and temporal variations in watershed processes such as the water balance, streamflow routing, and the transport of nutrients and sediments. In this study, SWAT+, a revised version of the SWAT model, was applied to the Ulla River basin within the Galicia-Costa Hydrographic Demarcation in Spain to evaluate the performance of three calibration strategies: (1) single-variable calibration using streamflow (SC-Q), (2) single-variable calibration using evapotranspiration (SC-ET), and (3) multivariate calibration, integrating both streamflow and evapotranspiration (MC-QET). Multi-site calibration and validation were performed using the Sequential Uncertainty Fitting Algorithm (SUFI-2), with the Nash-Sutcliffe efficiency (NSE) index as the objective function and NSE ≥ 0.60 defined as the behavioral threshold. Observed streamflow data was obtained from three river gauging stations distributed along the river network (one downstream and two upstream). Ground-truth evapotranspiration (ET) data were estimated via triple collocation analysis combining three independent datasets (remote sensing-based, land surface model output, and reanalysis product). Results revealed that for streamflow, the MC-QET calibration scheme yielded the best performance at the downstream validation site (NSE = 0.82, PBIAS = -4.51), whereas SC-Q achieved superior results at the upstream stations (NSE = 0.82-0.86 and PBIAS = +6.35 – +12.72). 
Meanwhile, SC-ET performed the worst for streamflow overall, although model performance was still acceptable (NSE = 0.70 – 0.75). For evapotranspiration, both SC-ET (NSE = 0.89, PBIAS = +6.56) and MC-QET (NSE = 0.90, PBIAS = +6.24) clearly outperformed SC-Q (NSE = 0.66, PBIAS = +22.97). These findings suggest that while streamflow- or ET-only calibration can optimize the targeted variable, incorporating multiple hydrological variables during model calibration improves the overall representation of watershed processes and the water balance. However, the acceptable performance of the ET-only calibration highlights that this calibration scheme can still serve as a valid alternative in data-scarce regions where streamflow observations are limited or inconsistent. Furthermore, this study demonstrates the reliability of triple collocation analysis in improving ET estimates by reducing uncertainty among independent data sources. In conclusion, integrating multivariate calibration strategies in SWAT+ significantly enhances spatial transferability, ensures physically realistic model outputs, and improves overall prediction reliability. At the same time, ET-based calibration alone remains a practical and defensible option for data-limited watersheds, demonstrating the growing potential of remote sensing–driven hydrological modeling for comprehensive and resilient water resource assessment.
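
For readers less familiar with the two scores reported above, the following minimal sketch computes NSE and PBIAS and applies the NSE ≥ 0.60 behavioral threshold. The PBIAS sign convention used here (positive values indicating model underestimation) is an assumption, since the abstract does not state which convention was applied, and the series are invented.

```python
def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 - SSE / variance of observations about their mean."""
    mean_obs = sum(obs) / len(obs)
    sse = sum((o - s) ** 2 for o, s in zip(obs, sim))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - sse / ss_tot

def pbias(obs, sim):
    """Percent bias; assumed convention: positive means the model underestimates."""
    return 100.0 * sum(o - s for o, s in zip(obs, sim)) / sum(obs)

def is_behavioral(obs, sim, threshold=0.60):
    """Apply the NSE >= 0.60 behavioral threshold mentioned in the abstract."""
    return nse(obs, sim) >= threshold

# Hypothetical observed and simulated series (arbitrary units).
obs = [2.0, 4.0, 6.0, 8.0]
sim = [1.0, 4.0, 5.0, 8.0]
print(round(nse(obs, sim), 2), round(pbias(obs, sim), 2), is_behavioral(obs, sim))
```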

How to cite: Saad, D., Acuña Alonso, C., and Álvarez Bermudez, X.: Multisite and multivariate calibration of the SWAT+ model in Galicia-Costa, NW Spain, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-77, https://doi.org/10.5194/egusphere-egu26-77, 2026.

A.12
|
EGU26-11573
|
ECS
Leon Frederik De Vos, Karan Mahajan, Daniel Caviedes-Voullième, Faizal Rohmat, Muhammad Farras Adiprayoga, and Nils Rüther

Accurate two-dimensional hydrodynamic flood modeling in urban environments requires mesh resolutions that can capture complex flow patterns around buildings and infrastructure, while maintaining computational efficiency. However, generating suitable meshes for such applications is often time-consuming, mainly due to the complex layout of buildings in urban areas. This contribution presents an efficient and automated mesh generation workflow tailored for refined 2D flood modeling in complex urban areas. The approach introduces rule-based local mesh refinements around buildings, flow paths, and critical urban features, while maintaining a coarse resolution elsewhere. First, geometrical input data sets, such as building outlines or water body outlines, are preprocessed to ensure their geometric validity. The building geometry data set is then further analyzed and processed to ensure a refined, yet not excessive, mesh resolution between buildings, taking into account user-given thresholds for mesh resolution. Finally, the mesh is generated based on the processed input data using the Triangle mesher developed by Shewchuk (1996). The framework is designed to be automated yet user-controlled, enabling reproducible and scalable mesh generation for urban flood hazard assessment. Its performance is demonstrated through application to an urban test case in Majalaya, Indonesia, highlighting improvements in accuracy–efficiency trade-offs and suitability for operational flood risk modeling.

Reference:

Shewchuk, J. R.: Triangle: Engineering a 2D Quality Mesh Generator and Delaunay Triangulator, in: Applied Computational Geometry: Towards Geometric Engineering, edited by Lin, M. C. and Manocha, D., vol. 1148 of Lecture Notes in Computer Science, pp. 203–222, Springer-Verlag, from the First ACM Workshop on Applied Computational Geometry, 1996.

How to cite: De Vos, L. F., Mahajan, K., Caviedes-Voullième, D., Rohmat, F., Adiprayoga, M. F., and Rüther, N.: Efficient and Automated Mesh Generation for Refined Flood Modeling in Complex Urban Environments, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-11573, https://doi.org/10.5194/egusphere-egu26-11573, 2026.

A.13
|
EGU26-20396
Emma Robinson, Rosanna Lane, Helen Baron, and Elizabeth Cooper

The UK Hydro-MIP is a community-led model intercomparison project (MIP) for hydrological and land-surface modelling in the United Kingdom (UK). It was co-developed with the UK hydrological and land-surface modelling community to carry out coordinated modelling of historical streamflow for over 600 catchments across Great Britain. Participants followed a modelling protocol to ensure consistency while representing how models are used in practice. A variety of model types have been contributed, sampling the breadth of river flow modelling in the UK. The model ensemble has been evaluated and benchmarked using observed streamflow records and the participants and wider community were invited to contribute to initial analysis of the model ensemble through a community hackathon event. The resulting data set will be published later this year, providing a valuable resource to the wider hydrological community.

The UK Hydro-MIP provides an important insight into UK hydrological and land-surface modelling, allowing investigations of model uncertainty and highlighting potential research gaps. We present the development of the UK Hydro-MIP, from designing the protocol through to analysis of the results. We will discuss lessons learned from organising this MIP and demonstrate the value of model intercomparisons through our initial results.

How to cite: Robinson, E., Lane, R., Baron, H., and Cooper, E.: The UK Hydro-MIP: Lessons learned from a hydrological and land-surface model intercomparison project, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-20396, https://doi.org/10.5194/egusphere-egu26-20396, 2026.

Posters virtual: Wed, 6 May, 14:00–18:00 | vPoster spot A

The posters scheduled for virtual presentation are given in a hybrid format for on-site presentation, followed by virtual discussions on Zoom. Attendees are asked to meet the authors during the scheduled presentation & discussion time for live video chats; onsite attendees are invited to visit the virtual poster sessions at the vPoster spots (equal to PICO spots). If authors uploaded their presentation files, these files are also linked from the abstracts below. The button to access the Zoom meeting appears just before the time block starts.
Discussion time: Wed, 6 May, 16:15–18:00
Display time: Wed, 6 May, 14:00–18:00

EGU26-22432 | Posters virtual | VPS9

Benchmarking flexible modelling framework Shyft across mainland Norway 

Olga Silantyeva, Shaochun Huang, and Chong-Yu Xu
Wed, 06 May, 14:24–14:27 (CEST)   vPoster spot A

Developing hydrological models that are process-aware and reliably transferable across diverse environments remains a challenge. We benchmark Shyft, an open-source, fully FAIR (findable, accessible, interoperable and reusable) flexible modeling framework, across 109 catchments in mainland Norway to evaluate how model structure, forcing uncertainty and calibration objective jointly shape streamflow simulation performance. We adopt a large-sample hydrology perspective to probe five model “stacks”, providing alternative process choices for evapotranspiration (Penman-Monteith vs Priestley-Taylor), snowmelt (temperature-index vs semiphysical) and runoff response (Kirchner vs HBV tank and soil), with multiple goal functions drawn from Kling-Gupta Efficiency (KGE) and Nash-Sutcliffe Efficiency (NSE), with and without catchment-specific precipitation correction. We use a suite of evaluation metrics targeting bias, hydrograph dynamics, low flows and interannual variability. We move beyond crude mean-flow benchmarks toward simple climatological benchmarks, providing an objective context for model skill evaluation, given the seasonal nature of Norwegian catchments.


The evaluation revealed that configurations containing temperature-index snow simulation and Kirchner runoff offer the greatest robustness and generality across all hydrological regimes. In terms of objective functions, KGE-based targets outperform NSE-based targets, with a metric combining KGE and Box-Cox transformed KGE (KGE_bcKGE) identified as a promising generalist objective that performs well across diverse metrics, including the low-flow-targeted KGE(1/Q) and interannual NSE. Furthermore, precipitation correction was found to be essential for improving performance in Mountain and Inland regimes, suggesting snow undercatch as a primary source of precipitation uncertainty. Among simple benchmarks, the daily mean was found to be the best predictor, setting expectations for future model intercomparisons in the region. Our results demonstrate the need to balance structural adequacy, forcing uncertainty and equifinality.


This project is supported by the Norwegian Research Council (NFR), project 336621.

How to cite: Silantyeva, O., Huang, S., and Xu, C.-Y.: Benchmarking flexible modelling framework Shyft across mainland Norway, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-22432, https://doi.org/10.5194/egusphere-egu26-22432, 2026.

EGU26-5899 | Posters virtual | VPS9

High-performance task-based water balance modeling 

Octavio Castillo Reyes, Junjie Li, Ashkan Hassanzadeh, and Enric Vázquez-Suñé
Wed, 06 May, 14:27–14:30 (CEST)   vPoster spot A

Water balance modeling plays a pivotal role in sustainable water management, as it underpins the understanding of hydrological processes that govern resource distribution, ecosystem stability, and long-term environmental planning. Accurate and efficient computational tools are essential to capture the spatial and temporal dynamics of water balance, particularly in complex geological and urban environments. WaterpyBal is an innovative modeling framework specifically designed to construct spatial-temporal water balance models. It effectively integrates multiple stages of hydrological assessment, including data interpolation, evapotranspiration estimation, and infiltration computation, while accounting for soil heterogeneity and components of the urban water cycle. The tool demonstrates robust performance when applied to both synthetic and experimental datasets, providing reliable and scalable results.

In the context of the exascale era, where data-intensive environmental models demand unprecedented computational power, High-Performance Computing (HPC) frameworks are essential to ensure scalability and efficiency. To this end, WaterpyBal has been enhanced through its integration with PyCOMPSs, the Python binding of the COMPSs programming model. PyCOMPSs enables the transparent parallelization of Python applications by identifying task-level parallelism through annotated methods and dynamically constructing a task-dependency graph during runtime. This graph-driven execution model allows efficient scheduling and data management across distributed computing infrastructures such as clusters and cloud platforms.
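
To make the idea of graph-driven execution concrete, the sketch below runs a small task-dependency graph with the Python standard library only. It is a stand-in, not PyCOMPSs and not WaterpyBal code: the scheduler `run_graph`, the task names and the placeholder formulas are all hypothetical, and the graph is declared by hand here, whereas PyCOMPSs derives it at runtime from annotated methods.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait
from graphlib import TopologicalSorter


def run_graph(tasks, deps, max_workers=4):
    """Run callables in dependency order, executing independent tasks in parallel.

    tasks: name -> callable taking the results of its dependencies, in order.
    deps:  name -> tuple of names that must finish first.
    """
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises CycleError if the graph is not acyclic
    results, running = {}, {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while sorter.is_active():
            # Submit every task whose dependencies are already satisfied.
            for name in sorter.get_ready():
                args = [results[d] for d in deps[name]]
                running[pool.submit(tasks[name], *args)] = name
            # Block until at least one running task finishes, then release its successors.
            done, _ = wait(running, return_when=FIRST_COMPLETED)
            for future in done:
                name = running.pop(future)
                results[name] = future.result()
                sorter.done(name)
    return results


# Hypothetical stages with placeholder arithmetic (not WaterpyBal's formulas).
tasks = {
    "precipitation": lambda: [2.0, 0.0, 5.0],   # interpolated rainfall per cell
    "evapotranspiration": lambda: [1.0, 1.0, 1.0],
    "infiltration": lambda p, et: [max(pi - ei, 0.0) for pi, ei in zip(p, et)],
}
deps = {
    "precipitation": (),
    "evapotranspiration": (),
    "infiltration": ("precipitation", "evapotranspiration"),
}

results = run_graph(tasks, deps)
print(results["infiltration"])
```

The two input stages have no mutual dependency, so they can run concurrently, while the infiltration step starts only once both have finished.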

The integration of WaterpyBal with PyCOMPSs significantly improves its computational performance, enabling the simulation of large-scale, high-resolution water balance models within feasible timeframes. This work demonstrates the potential of combining advanced hydrological modeling with state-of-the-art parallel computing frameworks to address emerging challenges in environmental modeling and resource management at scale.

How to cite: Castillo Reyes, O., Li, J., Hassanzadeh, A., and Vázquez-Suñé, E.: High-performance task-based water balance modeling, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-5899, https://doi.org/10.5194/egusphere-egu26-5899, 2026.

EGU26-1436 | Posters virtual | VPS9

Comparative Evaluation of the Newly Developed HIDROTURK-Phase II Hydrological Model in the North Marmara River Basin, Türkiye 

Meltem Kacikoc, Buket Mesta, Okan Fistikoglu, Huseyin Ozkaya, and Kubra Ozdemir-Calli
Wed, 06 May, 15:06–15:09 (CEST)   vPoster spot A

Abstract

Reliable hydrological modelling tools that can operate with the type and quality of data commonly available at the basin scale are essential for effective water resources planning. In recent years, the HIDROTURK model has been developed to support national hydrological assessments in Türkiye, especially in modelling tasks undertaken as part of river basin management planning processes. This study presents one of the first comprehensive evaluations of the newly updated HIDROTURK Phase II model and compares its performance with the AQUATOOL + EVALHID hydrological modelling system. The North Marmara River Basin was selected as the test region due to its complex hydrological structure and diverse sub-basin characteristics.

Hydrological simulations were carried out using long-term meteorological inputs derived from precipitation and evapotranspiration records for the period 1989–2014, enabling the examination of the models under a wide range of climatic conditions. Streamflow outputs were compared with observations at 14 calibration points, and model performance was assessed using the Nash–Sutcliffe Efficiency (NSE) and Percent Bias (PBIAS) indicators.
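
For reference, the two indicators can be computed as in the minimal sketch below. The flow values are invented for illustration, and the PBIAS sign convention shown (positive values indicate underestimation by the model) is an assumption, since conventions differ between studies and the abstract does not state which one was used.

```python
def nse(obs, sim):
    """Nash-Sutcliffe Efficiency: 1 is perfect, 0 is no better than the observed mean."""
    mean_obs = sum(obs) / len(obs)
    err = sum((o - s) ** 2 for o, s in zip(obs, sim))
    var = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - err / var


def pbias(obs, sim):
    """Percent bias; with this sign convention, positive means the model underestimates."""
    return 100.0 * sum(o - s for o, s in zip(obs, sim)) / sum(obs)


# Invented flows at one hypothetical calibration point.
observed = [10.0, 20.0, 30.0, 40.0]
simulated = [9.0, 18.0, 27.0, 36.0]

print("NSE:", nse(observed, simulated))
print("PBIAS (%):", pbias(observed, simulated))
```

Reporting both is useful because NSE is dominated by high-flow timing errors, whereas PBIAS isolates the overall volume error.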

The results indicate that the meteorological inputs generated through HIDROTURK’s internal processing tools show a high level of agreement with data prepared using more traditional methods, and that both models produced comparable flow patterns under similar conditions. Overall, the findings demonstrate that HIDROTURK Phase II exhibits stable behavior even at this early stage of development and provides a practical and reliable alternative for hydrological simulations.

Keywords: Hydrological Modelling; Basin-Scale Simulation; Model Comparison; Model Performance Evaluation; Streamflow Calibration

ACKNOWLEDGEMENT: The authors would like to express their gratitude to the projects “Technical Assistance on Preparation of River Basin Management Plans for Six Basins (EuropeAid/140294/IH/SER/TR)” and “Development and Sustainability of the HIDROTURK Model Project” for their support. The authors also thank the Directorate General for Water Management, the State Hydraulic Works, and the General Directorate of Meteorology for providing essential data.

How to cite: Kacikoc, M., Mesta, B., Fistikoglu, O., Ozkaya, H., and Ozdemir-Calli, K.: Comparative Evaluation of the Newly Developed HIDROTURK-Phase II Hydrological Model in the North Marmara River Basin, Türkiye, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-1436, https://doi.org/10.5194/egusphere-egu26-1436, 2026.
