ITS1.7/CL0.3 | Advancing Earth System Models using Machine Learning
Convener: Jack Atkinson | Co-conveners: Laura Mansfield, Milan Klöwer, Alex Connolly
Orals | Tue, 05 May, 16:15–18:00 (CEST) | Room -2.62
Posters on site | Attendance Tue, 05 May, 14:00–15:45 (CEST) | Display Tue, 05 May, 14:00–18:00 | Hall X5
Machine learning (ML) is being used throughout the geophysical sciences with a wide variety of applications. Advances in big data, deep learning, and other areas of artificial intelligence (AI) have opened up a number of new approaches to traditional problems.

Many fields (climate, ocean, numerical weather prediction, space weather etc.) make use of large numerical models and are now seeking to enhance these by combining them with scientific ML/AI techniques. Examples include ML emulation of computationally intensive processes, data-driven parameterisations for sub-grid processes, ML assisted calibration, and uncertainty quantification of parameters, amongst other applications.

However, doing so brings a number of unique challenges, including but not limited to:

- enforcing physical compatibility, consistency, and conservation laws,
- ensuring numerical stability,
- coupling numerical models to ML frameworks and language interoperation,
- developing and using differentiable models and model components,
- handling computer architectures and data transfer,
- adapting/generalising to different models, resolutions, or climates,
- explaining, understanding, and evaluating model performance and biases,
- quantifying uncertainties and their sources,
- tuning physical or ML parameters after coupling to numerical models (derivative-free optimisation, Bayesian optimisation, ensemble Kalman methods, etc.).

Addressing these requires knowledge of several areas and builds on advances already made in domain science, numerical simulation, machine learning, high-performance computing, data assimilation etc.

Following success over the past two years at EGU, we again solicit talks that address any topics relating to the above. Anyone working to combine machine learning techniques with numerical modelling is encouraged to participate in this session.

Orals: Tue, 5 May, 16:15–18:00 | Room -2.62

The oral presentations are given in a hybrid format supported by a Zoom meeting featuring on-site and virtual presentations. The button to access the Zoom meeting appears 15 minutes before the time block starts.
Chairpersons: Jack Atkinson, Laura Mansfield, Milan Klöwer
16:15–16:20
16:20–16:30 | EGU26-5525 | ECS | solicited | On-site presentation
Arthur Grundner, Tom Beucler, Julien Savre, Axel Lauer, Manuel Schlund, and Veronika Eyring

Hybrid Earth system models (ESMs) that combine physical laws with machine learning (ML) demonstrate great potential to reduce uncertainties in climate projections, particularly for subgrid processes like clouds. However, widespread adoption faces critical challenges: deep learning "black boxes" often lack interpretability and physical consistency, and coupling them with standard ESMs remains difficult due to stability issues and the need for complex re-calibration. Here, a two-step method is presented to improve a climate model with data-driven parameterizations. First, we incorporate a physically consistent cloud cover parameterization, derived from storm-resolving simulations via symbolic regression, into the ICON atmospheric climate model. We refer to this hybrid configuration, which retains the interpretability and efficiency of the traditional model, as ICON-A-MLe. Second, we address the coupling and tuning bottleneck by introducing an automated, gradient-free calibration procedure based on the Nelder-Mead algorithm. This method efficiently calibrates ICON-A-MLe without requiring differentiable physical components, making it easily extendable to other ESMs. Our results show that the tuned ICON-A-MLe substantially reduces long-standing biases. Specifically, it reduces cloud cover errors over the Southern Ocean by 75% and in subtropical stratocumulus regions by 44%. These improvements also lead to a better top-of-atmosphere radiative budget. Crucially, the model demonstrates strong generalization capabilities: it remains robust and physically consistent under significantly warmer climate scenarios. These results demonstrate that interpretable machine-learned parameterizations, paired with practical tuning, can efficiently and transparently strengthen ESM fidelity.

How to cite: Grundner, A., Beucler, T., Savre, J., Lauer, A., Schlund, M., and Eyring, V.: Reduced cloud cover errors in a hybrid AI-climate model through equation discovery and automatic tuning, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-5525, https://doi.org/10.5194/egusphere-egu26-5525, 2026.
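The automated, gradient-free calibration described in the abstract uses the Nelder-Mead algorithm. A minimal sketch of that optimisation pattern follows, with a toy quadratic objective standing in for an expensive ICON-A-MLe simulation; the objective, parameter values, and tolerances are purely illustrative, not the authors' configuration.

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical objective: area-weighted RMSE of cloud cover against a
# reference, as a function of tunable parameters. In practice each
# evaluation would run a short climate simulation; here a toy quadratic
# stands in for that expensive forward model.
def cloud_cover_rmse(params):
    target = np.array([0.7, 1.3, 0.2])  # "true" parameter values (toy)
    return float(np.sqrt(np.mean((params - target) ** 2)))

x0 = np.zeros(3)  # first-guess parameter vector
result = minimize(cloud_cover_rmse, x0, method="Nelder-Mead",
                  options={"xatol": 1e-4, "fatol": 1e-6})

print(result.x)    # recovered parameters, close to `target`
print(result.fun)  # final RMSE, close to zero
```

Because Nelder-Mead only evaluates the objective, the physical components never need to be differentiable, which is what makes the procedure portable across ESMs.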

16:30–16:40 | EGU26-1118 | ECS | On-site presentation
Blanka Balogh, Hugo Germain, Olivier Geoffroy, and David Saint-Martin

This study presents a data-driven parameterization of deep convection, implemented and tested within the global climate model ARP-GEM at 50 km resolution. Initially, a 'naive' neural network was used to replace ARP-GEM's traditional physical parameterization. A 30-year simulation with this data-driven approach revealed significant biases, particularly in the representation of high clouds.
To address these biases, we developed a two-fold neural network architecture: one component responsible for detecting the triggering of convection and another responsible for computing convective tendency terms. This refined parameterization substantially improved performance compared to the initial version. Furthermore, the enhanced parameterization was evaluated under warmer climate conditions, demonstrating online stability and consistent overall fidelity.

How to cite: Balogh, B., Germain, H., Geoffroy, O., and Saint-Martin, D.: Online test of a data-driven parameterization of deep-convection: evaluation in present and future climate, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-1118, https://doi.org/10.5194/egusphere-egu26-1118, 2026.
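The two-fold design described above (a trigger component gating a tendency component) can be sketched schematically. The functions, variable names, and CAPE threshold below are hypothetical stand-ins for the authors' neural networks, not their actual architecture.

```python
import numpy as np

# Stand-in for the trigger classifier: decides whether deep convection
# is active in a column (hypothetical threshold on CAPE).
def trigger_net(state):
    return state["cape"] > 1000.0

# Stand-in for the tendency regressor: returns convective tendency terms
# (hypothetical linear relaxation, illustrative only).
def tendency_net(state):
    return -0.01 * state["temperature_profile"]

# The two components composed: tendencies are applied only where
# convection is triggered, zero elsewhere.
def convective_tendencies(state):
    if trigger_net(state):
        return tendency_net(state)
    return np.zeros_like(state["temperature_profile"])

state = {"cape": 1500.0, "temperature_profile": np.array([250.0, 280.0])}
print(convective_tendencies(state))  # triggered column: nonzero tendencies
```

Separating triggering from magnitude prediction avoids forcing a single regressor to learn the sharp on/off behaviour of convection, which is one plausible reason the refined scheme outperforms the naive one.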

16:40–16:50 | EGU26-17576 | ECS | On-site presentation
Yiling Ma, Nathan Luke Abraham, Stefan Versick, Roland Ruhnke, Andrea Schneidereit, Ulrike Niemeier, Felix Back, Peter Braesicke, and Peer Nowack

Atmospheric ozone is a crucial absorber of solar radiation and an important greenhouse gas. However, most climate models participating in the Coupled Model Intercomparison Project (CMIP) still lack an interactive representation of ozone due to the high computational costs of atmospheric chemistry schemes. Here, we introduce a machine learning parameterization (mloz) to interactively model daily ozone variability and trends across the troposphere and stratosphere in common CMIP simulations, including pre-industrial, abrupt-4xCO2 (Ma et al., 2025), historical, and future Shared Socioeconomic Pathway (SSP) scenario simulations. We demonstrate its high fidelity on decadal timescales and its flexible use online across two different climate models: the UK Earth System Model (UKESM) and the German ICOsahedral Nonhydrostatic (ICON) model. With meteorological variables and forcing data as inputs, mloz produces stable ozone predictions around 31 times faster than the chemistry scheme in UKESM, contributing less than 4% of the respective total climate model runtimes. In particular, we also demonstrate its transferability to different climate models without chemistry schemes by transferring the parameterization from UKESM to ICON in standard climate sensitivity simulations. This highlights mloz’s potential for widespread adoption in CMIP-level climate models that lack interactive chemistry for future climate change assessments, where ozone trends and variability will significantly modulate atmospheric feedback processes.

Reference:
Ma Y, Abraham N L, Versick S, et al. mloz: A Highly Efficient Machine Learning-Based Ozone Parameterization for Climate Sensitivity Simulations[J]. arXiv preprint arXiv:2509.20422, 2025.

How to cite: Ma, Y., Abraham, N. L., Versick, S., Ruhnke, R., Schneidereit, A., Niemeier, U., Back, F., Braesicke, P., and Nowack, P.: mloz: A Highly Efficient Machine Learning-Based Ozone Parameterization for CMIP Simulations, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-17576, https://doi.org/10.5194/egusphere-egu26-17576, 2026.

16:50–17:00 | EGU26-9688 | ECS | On-site presentation
Amanda Duarte, Amirpasha Mozaffari, Marina Castaño, Stefano Materia, and Miguel Castrillo Melguizo

Accurately simulating the terrestrial carbon cycle remains a major challenge in climate science, due in part to uncertainties in how slow-varying land-surface boundaries and fast-varying biophysical states are represented and coupled in Earth-system models.  We introduce a unified data-driven framework designed to generate high-resolution (1 km) historical reconstructions and future projections of Land Use (LU), Land Cover (LC), and Leaf Area Index (LAI) for real-time coupling with digital twin platforms, such as those deployed in the Destination Earth framework.

Moving beyond sequential downscaling, this framework treats the generation of boundary conditions as a cohesive multi-task learning problem. We benchmark two distinct modeling strategies: (1) Architectures trained from scratch, where we compare the performance of convolutional baselines (U-Net) against attention-based Vision Transformers (ViT) in capturing spatial heterogeneity; and (2) Foundation Model (FM) Adaptation, where we leverage state-of-the-art Earth FMs (such as  TerraMind and Prithvi) as backbones. Within this second strategy, we evaluate the trade-offs between full fine-tuning, parameter-efficient techniques using adapters, and models trained from scratch.

By integrating static geophysical features with high-frequency climate reanalysis (ERA5) and atmospheric CO2​ concentrations, the framework ensures that vegetation dynamics remain phenologically consistent with environmental forcing. We assess these approaches based on their computational efficiency, generalization across sparse data regimes, and physical consistency between categorical (LU/LC) and continuous (LAI) variables. The final output is a suite of open-source interoperable emulators designed to act as dynamic, on-demand boundary condition generators. 

 

How to cite: Duarte, A., Mozaffari, A., Castaño, M., Materia, S., and Castrillo Melguizo, M.: A Unified Data-Driven Framework for High-Resolution Land Surface Boundary Conditions, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-9688, https://doi.org/10.5194/egusphere-egu26-9688, 2026.

17:00–17:10 | EGU26-11235 | ECS | On-site presentation
Brian Groenke, Maha Badri, Yunan Lin, Maximilian Gelbrecht, and Niklas Boers

Global land surface and hydrological models are crucial components of Earth System Models (ESMs). In addition to providing realistic boundary conditions for the atmosphere and ocean components, they also play a key role in understanding Earth’s changing energy imbalance and the response of the terrestrial carbon and water cycles to anthropogenic climate change. The land surface components of most ESMs typically rely on reduced-complexity parameterizations of land processes in order to efficiently resolve the transient coupling of the land surface to the atmosphere at global scales. The complexity of such models is therefore limited by the coarse spatial resolution of the atmosphere and thus they are not easily constrained by in situ and remote sensing observations of land surface parameters. As a result, offline downscaled and bias-corrected climate models and reanalysis products are often used as forcings when calibrating land surface and hydrological models at local and regional scales. We argue that this lack of online coupling in the downscaling step is one of many factors contributing to persistent biases in modern ESMs. As such, there is a need for a new generation of land models which can support more flexible coupling with the atmosphere as well as the incorporation of data-driven components. Here we present Terrarium.jl, a Julia-based land modeling framework for GPU-accelerated and automatically differentiable simulations of soil, snow, and vegetation dynamics, along with their corresponding land-atmosphere exchange fluxes. We demonstrate the value of GPU acceleration and differentiability through a series of performance benchmarks and sensitivity analyses. We further present our initial experiments in achieving stable coupling to a reduced-complexity atmosphere model, SpeedyWeather.jl, as well as a proof-of-concept for online downscaling from the scale of an intermediate-complexity ESM (~5°) to that of ERA5 (~0.25°). 
We discuss the main challenges encountered thus far and outline a roadmap for future development.

How to cite: Groenke, B., Badri, M., Lin, Y., Gelbrecht, M., and Boers, N.: Terrarium.jl: A framework for fully differentiable and GPU-accelerated land modeling to enable online downscaling in coarse-scale ESMs, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-11235, https://doi.org/10.5194/egusphere-egu26-11235, 2026.

17:10–17:20 | EGU26-4851 | ECS | On-site presentation
Gregory Munday, Milan Klöwer, Laura Mansfield, and Maximilian Gelbrecht

In weather and climate models, momentum, heat, humidity and tracer fluxes between the Earth’s surface and atmosphere strongly depend on surface roughness. The roughness length depends on space- and time-dependent surface properties over ocean, sea ice and land. For example, surface winds impact wave height over sea-ice-free oceans; vegetation and orography determine roughness length over land, where its effect on near-surface turbulence strongly impacts the surface fluxes. Here, we present a set of machine learning models trained on reanalysis data to predict surface roughness over both land and ocean grid cells in SpeedyWeather, a Julia-based climate model. More accurately representing the surface roughness has been shown to significantly reduce model bias against observations over a range of variables such as surface air temperature and near-surface wind speed. We explore the downstream impacts of using this parameterisation in the climate model, and test the generalisability of an offline-learned surface roughness scheme in future climates with reduced sea ice and land-use change. Spatial generalisation is achieved through surface roughness being a function of local variables only. We discuss efficient inference on CPU and GPU for every grid cell on each integration time-step. So-called model distillation via symbolic regression minimises the trade-off between speed and accuracy, enabling another route to rapid inference on a grid-cell basis. Further, we investigate online learning through differentiable physics parameterisations to calibrate the learned parameterisation to surface variables from ERA5 reanalysis. More broadly, we propose machine-learned schemes for individual climate processes as a step towards interpretable, data-driven climate modelling.

How to cite: Munday, G., Klöwer, M., Mansfield, L., and Gelbrecht, M.: A learned surface roughness scheme for climate prediction, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-4851, https://doi.org/10.5194/egusphere-egu26-4851, 2026.

17:20–17:30 | EGU26-11475 | ECS | On-site presentation
Maha Badri, Brian Groenke, Maximilian Gelbrecht, and Niklas Boers

Subgrid-scale (SGS) parameterizations remain a leading source of uncertainty in weather and climate models, where they represent the effects of unresolved processes occurring at scales smaller than the model’s grid resolution on the resolved fields. Similar closure problems arise in computational fluid dynamics (CFD), where turbulence models are needed to represent the impact of unresolved scales on the resolved flow. Despite recent progress toward differentiable, hybrid climate models enabled by automatic differentiation, most operational Earth system models (ESMs) remain effectively non-differentiable, limiting systematic online calibration and training.

While data-driven closures trained offline can perform well a priori, their performance often deteriorates a posteriori once coupled to the solver, because the coupled setting introduces feedbacks that are absent during offline training. In this study, we treat a controlled CFD turbulence setting as a benchmark for climate-relevant SGS learning and compare two classes of online training strategies for data-driven closures in non-differentiable models: (i) gradient-free ensemble Kalman inversion (EKI), leveraging the robustness and parallelism of ensemble-based inverse methods, and (ii) gradient-based optimization enabled by a learned differentiable emulator. For the emulator, we train a fast neural ODE surrogate of the forward model dynamics that preserves its structure and is differentiable by construction, enabling gradient-based training without modifying the original solver. We then evaluate both approaches using metrics such as accuracy, computational cost, and scalability.

How to cite: Badri, M., Groenke, B., Gelbrecht, M., and Boers, N.: Comparison of Online Training Methods for Data-Driven Subgrid-Scale Parameterizations in Non-Differentiable Models, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-11475, https://doi.org/10.5194/egusphere-egu26-11475, 2026.
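Ensemble Kalman inversion, strategy (i) above, needs only forward evaluations of the model. A minimal sketch of the perturbed-observation EKI update follows; the toy linear forward map, observations, ensemble size, and iteration count are illustrative assumptions, not the study's configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy non-differentiable forward map: stands in for "run the solver with
# closure parameters theta and measure resolved statistics".
def g(theta):
    return np.array([2.0 * theta[0] + theta[1], theta[0] - theta[1]])

y = np.array([4.0, -1.0])   # synthetic observations from true theta = [1, 2]
Gamma = 1e-4 * np.eye(2)    # observation noise covariance

J = 200                     # ensemble size
thetas = rng.normal(0.0, 2.0, size=(J, 2))  # initial parameter ensemble

for _ in range(20):         # EKI iterations (no gradients of g required)
    G = np.array([g(t) for t in thetas])           # ensemble of model outputs
    t_mean, g_mean = thetas.mean(0), G.mean(0)
    C_tg = (thetas - t_mean).T @ (G - g_mean) / J  # parameter-output cross-cov
    C_gg = (G - g_mean).T @ (G - g_mean) / J       # output covariance
    K = C_tg @ np.linalg.inv(C_gg + Gamma)         # Kalman-type gain
    noise = rng.multivariate_normal(np.zeros(2), Gamma, size=J)
    thetas = thetas + (y + noise - G) @ K.T        # perturbed-observation update

print(thetas.mean(0))  # ensemble mean approaches the true parameters [1, 2]
```

Each iteration is embarrassingly parallel over ensemble members, which is the parallelism the abstract contrasts with gradient-based training through a differentiable emulator.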

17:30–17:40 | EGU26-351 | ECS | On-site presentation
Pritthijit Nath, Sebastian Schemm, Henry Moss, Peter Haynes, Emily Shuckburgh, and Mark Webb

Sub-grid parameterisations in climate models are traditionally static and tuned offline, limiting adaptability to evolving states. This work introduces FedRAIN-Lite, a federated reinforcement learning (FedRL) framework that mirrors the spatial decomposition used in general circulation models (GCMs) by assigning agents to latitude bands, enabling local parameter learning with periodic global aggregation. Using a hierarchy of simplified energy-balance climate models, from a single-agent baseline (ebm-v1) to multi-agent ensemble (ebm-v2) and GCM-like (ebm-v3) setups, we benchmark three RL algorithms under different FedRL configurations. Results show that Deep Deterministic Policy Gradient (DDPG) consistently outperforms both static and single-agent baselines, with faster convergence and lower area-weighted RMSE in tropical and mid-latitude zones across both ebm-v2 and ebm-v3 setups. DDPG's ability to transfer across hyperparameters and low computational cost make it well-suited for geographically adaptive parameter learning. This capability offers a scalable pathway towards high-complexity GCMs and provides a prototype for physically aligned, online-learning climate models that can evolve with a changing climate. Code accessible at https://github.com/p3jitnath/climate-rl-fedRL.

How to cite: Nath, P., Schemm, S., Moss, H., Haynes, P., Shuckburgh, E., and Webb, M.: FedRAIN-Lite: Federated Reinforcement Algorithms for Improving Idealised Numerical Weather and Climate Models, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-351, https://doi.org/10.5194/egusphere-egu26-351, 2026.

17:40–17:50 | EGU26-19396 | ECS | On-site presentation
Filippo Quarenghi and Tom Beucler

Ensuring that deep learning models generalize across distinct regimes remains a fundamental challenge in Earth system modeling. Due to the inherent violation of the independent and identically distributed (i.i.d.) assumption, models optimized for local conditions rarely exhibit robust performance on unseen domains. While Unsupervised Domain Adaptation (UDA) is a well-established technique for mitigating such distribution shifts in computer vision, its application to Earth system modeling remains underexplored. In this study we investigate the efficacy of UDA for the super-resolution of atmospheric fields, utilizing kilometer-scale COSMO simulations [1] and the RainShift benchmark dataset [2] to quantify model robustness across different regions. We apply residual learning to jointly super-resolve precipitation and surface pressure, incorporating static predictors such as topography. To quantify transferability, we propose a systematic framework that trains on source domains and evaluates on unseen target domains, treating spatial transfer as a proxy for model robustness under distribution shifts. We introduce a consistency metric to evaluate model adaptation by comparing mean performance on seen versus unseen domains. We assess a hierarchy of adaptation methods, ranging from simple regularization to physics-informed approaches. These include domain-specific regularization and distribution alignment methods, domain adversarial training, and geometry-robust training via group-equivariant convolutions. Preliminary results on the COSMO simulations demonstrate that even elementary adaptation strategies, such as dropout and data augmentation, improve cross-domain consistency. This work establishes a controlled setup for benchmarking generalization, suggesting that UDA offers a viable pathway to bridge the gap between locally trained models and global applicability.

[1]: Cui, R., Thurnherr, I., Velasquez, P., Brennan, K. P., Leclair, M., Mazzoleni, A., et al. (2025). A European hail and lightning climatology from an 11-year kilometer-scale regional climate simulation. Journal of Geophysical Research: Atmospheres, 130, e2024JD042828. https://doi.org/10.1029/2024JD042828

[2]: Paula Harder et al. RainShift: A Benchmark for Precipitation Downscaling Across Geographies. 2025. arXiv: 2507.04930 [cs.CV]. url: https://arxiv.org/abs/2507.04930.

How to cite: Quarenghi, F. and Beucler, T.: Transferring knowledge across regions: unsupervised domain adaptation for km-scale super-resolution, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-19396, https://doi.org/10.5194/egusphere-egu26-19396, 2026.

17:50–18:00 | EGU26-18440 | ECS | Highlight | On-site presentation
Tom Beucler, David Neelin, Hui Su, Christopher Bretherton, Will Chapman, Costa Christopoulos, Aditya Grover, Ignacio Lopez-Gomez, Tapio Schneider, Adam Subel, Oliver Watt-Meyer, and Laure Zanna

Are AI-powered climate models intrinsically more efficient than traditional climate models?

While progress is still needed before they become operational, hybrid AI-physics climate models and AI emulators of climate models have the potential to sharply reduce inference cost relative to traditional CPU-based models, allowing larger ensembles to explore different scenarios and sharpen uncertainty estimation. Yet this apparent efficiency becomes less obvious when the comparison includes GPU-ported dynamical climate models, and when efficiency is assessed against the effective complexity of the simulated climate system.

As a first step, recognizing that a perfect apples-to-apples comparison is rarely possible from reported configurations, we synthesize reported performance for leading AI climate model emulators (e.g., ACE2, CAMulator), hybrid AI-physics models (e.g., CliMA, NeuralGCM), and GPU-accelerated traditional models (e.g., SCREAM, ICON). We examine two complementary scaling views. The first compares throughput (simulated years per day) per accelerator (GPUs or TPUs) and per prognostic variable, as a function of horizontal grid spacing. The second compares the same normalized throughput against an effective complexity proxy, defined as the number of vertical levels divided by the product of the time step and the squared horizontal grid spacing, to account for the simulated vertical structure and, importantly, time-step constraints imposed by numerical stability.

We find that AI-powered models can show favorable apparent scaling with horizontal resolution in raw throughput, but that the advantage becomes modest once effective complexity is accounted for: at comparable complexity, AI climate models do not appear intrinsically more efficient than GPU-ported dynamical models. Hybrid approaches occupy a distinct middle ground: their acceleration and added value come primarily from learned parameterizations that improve the representation of unresolved processes while the overall model retains a physically-based dynamical core, including explicit conservation laws. AI climate model emulators, by contrast, offer their clearest computational advantage through task-targeted prediction, where a limited set of climate-relevant variables can be directly simulated on the grid of interest. This avoids integrating the full high-frequency, multivariate state at the short time step traditionally required for numerical stability, which is especially advantageous when emulating a fine-resolution reference model with a coarser emulator. Diverse downscaling or targeted post-processing strategies can further substitute for explicit fine-scale resolution when observations are available, enabling inexpensive local or hazard-specific risk assessment at decadal to multi-decadal time horizons.

How to cite: Beucler, T., Neelin, D., Su, H., Bretherton, C., Chapman, W., Christopoulos, C., Grover, A., Lopez-Gomez, I., Schneider, T., Subel, A., Watt-Meyer, O., and Zanna, L.: Reassessing the Scaling of AI-Powered Climate Models Against Dynamical Counterparts, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-18440, https://doi.org/10.5194/egusphere-egu26-18440, 2026.
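The two normalizations used in this comparison can be written down directly from the abstract's definitions. The example configurations below are hypothetical stand-ins chosen only to show the scale separation, not the abstract's measured numbers.

```python
# Effective complexity proxy from the abstract: vertical levels divided by
# (time step * squared horizontal grid spacing). Units here are illustrative.
def effective_complexity(n_levels, dt_s, dx_km):
    return n_levels / (dt_s * dx_km ** 2)

# Normalized throughput: simulated years per day, per accelerator and per
# prognostic variable.
def normalized_throughput(sim_years_per_day, n_accelerators, n_prognostic_vars):
    return sim_years_per_day / (n_accelerators * n_prognostic_vars)

# Hypothetical configurations (NOT the abstract's reported values): a coarse
# emulator with few levels and a long step vs. a fine dynamical model whose
# short step is forced by numerical stability.
coarse_emulator = effective_complexity(n_levels=8, dt_s=21600.0, dx_km=100.0)
fine_dynamical = effective_complexity(n_levels=128, dt_s=60.0, dx_km=3.0)
print(fine_dynamical / coarse_emulator)  # the fine model is ~6.4 million x more "complex"
```

The proxy's dependence on the time step is the key point: an emulator that sidesteps stability-limited time stepping scores a large apparent throughput gain that the complexity normalization then discounts.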

Posters on site: Tue, 5 May, 14:00–15:45 | Hall X5

The posters scheduled for on-site presentation are only visible in the poster hall in Vienna. If authors uploaded their presentation files, these files are linked from the abstracts below.
Display time: Tue, 5 May, 14:00–18:00
Chairpersons: Jack Atkinson, Milan Klöwer, Laura Mansfield
X5.135 | EGU26-2982 | ECS
Christian Wirths, Urs Hofmann Elizondo, Philipp Hess, and Frerk Pöppelmeier

Earth System Models of Intermediate Complexity (EMICs) are essential tools for investigating climate dynamics on millennial to orbital time scales, which are computationally prohibitive for high-resolution CMIP-class models. The computational efficiency of EMICs is primarily achieved by reduced spatial resolution of the atmosphere and ocean components. However, EMICs often couple ice-sheet and terrestrial vegetation components, which require much higher spatial resolution. The coupling of these components therefore remains a major challenge and often results in inadequate climatic forcing for these sub-modules, particularly regarding precipitation patterns. Generative machine learning, specifically diffusion models and their variants, has emerged as a powerful technique to bridge this resolution gap. Here, we present the integration of a consistency model-based approach to facilitate efficient, online downscaling of temperature and precipitation within the Bern3D EMIC with negligible computational overhead.

To achieve this, the consistency model was trained on monthly ERA5 ensemble output to learn the mapping from the coarse Bern3D grid to high-resolution fields.  This approach successfully reconstructs high-resolution spatial variability while maintaining inference speeds compatible with long model integration times, effectively avoiding additional runtime costs. This framework therefore allows for the representation of small-scale heterogeneity in surface boundary conditions which is critical for realistic ice sheet and vegetation dynamics. 

Ultimately, this approach opens new avenues to investigate complex climate-ice-vegetation feedback on orbital time scales, such as during the Last Glacial Cycle or the Mid-Pleistocene Transition.

How to cite: Wirths, C., Hofmann Elizondo, U., Hess, P., and Pöppelmeier, F.: Towards Generative Machine Learning-based Downscaling for Atmosphere-Surface Coupling in the Bern3D EMIC, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-2982, https://doi.org/10.5194/egusphere-egu26-2982, 2026.

X5.136 | EGU26-12174 | ECS
Edward Gow-Smith and Roland Séférian

The Pelagic Interaction Scheme for Carbon and Ecosystem Studies (PISCES) is a marine biogeochemical model that is used in several IPCC-Class Earth System models. PISCES simulates the distribution of nutrients (four macronutrients and one micronutrient) that regulate the growth of two phytoplankton classes (nanophytoplankton and diatoms). It also simulates the ocean carbon cycle with a complete representation of the marine carbonate systems. PISCES includes 24 state variables, and increases the runtime of NEMO, the physical ocean model with which it is coupled, by a factor of 3.4, indicating a high computational cost.

PISCES-AI has been developed as a U-Net based machine learning PISCES emulator, which takes a small number of input variables (TOS, ZOS, SOS, PAR, atmospheric CO2), and predicts two output variables: surface chlorophyll and the difference in partial pressure of CO2 between the atmosphere and the ocean. These are the only outputs which have a direct influence on climate simulations by Earth system models. Previous work has shown the predictive power of PISCES-AI across multiple timescales, and in an out-of-domain setting.

In this work, we couple the AI emulator of PISCES to NEMO, using Eophis and Morays for Python-Fortran interaction. We evaluate its performance, as well as its computational efficiency, to give a holistic picture of the challenges and opportunities for AI emulation of ocean biogeochemistry. With a particular interest in computational speed, we find that inference for a single time-step takes around 10 ms, with a much larger preliminary bottleneck due to CPU-GPU transfer (200 ms per time-step). Even with this bottleneck, our implementation obtains a speed-up of a factor of 3 compared to PISCES, and we explore ways in which the data transfer bottleneck could be reduced.

How to cite: Gow-Smith, E. and Séférian, R.: Coupling of NEMO to a neural network emulator of PISCES, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-12174, https://doi.org/10.5194/egusphere-egu26-12174, 2026.

X5.137 | EGU26-12294 | ECS
Anna Pazola, Domna Ladopoulou, Carrow Morris-Wiltshire, Pritthijit Nath, and Alejandro Coca-Castro

Reliable high-resolution precipitation fields are essential for hydrology, flood risk management, agriculture, and climate impact assessment, yet remain difficult to reconstruct from sparse and irregular rain gauge networks. Reanalysis products such as ERA5 provide physically consistent estimates but are constrained by coarse effective resolution, temporal smoothing, and weak local observational constraints. By formulating interpolation of spatiotemporal precipitation fields as a probabilistic context-to-target regression problem using neural process (NP) models, this study assesses whether NP-based approaches can outperform reanalysis and classical interpolation for local-to-regional rainfall reconstruction. Using high-quality UK rain gauge observations combined with gridded auxiliary variables from ERA5, we implement convolutional NPs within the DeepSensor framework and compare them with a transformer-based NP variant.

Models are jointly conditioned on dense meteorological fields and sparse precipitation observations, and output full predictive distributions using a Bernoulli–Gamma likelihood to capture intermittency and extremes. Training is performed with random sensor masking to enforce location-agnostic learning and enable zero-shot prediction at unseen coordinates. Performance is evaluated against ERA5 and Kriging using identical data splits, with emphasis on interpolation accuracy as well as calibration and robustness to sensor sparsity. Generalisation is further assessed through few-shot and zero-shot transfer across regions with contrasting regimes, including England, Scotland, and selected GHCN domains in the US.

Using NPs, this work aims to recover sharper spatial structure, improved uncertainty calibration, and higher-frequency precipitation estimates relative to ERA5 under sparse-observation scenarios, and also evaluates their potential as robust, uncertainty-aware additions to physics-based models for high-resolution environmental monitoring.

How to cite: Pazola, A., Ladopoulou, D., Morris-Wiltshire, C., Nath, P., and Coca-Castro, A.: Learning Spatiotemporal Precipitation Fields with Probabilistic Neural Processes, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-12294, https://doi.org/10.5194/egusphere-egu26-12294, 2026.

X5.138
|
EGU26-13183
|
ECS
Laura Mansfield and Hannah Christensen

Representing and quantifying uncertainty in physical parameterisations is a central challenge in weather and climate modelling, and approaches are often developed separately for different timescales. Here, we consider the separation of uncertainty by source in machine learning frameworks for subgrid-scale parameterisations. In this context, aleatoric uncertainty arises from internal variability in the training data, while epistemic uncertainty arises from poorly constrained parameters during training. Using the Lorenz 1996 system as a testbed for simplified chaotic dynamics, we treat both uncertainties within a unified framework based on Bayesian neural networks, exploring how the different sources of uncertainty evolve over different prediction timescales.
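The Lorenz 1996 system used as a testbed above is a standard low-dimensional chaotic model. A minimal sketch of its single-scale variant with an RK4 time step follows; the exact configuration in the study (e.g. the coupled fast variables whose effect a parameterisation would represent, forcing, and step size) is an assumption here:

```python
import numpy as np

def l96_tendency(x, F=8.0):
    """Single-scale Lorenz 1996: dx_k/dt = (x_{k+1} - x_{k-2}) x_{k-1} - x_k + F."""
    return (np.roll(x, -1) - np.roll(x, 2)) * np.roll(x, 1) - x + F

def step_rk4(x, dt=0.05, F=8.0):
    """Advance the state one step with classical fourth-order Runge-Kutta."""
    k1 = l96_tendency(x, F)
    k2 = l96_tendency(x + 0.5 * dt * k1, F)
    k3 = l96_tendency(x + 0.5 * dt * k2, F)
    k4 = l96_tendency(x + dt * k3, F)
    return x + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)
```

A small perturbation of the unstable fixed point x_k = F grows chaotically, which is what makes the system a useful proxy for separating uncertainty sources across prediction timescales.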

How to cite: Mansfield, L. and Christensen, H.: Separating Epistemic and Aleatoric Uncertainties in Weather and Climate Models, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-13183, https://doi.org/10.5194/egusphere-egu26-13183, 2026.

X5.139
|
EGU26-12770
|
ECS
Thomas Wilder and Hongmei Li

The integration of machine learning parameterisations within climate models is paving the way for the next generation of Earth system models. Machine learning parameterisations are being developed to represent ocean and atmosphere processes such as turbulence, vertical mixing, and clouds and precipitation. These parameterisations typically require large volumes of high-resolution data for their training, often derived from the same numerical model that the parameterisation is intended for. This has the advantage that the machine learning model is only exposed to one set of numerical discretisation schemes.

Recently, global km-scale models have been introduced that simulate climate processes in remarkable detail. By explicitly resolving mesoscale and sub-mesoscale eddies and filaments, these models capture heat, carbon, and salt fluxes without the need for parameterisations. Global km-scale models are therefore promising sources of training data for machine learning parameterisations.

In this work we examine two global km-scale models that could be employed for oceanic turbulence parameterisations: NEMO ORCA36 and ICON-O. The ORCA36 model uses a tripolar grid, whereas ICON-O uses an icosahedral grid. The key question is whether either model can inform new ML parameterisations that are deployable in any numerical model. We therefore assess the two models by exploring and contrasting their energetics, as well as their heat, salt, and carbon transports. This work takes a first step towards model-independent machine learning parameterisation development, while facilitating further cross-modelling-centre collaboration.

How to cite: Wilder, T. and Li, H.: Towards model-independent machine learning parameterisations of meso-scale eddies, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-12770, https://doi.org/10.5194/egusphere-egu26-12770, 2026.

X5.140
|
EGU26-17216
|
ECS
Ségolène Crossouard, Masa Kageyama, Mathieu Vrac, Thomas Dubos, Soulivanh Thao, and Yann Meurdesoif

In an Atmospheric General Circulation Model (AGCM), the representation of subgrid-scale physical phenomena, also referred to as physical parameterizations, requires substantial computational time, which constrains the model's numerical efficiency. However, emulators based on machine learning offer a promising alternative to these traditional approaches.

We have developed offline emulators of ICOLMDZ (DYNAMICO coupled to LMDZ), the atmospheric component of the IPSL climate model, in an idealized aquaplanet configuration, with the aim of emulating all the parameterizations, i.e. the LMDZ atmospheric physics component. While the results are promising, fundamental questions arise, particularly concerning the generalization of the emulation process to meteorological conditions not seen by the emulator. This step is essential before the emulator can be adopted as a substitute for traditional parameterizations.

This question of generalization, which concerns the ability of emulators to infer and adapt to new system states, has been studied through climate-change experiments. We first investigated the performance of our emulators, trained on an aquaplanet configuration, in extrapolating to new aquaplanets whose boundary conditions are modified to simulate climates warmer and colder than the climate on which the emulators are trained. The results reveal the potential of the aquaplanet emulators to reproduce the physical parameterizations of these new climates. However, they also show the limitations of the aquaplanet emulators, which encountered difficulties generalizing to a realistic configuration, i.e. one including continents, topography, and sea ice.

This study motivates coupling the emulators to the dynamical core, DYNAMICO, in order to better assess the relevance of the learning process while analyzing the stability of the resulting simulations.

How to cite: Crossouard, S., Kageyama, M., Vrac, M., Dubos, T., Thao, S., and Meurdesoif, Y.: Generalization ability of emulators in reproducing the physical parameterizations of the IPSL model, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-17216, https://doi.org/10.5194/egusphere-egu26-17216, 2026.

X5.141
|
EGU26-18986
|
ECS
Helge Heuer, Tom Beucler, Mierk Schwabe, Julien Savre, Manuel Schlund, and Veronika Eyring

Persistent systematic errors in Earth system models (ESMs) arise from difficulties in representing the full diversity of subgrid, multiscale atmospheric convection and turbulence. Machine learning (ML) parameterizations trained on short high-resolution simulations show strong potential to reduce these errors. However, stable long-term atmospheric simulations with hybrid (physics + ML) ESMs remain difficult, as neural networks (NNs) trained offline often destabilize online runs. Training convection parameterizations directly on coarse-grained data is challenging, notably because scales cannot be cleanly separated. This issue is mitigated using data from superparameterized simulations, which provide clearer scale separation. Yet, transferring a parameterization from one ESM to another remains difficult due to distribution shifts that induce large inference errors. Here, we present a proof-of-concept where a ClimSim-trained, physics-informed NN convection parameterization is successfully transferred to ICON-A. The scheme is (a) trained on adjusted ClimSim data with subtracted radiative tendencies, and (b) integrated into ICON-A. The NN parameterization predicts its own error, enabling mixing with a conventional convection scheme when confidence is low, thus making the hybrid AI-physics model tunable with respect to observations and reanalysis through mixing parameters. This improves process understanding by constraining convective tendencies across column water vapor, lower-tropospheric stability, and geographical conditions, yielding interpretable regime behavior. In AMIP-style setups, several hybrid configurations outperform the default convection scheme (e.g., improved precipitation statistics). With additive input noise during training, both hybrid and pure-ML schemes lead to stable simulations and remain physically consistent for at least 20 years, demonstrating inter-ESM transferability and advancing long-term integrability.
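The confidence-guided mixing described above, where the NN's self-predicted error controls how much weight it receives relative to the conventional convection scheme, can be illustrated with a minimal sketch. The exponential weighting and the tunable parameter `tau` are assumptions for illustration, not the authors' formulation:

```python
import numpy as np

def mixed_tendency(t_ml, t_conv, predicted_error, tau=1.0):
    """Blend ML and conventional convective tendencies.

    The weight decays towards the conventional scheme as the NN's
    self-predicted error grows; tau is a tunable mixing parameter
    that could be calibrated against observations or reanalysis.
    """
    w = np.exp(-predicted_error / tau)   # confidence in the ML scheme, in (0, 1]
    return w * t_ml + (1.0 - w) * t_conv
```

When the NN reports zero error the blend returns the ML tendency; for very large predicted errors it falls back almost entirely on the conventional scheme, which is one way to keep hybrid runs stable outside the training distribution.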

How to cite: Heuer, H., Beucler, T., Schwabe, M., Savre, J., Schlund, M., and Eyring, V.: Beyond the Training Data: Confidence-Guided Mixing of Parameterizations in a Hybrid AI-Climate Model, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-18986, https://doi.org/10.5194/egusphere-egu26-18986, 2026.

X5.142
|
EGU26-20353
|
ECS
Bradley Stanley-Clamp, Ingmar Posner, and Hannah Christensen

Data-driven parameterisations of sub-grid processes unlock the ability to surpass the current computational constraints of Earth system models. However, machine learning (ML) can be brittle. State-of-the-art ML approaches perform reliably on in-distribution data, exceeding human ability across a diverse range of tasks; yet when faced with shifts in the data distribution, performance degrades. In climate modelling, where the task is predicting the state of a non-stationary system, this is evidently a substantial issue. We illustrate this with the ClimSim dataset, forming spatio-temporal groups and showing quantitatively how even small shifts in distribution affect performance.

Next, we use the theory of compositional generalisation (CG) to build models that are less susceptible to these distribution shifts. Compositional generalisation is the formation of novel combinations of observed elementary components: the ability to decompose data into building blocks that are reused across both the in-domain and shifted domains, such that a model can capture a domain-shifted state through a set of in-domain, learnt abstractions. Inspired by these concepts, we propose architectural and regularisation changes to standard ML parameterisations to improve generalisation. Preliminary results for sub-grid process emulators suggest new insights into whether and how CG can reduce model sensitivity to domain shifts.

How to cite: Stanley-Clamp, B., Posner, I., and Christensen, H.: Beyond In-Distribution Skill: Towards Robust ML Parameterisations for Non-Stationary Climate Systems, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-20353, https://doi.org/10.5194/egusphere-egu26-20353, 2026.
