ITS1.10/BG10.6 | Machine learning and hybrid modelling for carbon cycle science, monitoring and carbon market policy
Convener: Carlos Rodriguez-Pardo (ECS) | Co-conveners: Kasia Tokarska de los Santos, Amirpasha Mozaffari (ECS), Vitus Benson (ECS), Kai-Hendrik Cohrs (ECS)
Orals | Thu, 07 May, 14:00–15:45 (CEST) | Room -2.62
Posters on site | Attendance Thu, 07 May, 16:15–18:00 (CEST) | Display Thu, 07 May, 14:00–18:00 | Hall X1
Carbon monitoring is becoming ever more critical as climate change accelerates and society turns to carbon management strategies, ranging from carbon credits to the preservation and restoration of natural carbon sinks. Yet the success of these approaches depends on robust science: measurements must be accurate, verification must be rigorous, and promises must be grounded in evidence. Machine learning (ML) is rapidly transforming carbon cycle research, offering new opportunities to integrate diverse data streams, harness remote sensing, and connect multiple lines of evidence across scales. This session will highlight recent advances in ML applications for investigating, monitoring, and managing the carbon cycle, spanning satellite-based greenhouse gas estimation, biomass and forest monitoring, soil and peatland carbon dynamics, wetland and ecosystem restoration, and the mapping of terrestrial and oceanic carbon storage. We particularly encourage contributions that address hybrid modeling, uncertainty quantification, ecological mapping, knowledge-guided and trustworthy ML in carbon markets and policy contexts. By bringing together advances from Earth observation, process modeling, and policy-relevant applications, this session aims to explore both the promises and challenges of ML in delivering actionable insights for carbon management and climate mitigation.

Orals: Thu, 7 May, 14:00–15:45 | Room -2.62

The oral presentations are given in a hybrid format supported by a Zoom meeting featuring on-site and virtual presentations. The button to access the Zoom meeting appears just before the time block starts.
Chairpersons: Amirpasha Mozaffari, Vitus Benson, Kai-Hendrik Cohrs
14:00–14:05
14:05–14:25 | EGU26-7566 | solicited | Highlight | On-site presentation
Christian Igel

Tree-based ecosystems play a crucial role in climate change mitigation by sequestering atmospheric CO₂. However, tree resource monitoring practices are often inconsistent, biased, and fail to account for trees outside forests, limiting the effectiveness of carbon credit systems and restoration strategies. This talk presents recent advances in large-scale tree ecosystem monitoring enabled by machine learning and remote sensing [1]. We demonstrate methods for estimating tree biomass and carbon stocks at continental and national scales based on high-resolution satellite imagery and LiDAR data using deep neural networks. Case studies include mapping 9.9 billion trees across African drylands [5], nationwide tree mapping and carbon stock estimation in Rwanda supporting efforts to achieve net-zero emissions [3], and assessing the overlooked contribution of trees outside forests in Europe [2]. We present an application of 3D point cloud deep neural networks to predicting vegetation biomass from airborne LiDAR [4]. Furthermore, we introduce an approach for predicting vertical vegetation structure from Sentinel-2 and spaceborne LiDAR (GEDI) data at 10 meter resolution, potentially providing insights into biodiversity, biomass, and human interventions [6]. These developments pave the way for accurate, high-resolution, and unbiased monitoring of tree biomass, supporting carbon cycle modelling and informing carbon market policies.

 

[1] Brandt et al. High-resolution sensors and deep learning models for tree resource monitoring. Nature Reviews Electrical Engineering, 2025

[2] Liu et al. The overlooked contribution of trees outside forests to tree cover and woody biomass across Europe. Science Advances, 2023

[3] Mugabowindekwe et al. Trees on smallholder farms and forest restoration are critical for Rwanda to achieve net zero emissions. Communications Earth & Environment, 2024

[4] Oehmcke et al. Deep point cloud regression for above-ground forest biomass estimation from airborne LiDAR. Remote Sensing of Environment, 2024

[5] Tucker et al. Towards continental scale monitoring of carbon stocks of individual trees in African dryland. Nature, 2023

[6] Zhang et al. A Vertical Vegetation Structure Model of Europe. Advances in Representation Learning for Earth Observation at EURIPS, 2025

How to cite: Igel, C.: Machine Learning and Remote Sensing for Monitoring Tree Biomass, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-7566, https://doi.org/10.5194/egusphere-egu26-7566, 2026.

14:25–14:35 | EGU26-13689 | ECS | Virtual presentation
Ando Shah, Nils Lehman, Philipp Hess, Ronald C. Cohen, and John Chuang

High-resolution greenhouse gas (GHG) estimation is critical for verifying emissions inventories and informing climate policy. Current state-of-the-art estimates rely on "bottom-up" inventories, which are expensive to maintain, subject to reporting lags, and sensitive to inconsistent data supply chains. Conversely, "top-down" global reanalysis products, such as CarbonTracker, offer high quality but lack the spatial resolution required for actionable local policy and for high-accuracy estimation of emissions from individual large polluters.

To bridge this gap, we present deepC, a method that leverages high-resolution simulation data to inform a generative prior while assimilating diverse ground-truth observations. We learn a patch-based diffusion prior from multi-resolution simulations of regional and global carbon transport to model the joint distribution of winds, surface fluxes, column concentrations, and emissions. We then apply a Bayesian posterior formulation to guide the generation process using sparse observations from six satellite missions, ground stations, and coarse global reanalysis. To ensure consistency over large regions, we employ a novel spatio-temporal Markov blanket scheme during posterior sampling, producing carbon emissions estimates at 1 km resolution.

We demonstrate the model's efficacy in the contiguous United States (CONUS) and Western Europe, achieving stable emissions trajectories with low error relative to high-quality ground sensor and TCCON data. Early experiments suggest that conditioning the prior on embeddings from remote sensing foundation models significantly improves generalization to unseen domains. Furthermore, the model is robust to distribution shifts, maintaining coherence under simulated future background CO2 levels. Finally, our approach yields well-calibrated uncertainty quantification at high inference speeds with ensemble generation, highlighting its potential for rapid, transparent emissions stocktaking and lag-free policymaking.

How to cite: Shah, A., Lehman, N., Hess, P., Cohen, R. C., and Chuang, J.: deepC: High-Resolution Carbon Emissions Monitoring via Spatio-Temporal Generative Data Assimilation, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-13689, https://doi.org/10.5194/egusphere-egu26-13689, 2026.

14:35–14:45 | EGU26-22809 | ECS | On-site presentation
Elena Fillola, Nawid Keshtmand, Jeff Clark, Matt Rigby, and Raul Santos-Rodriguez

The growing availability of satellite-based methane observations provides new opportunities to improve estimates of surface emissions. Inverse modelling frameworks commonly rely on Lagrangian Particle Dispersion Models (LPDMs) to simulate atmospheric transport and derive source–receptor relationships (“footprints”), but these approaches are computationally expensive and struggle to scale to the rapidly increasing volume of satellite data.

Previously, we introduced GATES (Graph-Neural-Network Atmospheric Transport Emulation System), a machine learning (ML) based emulator capable of reproducing LPDM footprint sensitivities three orders of magnitude faster than the underlying physics-based model, and demonstrated its application to infer methane emissions over South America. While such footprints capture the local contribution from surface fluxes, observed methane concentrations are often dominated by the background mole fraction associated with large-scale atmospheric transport entering the domain. Despite its importance, this background component has received comparatively little attention in ML-based transport emulation.

Here, we present a machine learning emulator for background methane mole fractions, designed to reproduce the contribution from outside the modelled domain to observed concentrations using meteorological and atmospheric state information. By combining this background emulator with the existing GATES footprint emulator, we construct a fully ML-driven pipeline capable of predicting total methane concentrations without requiring explicit LPDM simulations. We demonstrate that this framework reproduces key spatial and temporal characteristics of LPDM-based background estimates over South America, including seasonal structure, daily variability, and regional patterns, and we evaluate its performance within inversions that estimate Brazil’s methane emissions.

We further assess the scalability of the approach by applying the footprint emulator to regions outside the original training domain. While the model performs well when trained and evaluated within the same region, performance degrades when applied to unseen domains with different meteorological regimes. These results indicate that atmospheric transport learning is strongly domain-specific, highlighting both the potential and the limitations of transfer learning, and underscoring the need for region-specific training data when extending the approach to global emulation.

This work demonstrates the feasibility of a fully ML-driven atmospheric transport and background modelling framework for methane inversion, offering the next steps towards computationally efficient, satellite-based emissions monitoring.
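
The decomposition used in this abstract, where a modelled observation is the background mole fraction plus the footprint-weighted contribution of surface fluxes inside the domain, can be sketched in a few lines. This is an illustration only and not the GATES code: the function name, the three-cell domain, and all numbers are hypothetical, and the units are simply assumed to be consistent.

```python
def total_mole_fraction(background, footprint, flux):
    """Background plus footprint-weighted surface flux contribution.

    background: mole fraction entering from outside the domain (e.g. ppb)
    footprint:  sensitivity of the observation to each grid cell
    flux:       surface flux in each grid cell (same cell order)
    """
    if len(footprint) != len(flux):
        raise ValueError("footprint and flux must cover the same cells")
    local = sum(h * q for h, q in zip(footprint, flux))
    return background + local


# Hypothetical values for a three-cell domain (assumed, not from the study)
background_ppb = 1900.0
footprint = [0.5, 0.25, 0.0]
flux = [10.0, 20.0, 5.0]
modelled = total_mole_fraction(background_ppb, footprint, flux)
```

In an ML-driven pipeline of the kind described, both `background` and `footprint` would come from emulators rather than from an LPDM run; the sum itself stays the same.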

How to cite: Fillola, E., Keshtmand, N., Clark, J., Rigby, M., and Santos-Rodriguez, R.: Towards a Fully Machine Learning–Driven Methane Emissions Inference Pipeline at Global Scale, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-22809, https://doi.org/10.5194/egusphere-egu26-22809, 2026.

14:45–14:55 | EGU26-12172 | ECS | On-site presentation
Zherong Wu, Qing Zhu, Flavio Lehner, Wu Sun, César Terrer, Trevor W. Cambron, Richard J. Norby, William K. Smith, Jiaming Wen, Yiqi Luo, Feng Tao, Ning Wei, John D. Albertson, Youran Fu, Peifeng Ma, Xiangzhong Luo, Joshua Fan, Carla P. Gomes, and Ying Sun

Terrestrial ecosystems have cumulatively sequestered 24% of anthropogenic carbon dioxide (CO2) emissions since 1850 and are critical for mitigating future climate change. However, current Earth System Models (ESMs) remain highly uncertain in projecting future trajectories of this carbon sink capacity, hampering our predictive understanding of climate mitigation potential and impeding effective climate and carbon management policies. This study develops a novel framework that harnesses deep learning (DL) to constrain uncertainties of ESM-projected Gross Primary Production (GPP) and Net Ecosystem Production (NEP) through 2100. Specifically, we apply DL to characterize the “offset” between ESM-simulated output (using CMIP6 models) and best-available observational products (top-down, bottom-up). This offset is treated as arising from processes that current ESMs do not resolve but that DL can capture; once trained on historical periods, the DL model can be applied to adjust CMIP6 projections of the future. We find that DL significantly reduces the inter-model spread of GPP by ~56% and NEP by ~66% across the CMIP6 ESM ensemble. Under the medium emission scenario (SSP2-4.5), the ensemble-mean NEP in 2100 is much weaker, 2.42 ± 1.16 PgC yr⁻¹ compared to 5.52 ± 3.45 PgC yr⁻¹ in the raw CMIP6 projections, suggesting that future carbon sequestration capability is currently overestimated. DL also revealed a slower trajectory of NEP growth than the raw CMIP6 projection. Beyond curbing the uncertainties of CMIP6 projections, DL also captures key environmental sensitivities of carbon cycle processes such as CO2 fertilization and sensitivity to warming. These findings demonstrate the power of DL in curbing ESM projection uncertainties and suggest that relying solely on natural terrestrial carbon sinks for climate mitigation is unlikely to slow down climate warming.
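
The offset-correction idea can be illustrated with a deliberately simplified sketch. Here a constant mean offset per model stands in for the deep-learning model of the abstract, which is an assumption made for brevity; the model names and all values are hypothetical. Because each toy model differs from the observations by a constant, the correction removes the spread entirely, which is far stronger than the ~56% and ~66% reductions reported above.

```python
from statistics import mean, pstdev


def fit_offset(simulated_hist, observed_hist):
    """Mean historical offset (observation minus simulation) for one model."""
    return mean([o - s for s, o in zip(simulated_hist, observed_hist)])


def correct(projection, offset):
    """Apply the learned offset to a future projection."""
    return [p + offset for p in projection]


# Hypothetical historical NEP (PgC/yr): one observational product, two models
observed = [2.0, 2.0, 2.0]
hist = {"model_a": [3.0, 3.0, 3.0], "model_b": [1.0, 1.0, 1.0]}
future = {"model_a": [4.0], "model_b": [2.0]}

corrected = {
    name: correct(future[name], fit_offset(hist[name], observed))
    for name in hist
}

# Inter-model spread of the final projected value, before and after correction
raw_spread = pstdev([v[-1] for v in future.values()])
new_spread = pstdev([v[-1] for v in corrected.values()])
```

A learned, state-dependent offset would replace `fit_offset` in practice, so that the correction can vary with climate and CO2 conditions rather than being a single number.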

How to cite: Wu, Z., Zhu, Q., Lehner, F., Sun, W., Terrer, C., Cambron, T. W., Norby, R. J., Smith, W. K., Wen, J., Luo, Y., Tao, F., Wei, N., Albertson, J. D., Fu, Y., Ma, P., Luo, X., Fan, J., Gomes, C. P., and Sun, Y.: Artificial Intelligence Reveals a Weaker CMIP6 Terrestrial Carbon Sink with Reduced Uncertainty, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-12172, https://doi.org/10.5194/egusphere-egu26-12172, 2026.

14:55–15:05 | EGU26-14534 | ECS | On-site presentation
Piyu Ke, Xiaofan Gui, Stephen Sitch, Pierre Friedlingstein, Zhu Liu, and Philippe Ciais

Timely detection of climate-driven anomalies in terrestrial CO2 exchange is limited by the latency of current bottom-up and top-down flux products. Dynamic global vegetation model (DGVM) ensembles underpin the annual Global Carbon Budget, yet their reliance on forcing datasets updated on annual cycles delays the assessment of emerging extremes. Here we develop a member-wise machine-learning emulation system that reproduces monthly net biome production (NBP) from DGVM ensembles using near-real-time meteorological reanalysis and atmospheric CO2. The emulators learn each DGVM’s spatiotemporal response on a 0.5° grid, including memory effects from antecedent conditions, and can be run as an ensemble to provide both mean behaviour and spread. In strictly forward evaluation, the emulated ensemble preserves the seasonal cycle and interannual variability of global land–atmosphere CO2 exchange and captures the timing and broad spatial structure of deseasonalized anomalies. Skill is reduced in some tropical forest regions and the strongest positive and negative excursions are damped, indicating a conservative response under extremes. By replacing offline DGVM integrations with lightweight surrogates, this framework reduces product latency to approximately one month and delivers DGVM-consistent near-real-time CO2 flux estimates that can serve as operational priors for integrated carbon-cycle monitoring.
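
The abstract does not say how memory effects from antecedent conditions are represented. One common option, assumed here purely for illustration, is to feed the emulator lagged copies of each predictor alongside its current value. The helper name, the lags, and the data below are hypothetical.

```python
def lagged_features(series, lags):
    """Rows of [x_t, x_{t-lag1}, x_{t-lag2}, ...] for every t with full history."""
    max_lag = max(lags)
    return [
        [series[t]] + [series[t - lag] for lag in lags]
        for t in range(max_lag, len(series))
    ]


# Hypothetical monthly temperature anomalies for one grid cell
monthly_temp = [1.0, 2.0, 3.0, 4.0, 5.0]
rows = lagged_features(monthly_temp, lags=[1, 3])
```

Each row can then be paired with the NBP value of the corresponding month to train one emulator per DGVM ensemble member.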

How to cite: Ke, P., Gui, X., Sitch, S., Friedlingstein, P., Liu, Z., and Ciais, P.: Machine-learning emulation of DGVM ensembles enables low-latency terrestrial CO2 flux estimates, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-14534, https://doi.org/10.5194/egusphere-egu26-14534, 2026.

15:05–15:15 | EGU26-12145 | ECS | On-site presentation
Veera Vasenkari, Leif Backman, Juha Leskinen, Hannakaisa Lindqvist, Mari Pihlatie, Leena Järvi, and Liisa Kulmala

Urban vegetation takes up carbon and provides ecosystem services. Quantifying these benefits relies on land surface models like JSBACH, but high-resolution long-term simulations are computationally heavy and too complex for practical applications. Machine learning emulators offer a computationally efficient alternative. Here, we present daily and monthly emulators for gross primary production (GPP) and net ecosystem exchange (NEE) of CO₂ for different plant functional types (PFTs) in Helsinki: deciduous and coniferous trees, lawn, and crops represented by a 50/50 weighting of cereal and agricultural grass. The emulators are trained on JSBACH simulations for 1991–2015 and evaluated for 2016–2024. Predictor variables are derived from daily air temperature, precipitation, and shortwave radiation.

The emulators are based on gradient boosting models with automated hyperparameter optimization. We trained separate models for each target variable and PFT. To estimate the total value of a target variable for each 50 m × 50 m pixel in Helsinki, we combined PFT specific predictions weighted by the fractional coverage of each vegetation type within the pixel.
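
The per-pixel aggregation described above is a coverage-weighted sum of the PFT-specific predictions. A minimal sketch follows; the function name, the fractions, and the flux values are hypothetical, and the fractions are assumed to sum to at most 1, with any remainder treated as non-vegetated surface.

```python
def pixel_flux(fractions, pft_predictions):
    """Coverage-weighted sum of PFT-specific flux predictions for one pixel."""
    return sum(fractions[pft] * pft_predictions[pft] for pft in fractions)


# Hypothetical fractional cover and emulated GPP per PFT for one 50 m pixel
fractions = {"deciduous": 0.5, "coniferous": 0.25, "lawn": 0.25, "crop": 0.0}
gpp = {"deciduous": 8.0, "coniferous": 4.0, "lawn": 2.0, "crop": 6.0}
pixel_gpp = pixel_flux(fractions, gpp)
```

The same function applies unchanged to NEE, since each target variable has its own set of PFT-specific models.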

Emulator performance was high across all plant functional types and for both carbon fluxes. The monthly emulator outperformed the daily emulator consistently, as demonstrated by higher explained variance and lower errors for both GPP and NEE. Although the monthly emulator smoothed out short-term variability, it still reproduced total annual GPP and NEE with a level of accuracy almost matching that of the daily emulator. 

The two machine learning emulators developed in this study achieved high levels of accuracy, enabling faster simulations than the original land surface model. The daily emulator provided more detailed information on how vegetation responds to different meteorological conditions. In contrast, the monthly emulator was better suited to urban planning, offering fast and reliable information on the carbon sequestration of various PFTs over extended periods, while reducing simulation time by over 95% compared to the daily emulator.

How to cite: Vasenkari, V., Backman, L., Leskinen, J., Lindqvist, H., Pihlatie, M., Järvi, L., and Kulmala, L.: A Machine-Learning Emulator of the land surface model JSBACH for High-Resolution Urban Biogenic CO2 Fluxes, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-12145, https://doi.org/10.5194/egusphere-egu26-12145, 2026.

15:15–15:25 | EGU26-19700 | ECS | On-site presentation
Zavud Baghirov, Markus Reichstein, Basil Kraft, Bernhard Ahrens, Marco Körner, and Martin Jung

Process-based models (PBMs) and machine learning (ML) offer complementary strengths for representing the coupled carbon-water cycle. PBMs enforce physical principles and provide interpretable diagnostics but rely on incomplete process knowledge, many priors, and very limited use of expanding Earth observations, leading to substantial inter-model spread. ML leverages observations to uncover complex patterns and reduce reliance on assumptions, but can violate physical constraints and extrapolate poorly. Hybrid modeling combines both, uniting ML’s flexibility with PBMs’ interpretability and process consistency.

We present H2CM, a hybrid carbon-water cycle model that merges process‑informed deep learning with direct learning from observations (Baghirov et al., 2025; https://doi.org/10.5194/egusphere-2025-3123). H2CM simulates carbon fluxes—gross primary productivity (GPP), autotrophic respiration, and heterotrophic respiration—and water storages (soil moisture, groundwater, snow) and fluxes (evapotranspiration, runoff). The model is informed by carbon observations—GPP, net ecosystem exchange (NEE) from satellite- and in situ–based inversions, and fAPAR—and by water-cycle observations—evapotranspiration, runoff, terrestrial water storage, and snow. H2CM runs daily at 1° spatial resolution.

H2CM outperforms both purely data-driven approaches and state-of-the-art PBMs in reproducing seasonal NEE, particularly in wet and dry tropics, and it captures the rain‑pulse respiration response in drylands that many models miss. Its estimates of global NEE interannual variability align more closely with satellite- and in situ–based inversion products than do PBM estimates. Finally, we disentangle photosynthetic versus respiratory controls and quantify how different regions (e.g., wet vs. dry tropics) contribute to global variability in land–atmosphere carbon exchange.

How to cite: Baghirov, Z., Reichstein, M., Kraft, B., Ahrens, B., Körner, M., and Jung, M.: Estimating carbon dynamics using H2CM: a hybrid global carbon-water cycle model, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-19700, https://doi.org/10.5194/egusphere-egu26-19700, 2026.

15:25–15:35 | EGU26-12017 | ECS | On-site presentation
Miriam Daniela Cuba and Helaina Black

Effective, reliable, and cost-efficient soil carbon monitoring remains a critical bottleneck for the credibility of carbon farming projects. Large-scale projects are particularly problematic since soil sampling campaigns that enable monitoring are often logistically and financially challenging.  

Current carbon reporting protocols rely predominantly on monitoring supported by direct measurement of soil carbon stocks, often requiring stratified random sampling (SRS) across the project area. Although unbiased, SRS scales poorly, both logistically and financially, and quickly becomes unfeasible for large projects. Alternatives, often using Digital Soil Mapping (DSM) and remote sensing, are being used increasingly. While appearing to be more cost-effective since they generally entail collecting fewer soil samples, these alternatives increase uncertainty in reporting soil carbon, jeopardising the ability to reliably detect real change and risking trust in carbon farming projects.  

We propose a hybrid sampling-modelling alternative that integrates a cost-effective stage-sampling approach with a Bayesian areal spatial model that uses remote-sensing data to jointly optimise soil sampling costs and predictive uncertainty.  The areal spatial model is a latent Gaussian model fitted using integrated nested Laplace approximations (INLA) in a hierarchical Bayesian framework. The model uses remote-sensing covariates and in situ measurements to predict soil carbon stocks in regions not sampled during the sampling process. The result is a hybrid dataset that combines direct-measurement and model predictions with quantified uncertainty that can be used for accurate and reliable carbon monitoring or as input for other models.  

We present the results of a simulation study that quantifies the trade-offs between cost, number of samples and total uncertainty from the sampling design and the areal spatial model. We also present a case study of a 170-farm project in the United Kingdom, where we demonstrate the feasibility, cost-savings, and uncertainties of the approach. The results are compared to direct measurement, remote sensing data and DSM estimates to show that this framework offers a practical and cost-effective alternative that results in optimal uncertainties for carbon reporting.  

How to cite: Cuba, M. D. and Black, H.: A Hybrid Sampling-Modelling Approach using Direct Measurement and Remote Sensing to Optimise the Cost-Uncertainty Balance in Large Scale Carbon Monitoring and Carbon Farming Projects, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-12017, https://doi.org/10.5194/egusphere-egu26-12017, 2026.

15:35–15:45 | EGU26-16105 | ECS | Virtual presentation
Yamini Agrawal, Shradha Deshpande, Poonam Seth Tiwari, and Hina Pande

India's brick sector produces over 350 billion bricks annually, making it a critical contributor to greenhouse gas emissions and air pollution. Despite this significance, comprehensive quantification of brick kiln carbon footprint emissions remains limited due to the absence of systematic kiln inventories. This study presents a novel approach that integrates object detection technology with Life Cycle Assessment (LCA) to quantify the carbon footprint of brick production, explicitly incorporating soil organic carbon (SOC) dynamics, a previously overlooked component in brick kiln emission accounting. YOLOv7 was used for automated detection and segmentation of brick kilns in Southwest Bengal (Haldia and Purba Medinipur) using open-source Google Earth Pro imagery. The model demonstrated robust performance with detection precision, recall, and F1-score of 0.881, 0.827, and 0.853 respectively, while instance segmentation achieved a mean IoU of 0.706 with precision 0.837, recall 0.818, and F1-score 0.827. 
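
The reported F1-scores follow from the stated precision and recall through the standard harmonic-mean formula, F1 = 2PR / (P + R). The short check below uses only the values quoted in the abstract; the function name is chosen for illustration.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract
detection_f1 = f1_score(0.881, 0.827)
segmentation_f1 = f1_score(0.837, 0.818)
```

Rounded to three decimals, both results agree with the F1-scores given above (0.853 for detection and 0.827 for instance segmentation).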

The cradle-to-gate LCA reveals a total carbon footprint of 499.87 g CO₂/brick according to our methodology. SOC loss alone contributes 159.85 g CO₂/brick (32% of total emissions), establishing it as a major, previously unaccounted source. Fuel combustion (coal, biomass, agricultural residues) contributes 331.32 g CO₂/brick on average, while transportation adds 7.04 g CO₂/brick. For the 1,042 detected kilns, the estimated annual production capacity is 6.9 billion bricks, corresponding to total emissions of 3.46 Mt CO₂ under current operating conditions. This study is the first to systematically incorporate SOC-based carbon accounting into brick kiln emission assessments, substantially revising the perceived climate burden of the sector. By combining automated kiln detection with comprehensive LCA, the work provides a robust framework for environmental monitoring and supports SDG 13, 9, 11, 12, and 15 through improved emission accounting, land and resource management, and the design of regulatory instruments, carbon offset schemes, and incentives for cleaner brick production. 

How to cite: Agrawal, Y., Deshpande, S., Seth Tiwari, P., and Pande, H.: Automated Segmentation of Brick Kilns and Carbon Emission Analysis Using Deep Learning and Life Cycle Assessment, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-16105, https://doi.org/10.5194/egusphere-egu26-16105, 2026.

Posters on site: Thu, 7 May, 16:15–18:00 | Hall X1

The posters scheduled for on-site presentation are only visible in the poster hall in Vienna. If authors uploaded their presentation files, these files are linked from the abstracts below.
Display time: Thu, 7 May, 14:00–18:00
X1.55 | EGU26-10039 | ECS
Marina Castaño, Amirpasha Mozaffari, Stefano Materia, and Amanda Duarte

Land use change is a significant source of anthropogenic carbon emissions, making it a critical yet often underrepresented component in climate projections. As next-generation Earth System Models move toward kilometer-scale resolutions to capture fine-scale land-atmosphere interactions, existing land use projections (typically provided at ≈30 km resolution) are insufficient to represent the spatial heterogeneity these models require.

Relying on coarse datasets can result in a loss of 31–54% of spatial information, introducing substantial biases in simulated terrestrial carbon sequestration and surface fluxes. To address this, we present a deep learning framework designed to downscale coarse Land-Use Harmonization 2 (LUH2) data into high-resolution 1 km mosaics covering the historical and future period from 1850 to 2100.

Our methodology employs a U-Net architecture to integrate transient anthropogenic drivers from LUH2, high-resolution environmental conditions using Köppen-Geiger climate classifications, and high-resolution population density with a suite of high-resolution static geophysical features (elevation, 2D depth-weighted soil composition, terrain characteristics). 

A key technical advancement is our distributed inference pipeline using Gaussian-weighted patch aggregation. By normalizing overlapping predictions, this approach eliminates blockiness and edge artifacts, ensuring seamless global transitions across the 1 km mosaic. Validation against the HILDA+ dataset demonstrates high fidelity, achieving a global accuracy of 94.5% and a mean Intersection over Union (mIoU) of 0.799 for primary land use classes. These results provide a continuous boundary condition that enhances the realism of carbon, water, and energy fluxes in next-generation climate simulations and digital twin infrastructures.
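
Gaussian-weighted patch aggregation can be illustrated in one dimension. Each patch prediction is multiplied by a window that peaks at the patch centre, the weighted values and the weights are accumulated separately, and the final division normalizes cells covered by more than one patch. This is a sketch of the general technique, not the authors' pipeline: the function names, the window width `sigma`, and the toy patches are assumptions.

```python
import math


def gaussian_window(size, sigma):
    """Weights that peak at the patch centre and decay towards its edges."""
    centre = (size - 1) / 2
    return [math.exp(-0.5 * ((i - centre) / sigma) ** 2) for i in range(size)]


def aggregate(patches, starts, length, sigma=1.0):
    """Blend overlapping 1-D patch predictions into one signal.

    Each patch contributes weight * value; dividing by the accumulated
    weight normalizes cells covered by more than one patch.
    """
    total = [0.0] * length
    weight_sum = [0.0] * length
    for patch, start in zip(patches, starts):
        window = gaussian_window(len(patch), sigma)
        for offset, (value, weight) in enumerate(zip(patch, window)):
            total[start + offset] += weight * value
            weight_sum[start + offset] += weight
    return [t / w if w > 0 else float("nan") for t, w in zip(total, weight_sum)]


# Two hypothetical patches that disagree and overlap on cells 2 and 3
patches = [[1.0, 1.0, 1.0, 1.0], [3.0, 3.0, 3.0, 3.0]]
blended = aggregate(patches, starts=[0, 2], length=6)
```

In the overlap the result moves gradually from the first patch's value towards the second's instead of jumping at a tile boundary, which is the mechanism that suppresses blockiness. A 2-D version would use the outer product of two such windows.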

How to cite: Castaño, M., Mozaffari, A., Materia, S., and Duarte, A.: Global 1 km Reconstruction of Historical and Future Land Use with Machine Learning, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-10039, https://doi.org/10.5194/egusphere-egu26-10039, 2026.

X1.56 | EGU26-11690 | ECS
Koen Ponse, Kai-Hendrik Cohrs, Phillip Wozny, Andrew Robert Williams, Tianyu Zhang, Erman Acar, Yoshua Bengio, Aske Plaat, Thomas Moerland, Pierre Gentine, and Gustau Camps-Valls

Robust carbon cycle science and effective carbon market governance depend on accurate monitoring, transparent modelling and credible representation of climate–economic feedbacks. Integrated Assessment Models (IAMs) such as RICE provide a long-standing framework for linking carbon emissions, climate dynamics and economic development and are widely used to inform mitigation pathways, carbon pricing and international climate policy. However, traditional IAMs rely on hand-calibrated parameters, simplified damage functions and fixed ethical assumptions, limiting their ability to integrate observational data, quantify uncertainty and support evidence-based carbon management. We build on recent advances in machine learning for climate policy and introduce RICE-N-JAX, a fully differentiable implementation of the multi-region RICE-N model (Zhang et al., 2025). RICE-N extends classical IAMs with multi-agent reinforcement learning to model strategic interactions and international climate negotiations. Our JAX-based reimplementation makes the entire climate–economic simulation fast and differentiable, including carbon emissions, climate response, production, trade, mitigation decisions and negotiation dynamics. Differentiability enables a new class of hybrid, data-driven climate–economic models. Our current research focuses on two key directions. First, we develop non-parametric hybrid damage functions in which the traditional analytical damage formulation is replaced by neural or spline-based surrogates trained on empirical and scenario data. This allows the damage–temperature relationship to be learned directly from data. Second, we perform inverse modelling of ethical and behavioural parameters, such as regional risk aversion, time preferences and mitigation bias, by calibrating the model against emissions, GDP and temperature trajectories from the Shared Socioeconomic Pathways (SSPs). 
This enables the recovery of latent normative assumptions embedded in scenario narratives and provides a data-informed basis for policy analysis. Finally, differentiability supports gradient-based calibration, uncertainty quantification, and sensitivity analysis of carbon price trajectories, mitigation pathways, and long-term climate impacts. We demonstrate a proof-of-concept end-to-end calibration of climate damage functions and show how parameter uncertainty propagates into future economic and emissions outcomes. By bridging process-based climate–economic theory with hybrid, knowledge-guided machine learning, RICE-N-JAX provides a foundation for fast and data-driven carbon-cycle modelling. The framework supports policy-relevant applications ranging from carbon pricing and climate clubs to carbon market design, illustrating how hybrid ML can strengthen the scientific basis of carbon management and climate mitigation.

References: Zhang, T., Williams, A. R., Wozny, P., Cohrs, K.-H., Ponse, K., Jiralerspong, M., Phade, S. R., Srinivasa, S., Li, L., Zhang, Y., Gupta, P., Acar, E., Rish, I., Bengio, Y., and Zheng, S.: AI for global climate cooperation: Modeling global climate negotiations, agreements, and long-term cooperation in RICE-N, Proceedings of the 42nd International Conference on Machine Learning (ICML 2025), 2025

How to cite: Ponse, K., Cohrs, K.-H., Wozny, P., Williams, A. R., Zhang, T., Acar, E., Bengio, Y., Plaat, A., Moerland, T., Gentine, P., and Camps-Valls, G.: Leveraging Differentiable Climate-Economy Models for Hybrid Modeling and Inverse Problems, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-11690, https://doi.org/10.5194/egusphere-egu26-11690, 2026.

X1.57 | EGU26-20977 | ECS
Gabriele Galli, Marco Zamboni, Andrea Ricciardelli, Maria Luisa Quarta, and Marco Folegani

Carbon farming can deliver climate mitigation and improved soil health, but credible deployment requires scalable MRV that supports additionality assessment and remains operational at farm scale. We present an EO-driven pipeline that integrates heterogeneous Earth-system data with hybrid modelling (machine learning + process-based physics) to estimate crop yield trajectories, soil organic carbon (SOC) evolution, and economic viability under baseline and regenerative management. A case study illustrates how a crop system can transition toward regenerative farming, demonstrating alignment with EU carbon farming policy. Results show how integrated, data-driven approaches can support quantification of both environmental and financial outcomes, enabling credible carbon accounting and guiding targeted investment in sustainable agriculture.

Multi-sensor satellite time series provide indicators of vegetation dynamics and management proxies relevant to practice adoption (e.g., seasonal soil cover and surface condition). SoilGrids data provide spatially detailed soil information that captures how soil conditions vary across and within fields and how sensitive each site is. Climate forcing relies on high-resolution CMCC climate projections, enabling stress-testing of productivity and SOC outcomes under plausible future conditions.

A Random Forest model learns non-linear relationships between yield, EO indicators, soil attributes, and climate predictors to generate baseline yield projections. These projections are translated into carbon input assumptions (e.g., residue returns) and coupled to a RothC-class SOC model to simulate SOC evolution under regenerative scenarios such as cover crops.
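The coupling from yield to carbon input to SOC can be sketched as follows. This is a deliberately simplified illustration: RothC partitions carbon across several pools with rate modifiers, whereas the sketch uses a single pool with first-order decay, dC/dt = I - k·C. The harvest index, carbon fraction, residue return share, decay rate, and cover-crop input are all hypothetical values.

```python
import math


def residue_carbon_input(yield_t_ha, harvest_index=0.45,
                         carbon_fraction=0.45, returned=0.5):
    """Carbon returned to soil (t C/ha/yr) from a grain yield (t dry matter/ha)."""
    residue = yield_t_ha * (1.0 - harvest_index) / harvest_index
    return residue * carbon_fraction * returned


def simulate_soc(soc0, inputs, k=0.02):
    """Yearly SOC stocks for dC/dt = I - k*C, solved exactly within each year."""
    decay = math.exp(-k)
    stocks = [soc0]
    for carbon_in in inputs:
        stocks.append(stocks[-1] * decay + carbon_in / k * (1.0 - decay))
    return stocks


# Twenty years of projected yields (constant here for simplicity).
yields = [6.0] * 20
baseline_inputs = [residue_carbon_input(y) for y in yields]
# Hypothetical extra carbon from cover crops under the regenerative scenario.
regen_inputs = [c + 0.8 for c in baseline_inputs]

baseline = simulate_soc(50.0, baseline_inputs)
regen = simulate_soc(50.0, regen_inputs)
soc_gain = regen[-1] - baseline[-1]  # additional SOC after 20 years, t C/ha
```

The difference between the two trajectories, rather than either trajectory alone, is the quantity relevant to additionality.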

Farm-level decision metrics integrate transition costs, yield impacts, potential carbon revenues, and land value appreciation to estimate break-even time and NPV, supporting project design and investment appraisal.
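The two decision metrics named above can be computed as in this minimal sketch. The cash flows and discount rate are invented for illustration; in practice each yearly figure would combine transition costs, yield impacts, carbon revenues, and any land value appreciation.

```python
def npv(cash_flows, rate):
    """Net present value; cash_flows[0] occurs today (year 0)."""
    return sum(cf / (1.0 + rate) ** year for year, cf in enumerate(cash_flows))


def break_even_year(cash_flows, rate):
    """First year in which cumulative discounted cash flow is non-negative, else None."""
    cumulative = 0.0
    for year, cf in enumerate(cash_flows):
        cumulative += cf / (1.0 + rate) ** year
        if cumulative >= 0.0:
            return year
    return None


# Hypothetical project: up-front transition cost, then a constant net benefit.
flows = [-300.0] + [80.0] * 10
project_npv = npv(flows, 0.05)
payback = break_even_year(flows, 0.05)
```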

How to cite: Galli, G., Zamboni, M., Ricciardelli, A., Quarta, M. L., and Folegani, M.: EO-driven carbon farming MRV: linking crop yield prediction to SOC change, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-20977, https://doi.org/10.5194/egusphere-egu26-20977, 2026.

X1.58
|
EGU26-13867
|
ECS
Alecsander Mergen, Josué Sehnem, Maria Pinheiro, Débora Roberti, and Rodrigo Jacques

Quantifying carbon exchanges in natural grasslands is crucial for improving management practices, estimating carbon budgets, and supporting climate mitigation policies. However, direct measurements of net ecosystem CO₂ exchange (NEE) using flux towers are spatially limited, particularly in heterogeneous biomes such as the Brazilian Pampa. This study presents a machine learning framework to upscale carbon exchange observations based on flux towers in natural grasslands used for extensive cattle production in southern Brazil. Continuous CO₂ flux measurements were obtained from multiple flux towers installed across four ecological regions representative of the Brazilian Pampa biome, encompassing different combinations of soil types, vegetation structure, climatic conditions, and grassland management. These long-term observations capture pronounced seasonal and interannual variability in NEE, driven primarily by climate variability and grazing management. Artificial neural networks (ANNs) were trained using eddy covariance flux data, meteorological variables (solar radiation, precipitation, air temperature, and humidity) derived from reanalysis products, and vegetation indicators obtained from satellite remote sensing. The trained models were applied to estimate daily NEE in other regions of the Pampa with different edaphoclimatic and vegetation characteristics where flux towers were installed. Model performance was evaluated using independent subsets of eddy covariance observations, with accuracy assessed using standard statistical metrics for this type of model. The results demonstrate that the machine learning approach successfully reproduces observed seasonal patterns and interannual variability of carbon exchanges, enabling spatially explicit estimation of carbon uptake and emissions in natural grasslands. This framework provides a scalable tool for regional carbon accounting in natural grasslands and for deriving regional emission and uptake factors. 
The approach contributes to improving monitoring, reporting, and verification (MRV) of nature-based climate solutions and supports policies aimed at low-carbon livestock production and conservation of the Pampa biome.

How to cite: Mergen, A., Sehnem, J., Pinheiro, M., Roberti, D., and Jacques, R.: Integrating eddy covariance and machine learning for the spatial estimation of carbon exchanges in natural grasslands of the Pampa biome, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-13867, https://doi.org/10.5194/egusphere-egu26-13867, 2026.

X1.59
|
EGU26-16064
Chien-I Lee and Hao-Che Ho

The precise Measurement, Reporting, and Verification (MRV) of carbon stocks in small-scale afforestation and restoration forests serves as the foundation for subsequent carbon sink monitoring and benefit assessment. Satellite remote sensing, however, often lacks the spatial resolution that Unmanned Aerial Vehicle (UAV) imagery provides. UAV imagery captures fine details, but applying estimation models to it directly often produces a "scale mismatch" and systematic estimation bias caused by canopy shadows, background soil noise, and spectral saturation effects. To address this technical bottleneck, this study aims to establish an automated carbon stock estimation workflow based on UAV multispectral imagery and to optimize estimation accuracy by identifying the optimal observational resolution through multi-scale analysis.

The research methodology combines field surveys with remote sensing modeling. First, a comprehensive tree-by-tree biomass inventory was conducted in sample plots. Allometric equations were used to calculate stand biomass, which was then converted into measured carbon stock to serve as ground truth for model validation. Subsequently, UAV multispectral images were acquired to calculate vegetation indices (e.g., NDVI) and establish regression models between spectral features and carbon stock. Furthermore, image resampling techniques were used to simulate spatial resolutions ranging from 0.03 to 5 m and to analyze systematically the impact of resolution changes on the Root Mean Square Error (RMSE) and the coefficient of determination (R²). This study clarifies how spatial scale interferes with canopy spectral signals and identifies the optimal aggregation scale to mitigate background noise. Ultimately, this research provides practical prediction formulas and a Standard Operating Procedure (SOP). In the future, applying this model to UAV-acquired imagery in similar restoration forests will enable rapid, automated carbon estimation without the need for time-consuming field surveys, significantly enhancing the efficiency and economic viability of carbon asset inventories.
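The multi-scale analysis described above can be sketched as follows. The sketch assumes block averaging as the resampling method and a simple linear regression between plot-mean NDVI and carbon stock; the abstract does not state which resampling technique or regression form was used, and all function names are hypothetical.

```python
def ndvi(nir, red):
    """Normalized difference vegetation index from NIR and red reflectance."""
    return (nir - red) / (nir + red)


def block_mean(grid, factor):
    """Coarsen a 2D grid by averaging non-overlapping factor x factor blocks."""
    rows, cols = len(grid) // factor, len(grid[0]) // factor
    out = []
    for r in range(rows):
        row = []
        for c in range(cols):
            block = [grid[r * factor + i][c * factor + j]
                     for i in range(factor) for j in range(factor)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


def plot_mean_ndvi(nir_grid, red_grid, factor):
    """Mean NDVI of a plot after coarsening both bands by `factor`."""
    nir_c, red_c = block_mean(nir_grid, factor), block_mean(red_grid, factor)
    values = [ndvi(n, r) for nr, rr in zip(nir_c, red_c) for n, r in zip(nr, rr)]
    return sum(values) / len(values)


def fit_line(x, y):
    """Ordinary least squares slope and intercept for y = slope * x + intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def rmse_r2(y_true, y_pred):
    """Root mean square error and coefficient of determination."""
    n = len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    mean_t = sum(y_true) / n
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return (ss_res / n) ** 0.5, 1.0 - ss_res / ss_tot
```

Repeating `plot_mean_ndvi` for several aggregation factors, refitting the line against field-measured carbon stock, and comparing the resulting RMSE and R² values is one way to locate the resolution at which background noise is best suppressed.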

Keywords

Aboveground Biomass (AGB), Multispectral UAV, NDVI, Allometric Biomass Model, Scale Effect, Restoration Forest, Carbon Sink Estimation

How to cite: Lee, C.-I. and Ho, H.-C.: Optimizing Aboveground Carbon Stock Estimation in Restoration Forests: A Multi-Scale Analysis of UAV Multispectral Imagery, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-16064, https://doi.org/10.5194/egusphere-egu26-16064, 2026.

X1.60
|
EGU26-16682
|
ECS
Huntley Brownell, Stefan Oehmcke, Thomas Nord-Larsen, and Christian Igel

Abstract
More accurate local estimates of biomass and other forest attributes translate into more accurate national-level estimates, improving forest monitoring and informing forest policy. Higher-resolution local estimates facilitate more precise monitoring of forest growth and harvest, allowing for better forest management planning, and can also be used for verification of forest carbon storage, such as for tree-based carbon credit programs and afforestation projects.


We present the first time series of high-resolution national maps of tree biomass, carbon, volume, canopy height, and basal area produced using deep learning methods applied to 3D point cloud LiDAR data. With hexagonal tiles of a 30 m diameter, the maps enable direct observation of stock change of aboveground biomass, carbon, and other forest attributes at high resolution, in contrast to inventory based estimates or coarser resolution remote sensing-based products. We verify that our approach provides reliable estimates at the national and local scales by comparing it to additional ground truth plot data from a time series of local inventories.


The model was trained and validated on ground-truth data from the Danish National Forest Inventory (DNFI) by combining field measurements aligned with more than 20,000 sample plots extracted from two complete national LiDAR scans. Based on [1], we apply a 3D convolutional neural network (CNN) using the SENet50 architecture. We extended the approach to perform quantile regression for uncertainty quantification. Our best model achieves an R² of 0.83 for biomass and carbon, 0.84 for volume, 0.91 for canopy height, and 0.78 for basal area on validation data.
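Quantile regression is typically trained with the pinball (quantile) loss, sketched below together with a simple coverage check for the resulting prediction interval. The abstract does not state which loss or quantile levels were used, so the function names and the example levels are assumptions.

```python
def pinball_loss(y_true, y_pred, quantile):
    """Mean pinball loss; under-predictions are weighted by `quantile`,
    over-predictions by `1 - quantile`."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        diff = t - p
        total += quantile * diff if diff >= 0 else (quantile - 1.0) * diff
    return total / len(y_true)


def interval_coverage(y_true, lower, upper):
    """Share of reference values falling inside [lower, upper]."""
    inside = sum(1 for t, lo, hi in zip(y_true, lower, upper) if lo <= t <= hi)
    return inside / len(y_true)
```

A model with one output head per quantile, for example 0.05, 0.5, and 0.95, would minimise the sum of the corresponding pinball losses; a well-calibrated 0.05 to 0.95 interval should then cover roughly 90% of held-out plots.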


We find that our model outperforms other state-of-the-art methods, which are either based on passive 2D imagery or depend on using point cloud data indirectly by extracting summary statistics. By using active LiDAR, we can derive information from beneath tree canopies, and using the full point cloud enables the model to learn from detailed information on forest structure, which may be a key advantage.


The high resolution and accuracy of our method offer unprecedented potential for time series analysis. The model is sensitive to changes at the individual tree level, allowing for the detection of individual tree removals or growth. While large scale forest cover change is easily detected with aerial imagery, thinnings or partial removals are more difficult to uncover with most methods; however, our analysis of independent repeated local inventory plots shows that our model successfully detects smaller scale thinnings and tree growth.


References
[1] Stefan Oehmcke et al. “Deep point cloud regression for above-ground forest biomass estimation from airborne LiDAR”. In: Remote Sensing of Environment 302 (2024).

 

How to cite: Brownell, H., Oehmcke, S., Nord-Larsen, T., and Igel, C.: Time series of national biomass maps from deep learning applied to airborne laser scanning point cloud data, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-16682, https://doi.org/10.5194/egusphere-egu26-16682, 2026.

X1.61
|
EGU26-21697
|
ECS
Xiaobin Guan, Yongming Ma, Chao Zeng, and Liupeng Lin

Accurate estimation of global gross primary productivity (GPP) is fundamental for understanding terrestrial carbon cycling. Eddy covariance (EC) flux observations provide reliable site-scale GPP estimates, but the spatially sparse distribution limits their applicability at large scales. Satellite-based solar-induced chlorophyll fluorescence (SIF) has emerged as a promising proxy for large-scale GPP estimation; however, current satellite SIF observations also suffer from limited spatiotemporal coverage, and uncertainties remain in the SIF–GPP conversion. Moreover, conventional machine learning models trained solely on EC observations often exhibit limited spatial generalization due to the scarcity of spatially representative training samples.

To address these challenges, this study proposes a satellite–ground jointly constrained framework that integrates EC flux measurements and satellite SIF observations using transfer learning and multi-task learning techniques to exploit the complementary strengths of both data sources for global GPP estimation. First, for TROPOMI SIF data that has global spatial coverage but short temporal records, SIF is treated as a source domain to pre-train the model, which is then fine-tuned using long-term EC-derived GPP data as a target domain. This transfer learning-based model (SIFTML) demonstrates improved spatial generalization compared to models trained solely on SIF or EC data, effectively reducing systematic underestimation and overestimation at high and low GPP levels, respectively, while remaining insensitive to the magnitude scaling of source-domain SIF inputs.

Second, for the spatially sparse and track-like distributed OCO-2 SIF observations, a multi-task learning framework based on a mixture-of-experts architecture is developed. A physically constrained loss function derived from the SIF–GPP relationship is introduced to simultaneously achieve seamless SIF reconstruction and high-accuracy GPP estimation by jointly leveraging SIF and EC constraints. Results indicate that the multi-task model outperforms traditional single-task approaches in both GPP estimation and SIF reconstruction.
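A physically constrained multi-task loss of the kind described above might look like the following sketch. It assumes a linear SIF–GPP relationship (GPP ≈ slope·SIF + intercept) as the physical constraint and uses `None` to mark locations where a reference observation is missing; the abstract does not give the functional form or weighting, and the mixture-of-experts network itself is omitted.

```python
def masked_mse(pred, target):
    """Mean squared error over entries whose target is available (not None)."""
    pairs = [(p, t) for p, t in zip(pred, target) if t is not None]
    if not pairs:
        return 0.0
    return sum((p - t) ** 2 for p, t in pairs) / len(pairs)


def multitask_loss(gpp_pred, sif_pred, gpp_obs, sif_obs,
                   slope, intercept=0.0, weight=0.1):
    """Data terms for both tasks plus a penalty tying predicted GPP
    to a linear function of predicted SIF."""
    data_gpp = masked_mse(gpp_pred, gpp_obs)   # constrained by EC towers
    data_sif = masked_mse(sif_pred, sif_obs)   # constrained by satellite tracks
    implied_gpp = [slope * s + intercept for s in sif_pred]
    physics = masked_mse(gpp_pred, implied_gpp)  # applies at every location
    return data_gpp + data_sif + weight * physics
```

Because the physics term needs no reference data, it can be evaluated everywhere, including the gaps between OCO-2 tracks and away from flux towers.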

Overall, this study provides a new paradigm for long-term, high-accuracy global GPP estimation by alleviating limitations associated with the spatiotemporal coverage of ground EC and satellite SIF observations, as well as the uncertainties in SIF–GPP conversion, thereby offering improved support for global carbon cycle research.

How to cite: Guan, X., Ma, Y., Zeng, C., and Lin, L.: Satellite SIF and Ground EC Observation Jointly Constrained Estimation of Global Gross Primary Productivity, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-21697, https://doi.org/10.5194/egusphere-egu26-21697, 2026.

X1.62
|
EGU26-2837
|
ECS
Tenaw Workie, Martin Brandt, Philippe Ciais, Max Gaber, Petri Pellikka, and Alemu Gonsamo

Land degradation, deforestation, and climate change have exacerbated droughts in Ethiopia, severely threatening its agriculture-dependent economy. This has led to large-scale restoration initiatives such as the Sustainable Land Management Program (SLMP), Reducing Emissions from Deforestation and Forest Degradation Plus (REDD+), and the Green Legacy Initiative (GLI). GLI has reported planting 32 billion trees since 2019, yet evidence remains limited. Here, we developed a deep learning framework robust to geolocation errors to monitor nationwide canopy height dynamics at 10 m resolution and to conduct intervention-specific outcome assessments. We found a net gain of 23,537 km² in tree cover with trees above 8 m in height over the period 2019–2024. The large gain in young trees, which offsets the loss of tall trees, is attributed to recent tree planting initiatives such as the GLI, REDD+, and SLMP and to the expansion of commercial plantations by smallholder farmers. SLMP and REDD+ interventions yielded the largest mean canopy height gains, albeit over smaller areas. Our results provide measurable evidence that large-scale restoration interventions in Ethiopia are reversing the country's long-standing deforestation trend.
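How a net tree-cover change in km² follows from two canopy height maps can be shown with simple arithmetic. The sketch below only counts pixels above a height threshold and converts them to area; it does not reproduce the authors' deep learning framework or their handling of geolocation errors, and the function names are hypothetical.

```python
def tree_cover_area_km2(heights_m, threshold_m=8.0, pixel_size_m=10.0):
    """Area (km^2) of pixels whose canopy height exceeds `threshold_m`."""
    pixels = sum(1 for row in heights_m for h in row if h > threshold_m)
    return pixels * pixel_size_m ** 2 / 1e6


def net_gain_km2(before, after, threshold_m=8.0, pixel_size_m=10.0):
    """Net change in tree-cover area between two canopy height maps."""
    return (tree_cover_area_km2(after, threshold_m, pixel_size_m)
            - tree_cover_area_km2(before, threshold_m, pixel_size_m))
```

At 10 m resolution each pixel covers 100 m², so one square kilometre corresponds to 10,000 pixels.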

How to cite: Workie, T., Brandt, M., Ciais, P., Gaber, M., Pellikka, P., and Gonsamo, A.: Satellite observations reveal large-scale restoration interventions reversing deforestation in Ethiopia, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-2837, https://doi.org/10.5194/egusphere-egu26-2837, 2026.

X1.63
|
EGU26-13805
|
ECS
Marine Mercier, Andrea Marinoni, and Sakthy Selvakumaran

Robust carbon monitoring is fundamental to the credibility of climate mitigation strategies, including carbon markets, nature-based solutions, and ecosystem restoration initiatives. Soil organic carbon (SOC), as a major and dynamic component of the carbon cycle, is traditionally quantified through soil sampling and laboratory analyses. Although accurate at local scales, these methods are costly, time-consuming, and spatially sparse, limiting their suitability for large-scale monitoring, underscoring the need for scalable and robust alternatives.

Recent advances in machine learning (ML), and particularly deep learning (DL), offer substantial potential to integrate heterogeneous data streams and reinforce the scientific basis of carbon accounting. However, the application of DL to soil carbon studies remains limited, with most existing work confined to small spatial domains and relatively modest datasets. This limitation reflects the intrinsic complexity of environmental systems, the scarcity of high-quality reference observations, and persistent challenges in multimodal data integration and model interpretability.

Using the pan-European Land Use/Cover Area Frame Survey (LUCAS) soil dataset, this study presents a multimodal deep learning framework for large-scale prediction of SOC stocks. In addition to SOC, the framework estimates texture-related proxies and ancillary soil attributes relevant to carbon stock assessment. The approach integrates a comprehensive suite of data sources, including multispectral Sentinel-2 imagery, climate time series variables, and land-cover information, to jointly exploit spectral and spatio-temporal dependencies.

The proposed architecture integrates modality-specific components tailored to each data type, enabling a coherent spatio-temporal representation of SOC dynamics. Convolutional neural networks (CNNs) are used to extract spatial patterns and vegetation–soil spectral signatures from multispectral imagery, while recurrent architectures, including long short-term memory (LSTM) networks, encode seasonal to interannual variability driven by climatic conditions. Multiple deep learning encoders are systematically compared, ranging from conventional CNN–LSTM architectures to state-of-the-art transformer and vision transformer models, in order to assess their ability to capture long-range dependencies, cross-modal interactions, and complex non-linear relationships underlying SOC distribution.

A comparative analysis further benchmarks the proposed deep learning framework against widely used machine learning methods in soil science, including Random Forest (RF), Extreme Gradient Boosting (XGB), and Multiple Linear Regression (MLR). Model performance is assessed not only in terms of predictive accuracy, but also with respect to implementation complexity and interpretability, highlighting practical trade-offs for operational deployment.

By integrating heterogeneous data sources, this work demonstrates how artificial intelligence can bridge the gap between point-based field measurements and policy-relevant carbon assessments, while supporting state-of-the-art monitoring, reporting, and verification (MRV) frameworks. This analysis contributes to ongoing efforts to develop transparent, scalable, and evidence-based carbon monitoring tools, while explicitly highlighting persistent challenges related to data bias, spatial transferability, and model interpretability.

How to cite: Mercier, M., Marinoni, A., and Selvakumaran, S.: Multimodal Machine and Deep Learning Frameworks for Soil Organic Carbon Monitoring, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-13805, https://doi.org/10.5194/egusphere-egu26-13805, 2026.

X1.64
|
EGU26-15228
|
ECS
Jonathan Groß, Vitus Benson, Maurício Lima, Alexander Winkler, and Christian Reimers

Accurate estimates of the spatiotemporal distribution of atmospheric carbon dioxide (CO2) are essential to evaluate and enforce international climate agreements as well as to infer fluxes of the greenhouse gas. However, current observations are spatially sparse, with satellite and in-situ measurements providing only partial coverage of the Earth’s surface and atmosphere. Atmospheric transport models are often used to infer CO2 concentrations across unobserved regions by simulating how gases move and mix in the atmosphere. While physically grounded, these models are computationally intensive and notoriously difficult to calibrate with observational data, due to the complexity of atmospheric dynamics and the sparsity of available measurements.

This study investigates the use of generative machine learning for inpainting of CO2. More specifically, we apply flow matching, an approach that generates samples from an unknown target distribution by iteratively transforming samples from a simple known noise distribution with a deep neural network. In a first step, we train a flow matching model on assimilation data from CarbonTracker (CT2022). This trains the model to respect the physical patterns of atmospheric CO2 fields, turning it into an effective prior for data assimilation. In a second step, we test the trained flow matching model on conditional generation, that is, reconstruction of atmospheric CO2 from partial observations. For this, we artificially mask parts of the CT2022 CO2 fields in a way that emulates the availability of satellite measurements. In a third step, we infer global CO2 by conditioning on the total column average CO2 (XCO2) measurements from NASA’s Orbiting Carbon Observatory-2 (OCO-2), comparable to other inversions from the OCO-2 v11 MIP, but using a novel approach.
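The sampling step of flow matching can be illustrated on a one-dimensional toy problem. In the sketch below a closed-form velocity field for two Gaussians replaces the trained neural network, so this is not the authors' model; `velocity`, `generate`, and the target parameters are hypothetical, and conditioning on observations is not shown.

```python
import math


def velocity(x, t, mean, std):
    """Marginal velocity of the linear path x_t = (1 - t) * x0 + t * x1
    when x0 ~ N(0, 1) and x1 ~ N(mean, std^2) are drawn independently.
    In practice a neural network is trained to approximate this field."""
    var_t = (1.0 - t) ** 2 + (t * std) ** 2
    gain = (t * std ** 2 - (1.0 - t)) / var_t
    return mean + gain * (x - t * mean)


def generate(x0, mean, std, steps=1000):
    """Euler integration of dx/dt = velocity(x, t) from t = 0 to t = 1,
    transforming a noise sample x0 into a sample of the target."""
    x, dt = x0, 1.0 / steps
    for i in range(steps):
        x += dt * velocity(x, i * dt, mean, std)
    return x
```

For this Gaussian case the exact flow maps a noise value `x0` to `mean + std * x0`, which gives a convenient check on the integrator; with a learned velocity field the same loop produces samples from the data distribution.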

Extensive evaluation against independent and held-out test sets from in-situ and satellite measurements shows physical consistency and decent agreement of the global CO2 fields reconstructed from OCO-2 measurements. However, challenges remain: future research needs to alleviate spurious artifacts from the employed posterior conditioning method, both with the artificial mask and particularly when conditioning on XCO2, before the approach can become operational.

Our presented flow matching approach opens up new avenues of research. The prior parameterized by the flow matching model can be investigated itself. For instance, it is possible to perform feature extraction inside the latent space and hence purposefully explore counterfactual scenarios of CO2 distributions by carefully tracing out paths in the noise distribution and analyzing the corresponding generated CO2 samples.

How to cite: Groß, J., Benson, V., Lima, M., Winkler, A., and Reimers, C.: Reconstructing Atmospheric CO2 with Flow Matching Models, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-15228, https://doi.org/10.5194/egusphere-egu26-15228, 2026.

X1.65
|
EGU26-22441
|
ECS
Samy Hashim, Sayan Mandal, Rocco Sedona, Ehsan Zandi, and Gabriele Cavallaro and the 3D-ABC Team

The 3D-ABC project, developed within the Helmholtz Foundation Model Initiative, aims to create a foundation model for accurate mapping of global terrestrial above- and below-ground carbon stocks in vegetation and soils at high spatial resolution. The model integrates multimodal remote sensing data, including Harmonized Landsat-Sentinel-2 (HLS) imagery and TanDEM-X InSAR coherence, and will also integrate climatic, topographic, and space-borne 3D lidar data. The architecture employs a multi-modal input processor, FM encoder, adaptive fusion neck, and task-specific prediction heads, trained via masked autoencoder pretraining followed by supervised fine-tuning. Training leverages JSC's JUWELS Booster and the forthcoming JUPITER exascale system.

BioMassters, a dataset that encompasses satellite imagery and associated forest biomass estimates for large-scale above-ground biomass mapping, provides an ideal initial evaluation framework for 3D-ABC for several compelling reasons.

Above Ground Biomass (AGB) estimation represents a core downstream task for carbon monitoring. BioMassters specifically targets this capability using Sentinel-1 SAR and Sentinel-2 MSI time series, modalities that overlap substantially with 3D-ABC's input data streams. This alignment allows direct assessment of whether 3D-ABC's learned representations capture vegetation structure and biomass-relevant features.

The dataset derives AGB labels from Finnish Forest Centre airborne LiDAR campaigns at 5 points per square meter density, combined with field measurements and calibrated allometric equations. This produces reference data with approximately 8% RMSE for key tree attributes, far more reliable than existing global products and essential for meaningful foundation model evaluation.

With 310,000 patches of size 224x224 covering 8 million hectares across five years, BioMassters offers the statistical power needed to assess foundation model generalization. The temporal dimension, 12 monthly observations per sample, tests whether 3D-ABC effectively captures phenological dynamics crucial for vegetation monitoring. Beyond its scale and temporal richness, BioMassters also benefits from a strong benchmarking ecosystem.

The NeurIPS 2023 competition produced well-documented baseline performance: U-TAE achieved 27.49 t/px RMSE overall, with results stratified by biomass density (15.24 t/px for low density, 37.59 t/px for high density). These benchmarks enable rigorous comparison of 3D-ABC against state-of-the-art task-specific models.
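Stratified error reporting of this kind can be reproduced with a few lines. The sketch below splits reference values at a single threshold; the benchmark's actual stratum definitions are not given in the abstract, so the threshold and function names are hypothetical.

```python
def rmse(y_true, y_pred):
    """Root mean square error between reference and predicted values."""
    return (sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)) ** 0.5


def stratified_rmse(y_true, y_pred, threshold):
    """RMSE overall and for reference values below / at or above `threshold`."""
    low = [(t, p) for t, p in zip(y_true, y_pred) if t < threshold]
    high = [(t, p) for t, p in zip(y_true, y_pred) if t >= threshold]

    def _rmse(pairs):
        # An empty stratum has no defined error.
        return rmse(*zip(*pairs)) if pairs else None

    return {"overall": rmse(y_true, y_pred), "low": _rmse(low), "high": _rmse(high)}
```

Reporting the strata separately makes saturation visible: a model can have an acceptable overall RMSE while performing much worse in dense stands.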

Current global biomass products operate at 100 m resolution with RMSE values of 30–50 t/px. BioMassters operates at 10 m resolution, allowing assessment of whether 3D-ABC's multimodal fusion can advance both accuracy and spatial detail simultaneously.

The dataset reveals where current approaches struggle: accuracy degrades with increasing forest density due to saturation of SAR backscatter and MSI reflectance. This provides a specific challenge for 3D-ABC's multi-modal fusion architecture, and in future work we will test whether incorporating additional modalities (particularly 3D space-borne lidar) addresses these saturation effects.

While BioMassters covers boreal forests exclusively, it establishes whether 3D-ABC's pretrained representations provide a foundation for fine-tuning to other biomes, a critical test of foundation model utility before committing resources to global-scale evaluation, e.g. in the Arctic region.

How to cite: Hashim, S., Mandal, S., Sedona, R., Zandi, E., and Cavallaro, G. and the 3D-ABC Team: BioMassters as Initial Benchmark for 3D-ABC, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-22441, https://doi.org/10.5194/egusphere-egu26-22441, 2026.

X1.66
|
EGU26-21845
|
Vinicius do Carmo Melicio, Vitor Hugo Miranda Mourão, Luis Gustavo Barioni, and João Paulo Gois

Limited data and high sampling costs challenge soil carbon modeling. While previous generative AI methods, such as Generative Adversarial Networks (GANs) and Variational Auto-Encoders (VAEs), are commonly used, this study benchmarks Flow Matching's effectiveness for modeling complex soil data distributions. We introduce an Unconditional Flow Matching framework using the LUCAS soil dataset. Our procedures encompass: (a) training models without labels; (b) generating synthetic data; and (c) applying identical clustering protocols to the observed and synthetic datasets. Model performance is assessed through statistical divergence and cluster consistency between observed and synthetic data distributions. The goal is to determine if Flow Matching provides a more robust and accurate method for generating realistic soil carbon datasets.
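One simple way to quantify the divergence between observed and synthetic distributions is the 1-Wasserstein distance, computed per variable. The abstract does not name the divergence used, so this choice and the function name are assumptions; the sketch requires samples of equal size.

```python
def wasserstein_1d(sample_a, sample_b):
    """1-Wasserstein distance between two equal-size one-dimensional samples:
    the mean absolute difference between their sorted values."""
    if len(sample_a) != len(sample_b):
        raise ValueError("samples must have the same length")
    a, b = sorted(sample_a), sorted(sample_b)
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)
```

A value near zero for each soil variable indicates that the synthetic marginal distribution closely matches the observed one, although it says nothing about joint structure, which is what the clustering comparison addresses.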

How to cite: do Carmo Melicio, V., Mourão, V. H. M., Barioni, L. G., and Gois, J. P.: Unsupervised Manifold Learning: Validating Unconditional Flow Matching for Soil Carbon Data Topology, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-21845, https://doi.org/10.5194/egusphere-egu26-21845, 2026.
