BG9.7 | Large-scale mapping of environmental variables by combining ground observations, remote sensing, and machine learning
Convener: Jacob Nelson | Co-conveners: Benjamin Dechant, Hanna Meyer, Alvaro Moreno
Orals | Thu, 07 May, 14:00–15:45 (CEST) | Room 1.14
Posters on site | Attendance Fri, 08 May, 14:00–15:45 (CEST) | Display Fri, 08 May, 14:00–18:00 | Hall X1
Posters virtual | Tue, 05 May, 14:48–15:45 (CEST) | vPoster spot 2, Tue, 05 May, 16:15–18:00 (CEST) | vPoster Discussion
Environmental data from large measurement campaigns and automated measurement networks are increasingly available and provide relevant information about the Earth system. However, such data are usually only available as point observations and represent only a small part of the Earth's surface. Upscaling strategies are hence needed to provide continuous and comprehensive information as a baseline for gaining insights into large-scale spatio-temporal dynamics. In upscaling, machine learning algorithms that can account for complex and nonlinear relationships are increasingly used to link remote sensing datasets to reference measurements. The resulting models are then applied to provide spatially explicit predictions of the target variable, often even on a global scale.
Thanks to easy access to user-friendly software, model training and spatial prediction using machine learning algorithms appear straightforward at first sight. However, considerable challenges remain: dealing with reference data that are not independent and identically distributed, accounting for spatial heterogeneity when scaling reference measurements to the grid cell scale, appropriately evaluating the resulting maps and quantifying their uncertainties, generating robust maps that do not suffer from extrapolation artifacts, and developing strategies for model interpretation and understanding.
This session invites contributions on the methodology and application of large-scale mapping strategies in different disciplines, including vegetation characteristics such as foliar or canopy traits and photosynthesis, soil characteristics such as soil organic carbon, or atmospheric parameters such as pollutant concentration. Methodological contributions can focus on individual aspects of the upscaling approach, such as the design of measurement campaigns or networks to increase representativeness, novel algorithms or validation strategies, feature attribution/explainability as well as uncertainty assessment.

Orals: Thu, 7 May, 14:00–15:45 | Room 1.14

The oral presentations are given in a hybrid format supported by a Zoom meeting featuring on-site and virtual presentations. The button to access the Zoom meeting appears 15 minutes before the time block starts.
Chairpersons: Jacob Nelson, Benjamin Dechant, Hanna Meyer
14:00–14:10
|
EGU26-16137
|
On-site presentation
How can we better benchmark global GPP data using flux tower measurements across timescales?
(withdrawn)
Xuanlong Ma, Yu Liang, Youngryel Ryu, and Kazuhito Ichii
14:10–14:20
|
EGU26-13953
|
ECS
|
Virtual presentation
Advancing Forest-Based Climate Solutions through Data-Driven Carbon Flux Estimation
Xian Wang, Kim Novick, and Mallory Barnes

Nature-based climate solutions, including reforestation, require credible carbon accounting frameworks that capture ecosystem-scale carbon fluxes and ensure additionality. However, most existing baselines rely on static biomass estimates that overlook spatial heterogeneity and interannual variation in forest carbon uptake. Here, we present a data-driven framework for estimating monthly Net Ecosystem Productivity (NEP) across eastern U.S. forests at 500-m resolution from 2003 to 2023. We trained Random Forest models using observations from 47 eddy-covariance sites combined with gridded remote sensing and meteorological data. Feature selection and SHAP analyses highlight NDVI, LAI, solar-induced fluorescence, shortwave radiation, and vapor pressure deficit as the primary drivers of NEP. Our results show that eastern U.S. forests have continued to strengthen as a carbon sink over the past two decades, with a mean NEP of −195 ± 122 g C m⁻² yr⁻¹ and an increasing trend of 2.51 g C m⁻² yr⁻¹. Annual NEP exhibits strong year-to-year sensitivity to spring temperature and moisture anomalies, with extreme events causing large variations in carbon uptake that are often followed by partial or full summer recovery, reflecting considerable ecosystem resilience. The substantial spatial and temporal variability in NEP predictions underscores the need for regionally calibrated, observation-based baselines. Our framework supports this need by providing dynamic, annually updated maps of forest carbon uptake to improve evaluation of reforestation and other nature-based climate solutions in the eastern United States.

How to cite: Wang, X., Novick, K., and Barnes, M.: Advancing Forest-Based Climate Solutions through Data-Driven Carbon Flux Estimation, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-13953, https://doi.org/10.5194/egusphere-egu26-13953, 2026.

14:20–14:30
|
EGU26-9733
|
ECS
|
On-site presentation
Integrating forest inventory plot and GEDI data for forest structure assessment
Alexandra Runge, Viola Heinrich, Simon Besnard, Emil Cienciala, Kevin Black, Roberto Pilli, Gherardo Chirici, Giovanni D'Amico, and Martin Herold

Forests play a critical role in the global carbon cycle, yet carbon removals in Europe are declining due to increasing wood demand, natural disturbances, and a growing share of aging forests. Sustaining and enhancing forest carbon sinks requires a better understanding of forest structure complexity, which underpins accurate carbon estimates and aligns with emerging EU policy priorities such as identifying old-growth, natural, and even-aged forests. 

Forest inventory surveys provide essential ground-based information for evaluating forest structure complexity, while remote sensing data enable consistent and timely large-scale assessments. Our objective is therefore to assess the applicability of integrating national forest inventory (NFI) and GEDI data for characterising forest structure complexity, particularly for distinguishing forests of low and high structural complexity. We evaluate the availability of matched NFI plots and high-quality GEDI shots, derive a forest structure complexity measure from integrated variables, and demonstrate a machine learning model trained on NFI-GEDI data to classify forest structure complexity. This study covers the Czech Republic, Italy, and Spain, representing temperate, mountainous, and Mediterranean biomes.

We initially identified about 34,000 NFI plots that had a geographic match with almost 90,000 GEDI shots (from a total of ~64,000 NFI plots available and ~200,000,000 GEDI shots in Spain, Italy, and the Czech Republic). Rigorous GEDI quality filtering and additional matching criteria reduced the dataset to a total of 2,509 NFI plots and 5,488 corresponding GEDI shots. This is 7% of the NFI plots and 6.5% of the geographically matched GEDI shots, which shows that data quality requirements drastically reduce the number of matched plots and GEDI shots. The data basis for assessments of individual countries is therefore small, which favours a pan-European assessment.

Forest structure complexity was derived at the plot level using variability in diameter at breast height, variability in tree height, and species richness, combined into an equally weighted structure complexity score. Low variability indicated even-aged, single-species stands, whereas high variability reflected diverse, multi-aged, structurally complex forests. We selected the NFI plots within the lowest and highest 25% of structure complexity scores to represent low and high structural complexity, respectively.
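As an illustration only, the scoring and selection described above could be sketched in Python as follows. The abstract does not state how variability is measured or how the three components are normalised, so the coefficient of variation, the min-max scaling, the field names, and the plot values below are all assumptions made for this sketch.

```python
from statistics import mean, pstdev

def coefficient_of_variation(values):
    """Population standard deviation divided by the mean (0 if the mean is 0)."""
    m = mean(values)
    return pstdev(values) / m if m else 0.0

def min_max_scale(values):
    """Rescale a list to [0, 1]; a constant list maps to all zeros."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) if hi > lo else 0.0 for v in values]

def complexity_scores(plots):
    """Equally weighted mean of scaled DBH variability, height variability and species richness."""
    dbh_cv = min_max_scale([coefficient_of_variation(p["dbh_cm"]) for p in plots])
    height_cv = min_max_scale([coefficient_of_variation(p["height_m"]) for p in plots])
    richness = min_max_scale([len(set(p["species"])) for p in plots])
    return [(d + h + r) / 3 for d, h, r in zip(dbh_cv, height_cv, richness)]

def label_extremes(scores, fraction=0.25):
    """Label the lowest and highest `fraction` of plots; the rest stay unlabelled (None)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    n = max(1, int(len(scores) * fraction))
    labels = [None] * len(scores)
    for i in order[:n]:
        labels[i] = "low"
    for i in order[-n:]:
        labels[i] = "high"
    return labels

# Hypothetical plots (values invented for illustration only).
plots = [
    {"dbh_cm": [30, 30, 31], "height_m": [20, 20, 21], "species": ["spruce"] * 3},
    {"dbh_cm": [10, 35, 60], "height_m": [8, 22, 30], "species": ["beech", "oak", "fir"]},
    {"dbh_cm": [25, 30, 28], "height_m": [18, 20, 19], "species": ["pine", "pine", "oak"]},
    {"dbh_cm": [15, 40, 22], "height_m": [10, 25, 15], "species": ["oak", "oak", "beech"]},
]
scores = complexity_scores(plots)
labels = label_extremes(scores)
```

In this toy example the uniform single-species plot receives the lowest score and the mixed, multi-sized plot the highest; the two intermediate plots are left unlabelled, mirroring the use of only the outer quartiles for training.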

Training a Support Vector Machine with GEDI data to differentiate between low and high structural complexity, as derived from the NFI-based score, resulted in a model accuracy of 0.81. Restricting the evaluation to the predictions with probabilities > 80% increased the accuracy to 0.94. Applying this model to high-quality GEDI shots in Italy, the Czech Republic, and Spain highlights the country-wide occurrence and distribution of forests of low and high structural complexity. A first assessment indicates that 86%, 65%, and 26% of the forest areas are associated with forests of high structural complexity in the Czech Republic, Italy, and Spain, respectively.
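The effect of restricting the evaluation to confident predictions can be sketched in plain Python. This is a hedged illustration of the general idea, not the authors' evaluation code: the function names, the two-class probability convention, and the example values are assumptions.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the reference labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def confident_subset_accuracy(y_true, prob_high, threshold=0.8):
    """Accuracy over samples whose predicted class probability exceeds `threshold`.

    prob_high is the predicted probability of the class "high"; the predicted
    class is "high" when prob_high >= 0.5, and its probability is
    max(prob_high, 1 - prob_high).
    Returns (accuracy, share of samples retained).
    """
    kept_true, kept_pred = [], []
    for t, p in zip(y_true, prob_high):
        if max(p, 1 - p) > threshold:
            kept_true.append(t)
            kept_pred.append("high" if p >= 0.5 else "low")
    if not kept_true:
        return float("nan"), 0.0
    return accuracy(kept_true, kept_pred), len(kept_true) / len(y_true)

# Hypothetical reference labels and classifier probabilities (invented).
y_true = ["high", "high", "low", "low", "high", "low"]
prob_high = [0.95, 0.55, 0.10, 0.60, 0.85, 0.45]

all_pred = ["high" if p >= 0.5 else "low" for p in prob_high]
overall = accuracy(y_true, all_pred)
confident, retained = confident_subset_accuracy(y_true, prob_high)
```

Reporting the retained share alongside the accuracy makes the trade-off explicit: a higher accuracy on the confident subset is obtained at the cost of leaving some samples unclassified.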

These results demonstrate the potential of integrating ground-based data with spaceborne lidar to characterise forest structure complexity. Even simple structure scores and models provide a reliable indication of the structural complexity distribution across Europe. This approach provides a new basis for improving carbon estimates, monitoring structural changes driven by disturbances and other changes, and supporting EU forest-policy targets related to biodiversity, climate resilience, and sustainable forest management.

How to cite: Runge, A., Heinrich, V., Besnard, S., Cienciala, E., Black, K., Pilli, R., Chirici, G., D'Amico, G., and Herold, M.: Integrating forest inventory plot and GEDI data for forest structure assessment, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-9733, https://doi.org/10.5194/egusphere-egu26-9733, 2026.

14:30–14:50
|
EGU26-20132
|
ECS
|
solicited
|
On-site presentation
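Identifying robust climate drivers of wheat yield failure in Europe from high-dimensional, multivariate spatiotemporal data
Lily-belle Sweet and Jakob Zscheischler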

Identifying the weather conditions that lead to crop yield failure is critical for early warning systems and climate adaptation planning. However, yield at harvest time is driven by nonlinear interactions between weather and other variables across different stages of plant development. While machine learning models excel at capturing such complex relationships from high-dimensional data, they can easily overfit to the dependencies inherent to spatiotemporal agroclimatic data. We apply a data-driven framework to multivariate observational data to identify key climate drivers of wheat yield failure in Europe. The method, previously validated using process-based crop model simulations, yields parsimonious sets of drivers that are able to effectively reproduce interannual variability, based on their contribution to the predictive performance of models across held-out spatial regions and years and in combination with different sets of predictive features. The resulting drivers are physically interpretable and align with agronomic understanding. In addition, using both observational data and process-based model simulations, we explore the impact of different model evaluation strategies on the drivers that are identified and the transferability of resulting models to unseen regions. The approach allows researchers to exploit the information available in high-resolution multivariate datasets using machine learning, while making use of parsimonious, interpretable statistical models. Beyond agriculture, this framework may be useful for the study, modelling and mapping of other societally relevant climate impacts, such as forest mortality, wildfires, floods, and landslides.
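Evaluation on held-out spatial regions and years, as mentioned above, differs from random splitting in that whole groups are withheld at once. A minimal sketch of such grouped splits is given below; the region names, years, and the helper function are hypothetical and do not come from the abstract.

```python
def leave_one_group_out(groups):
    """Yield (held_out_group, train_indices, test_indices) for each distinct group."""
    for held_out in sorted(set(groups)):
        test = [i for i, g in enumerate(groups) if g == held_out]
        train = [i for i, g in enumerate(groups) if g != held_out]
        yield held_out, train, test

# Hypothetical samples: one (region, year) pair per yield record.
regions = ["north", "north", "south", "south", "east", "east"]
years = [2001, 2002, 2001, 2002, 2001, 2002]

# Spatial hold-out: every record of one region is withheld together.
region_folds = list(leave_one_group_out(regions))
# Temporal hold-out: every record of one year is withheld together.
year_folds = list(leave_one_group_out(years))
```

Because neighbouring records and records from the same year tend to be correlated, a random split would let near-duplicates of the test samples into the training set; grouping by region or year is one common way to reduce that leakage.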

How to cite: Sweet, L. and Zscheischler, J.: Identifying robust climate drivers of wheat yield failure in Europe from high-dimensional, multivariate spatiotemporal data, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-20132, https://doi.org/10.5194/egusphere-egu26-20132, 2026.

14:50–14:55
14:55–15:05
|
EGU26-11956
|
ECS
|
On-site presentation
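Investigating whether considering spatial heterogeneity within Eddy-Covariance tower footprints can better characterise high-frequency changes in GPP
Arianna Lucarini, Daniel E. Pabon-Moreno, Costantino Sirca, Donatella Spano, and Gregory Duveiller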

Eddy Covariance (EC) towers measure ecosystem-atmosphere fluxes and are typically installed in homogeneous landscapes to ensure representativeness. However, sometimes the landscapes can exhibit more heterogeneity than desired, especially when the objective is to link these fluxes with other data sources, such as coarse remote sensing observations, notably in efforts to upscale these fluxes. Accurately integrating EC flux measurements with satellite observations or model-based simulations remains a significant challenge due to the inherent spatial heterogeneity that can occur within the flux footprint. This footprint is also dynamic, changing according to meteorological conditions such as wind speed and direction, while many approaches consider it static for simplicity. This study examines whether modelling the footprint dynamics and describing the underlying fine-scale spatial variability information from remote sensing data, specifically using 20 m Sentinel-2 data as a proxy for the spatial heterogeneity in vegetation structure, can help explain the high frequency (e.g., half-hourly) variability of Gross Primary Production (GPP) estimates from EC. To isolate the contribution of spatial heterogeneity from the dominant effect of incoming radiation, we work on light-normalized fluxes (i.e., GPP/PAR, as a proxy for light-use efficiency) measured at the tower. We hypothesize that combining light-normalized EC fluxes with remote sensing information weighted by dynamically modelled flux footprints provides a more accurate representation of the high-frequency variations in GPP than approaches relying on static footprint representations.

To test our hypothesis, we analyze three ICOS sites characterized by distinct ecosystem types: (i) IT-Noe, a Mediterranean maquis in Italy; (ii) ES-LMa, a typical holm oak savanna in Spain; and (iii) IT-Ren, a subalpine forest in Italy. Our methodology integrates half-hourly EC datasets for GPP and meteorological variables with a Sentinel-2 data cube at 20 m spatial resolution to compute various Vegetation Indices (VIs), including NDVI, EVI, CIR, NDWI, and NIRv. We compare three footprint modeling approaches: (i) Static Footprint (SF), a fixed-area approach with radii of 50, 250, and 500 m; (ii) Climatological Footprint (CF), based on the Flux Footprint Prediction (FFP) model by Kljun et al. (2015) applied as an average over the growing season; and (iii) Dynamic Footprint (DF), providing a dynamic representation of flux for each Sentinel-2 band every 30 minutes.
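The difference between a static and a dynamic footprint can be illustrated with a small Python sketch that averages a vegetation index over pixels using footprint weights. The pixel reflectances and weights below are invented, and the helper names are assumptions; a real workflow would take the weights from a footprint model rather than from hand-written lists.

```python
def ndvi(red, nir):
    """Normalized difference vegetation index from red and near-infrared reflectance."""
    return (nir - red) / (nir + red)

def footprint_weighted_mean(values, weights):
    """Weighted mean of pixel values; weights are footprint contributions (any positive scale)."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("footprint weights must sum to a positive value")
    return sum(v * w for v, w in zip(values, weights)) / total

def light_normalized_gpp(gpp, par):
    """GPP divided by PAR, used in the abstract as a proxy for light-use efficiency."""
    return gpp / par

# Hypothetical 4-pixel neighbourhood: (red, nir) reflectances, invented.
pixels = [(0.05, 0.45), (0.06, 0.40), (0.15, 0.25), (0.20, 0.22)]
vi = [ndvi(r, n) for r, n in pixels]

static_weights = [1, 1, 1, 1]            # every pixel inside a fixed radius counts equally
dynamic_weights = [0.6, 0.3, 0.1, 0.0]   # one half-hour's footprint, concentrated on the vegetated pixels

static_ndvi = footprint_weighted_mean(vi, static_weights)
dynamic_ndvi = footprint_weighted_mean(vi, dynamic_weights)
```

In this toy case the dynamically weighted NDVI is higher than the static average because the footprint for that half-hour samples mostly the densely vegetated pixels; with a different wind direction the weights, and hence the index seen by the tower, would change.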

Preliminary results indicate that incorporating high-resolution Sentinel-2 data to explicitly account for spatial heterogeneity within the flux footprint provides substantial added value for ecosystem flux studies. The comparison between footprint-based approaches and simplified assumptions highlights the importance of capturing fine-scale spatial variability to ensure accurate estimates of GPP, particularly in complex and heterogeneous landscapes.

How to cite: Lucarini, A., E. Pabon-Moreno, D., Sirca, C., Spano, D., and Duveiller, G.: Investigating whether considering spatial heterogeneity within Eddy-Covariance tower footprints can better characterise high-frequency changes in GPP, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-11956, https://doi.org/10.5194/egusphere-egu26-11956, 2026.

15:05–15:15
|
EGU26-9141
|
On-site presentation
“Take one, get two!” - Spatial flux decomposition for eddy covariance towers
Mark Schlutow, Ray Chew, and Mathias Göckede

Eddy covariance (EC) measurement sites are often located in heterogeneous terrain, where the observed ecosystem-exchange fluxes are aggregated over a mosaic of structured patches of different land cover types and mixed ecosystems, which may even act as sources and sinks simultaneously. This complex spatial heterogeneity makes it challenging to identify the controls and processes governing the carbon cycle of the homogeneous sub-units surrounding the tower. As a consequence, for spatiotemporal upscaling of fluxes to large-scale maps, any given tower is, strictly speaking, only representative of the exact same mixture of patches as found in the tower footprint.

We present FLUGS, a novel framework that infers land-cover-specific ecosystem-exchange fluxes given the EC time series of aggregated fluxes and the land cover map of the ecosystem surrounding the EC tower. Using a multitask machine learning approach based on Kernel Ridge Regression combined with high-resolution flux footprints, FLUGS learns the environmental response functions (ERFs) from EC data for each land cover class simultaneously. The approach is versatile, robust to multicollinearity and yields smooth and interpretable ERFs with a unique global optimum. By offering a fast, transparent workflow for spatially decomposing ecosystem fluxes, FLUGS opens new opportunities to attribute EC fluxes to ecological processes, benchmark land-surface models and improve our understanding of land-atmosphere interactions. In terms of data coverage, applying spatial flux decomposition with FLUGS to a single tower effectively multiplies its scientific value, providing land-cover-specific insights equivalent to operating two or more conventional towers, one for each patch type individually. The FLUGS framework is validated against synthetic and real data experiments. The latter uses data from a twin tower site in Northeast Siberia and the STORDALENX25 campaign. Machine learned patch-level ERFs from FLUGS may be used directly for upscaling.
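The basic idea of spatial flux decomposition can be shown with a deliberately simplified Python sketch. It is not the FLUGS method: instead of kernel ridge regression and environmental response functions, it assumes one constant flux per land cover class and two classes only, and recovers those two fluxes from the footprint fractions by ridge-regularised least squares. All names and numbers are hypothetical.

```python
def decompose_two_classes(fractions, fluxes, ridge=0.0):
    """Solve min_f sum_t (a_t1*f1 + a_t2*f2 - F_t)^2 + ridge*(f1^2 + f2^2) in closed form.

    fractions: list of (a_t1, a_t2) footprint fractions of class 1 and class 2
    fluxes:    list of aggregated tower fluxes F_t
    Returns the per-class fluxes (f1, f2).
    """
    s11 = sum(a1 * a1 for a1, _ in fractions) + ridge
    s22 = sum(a2 * a2 for _, a2 in fractions) + ridge
    s12 = sum(a1 * a2 for a1, a2 in fractions)
    b1 = sum(a1 * f for (a1, _), f in zip(fractions, fluxes))
    b2 = sum(a2 * f for (_, a2), f in zip(fractions, fluxes))
    det = s11 * s22 - s12 * s12
    if abs(det) < 1e-12:
        raise ValueError("footprint fractions do not vary enough to separate the classes")
    return (s22 * b1 - s12 * b2) / det, (s11 * b2 - s12 * b1) / det

# Synthetic experiment: class 1 is a sink, class 2 a source (invented values).
true_f1, true_f2 = -4.0, 1.5
fractions = [(0.8, 0.2), (0.5, 0.5), (0.3, 0.7), (0.6, 0.4)]
fluxes = [a1 * true_f1 + a2 * true_f2 for a1, a2 in fractions]

f1, f2 = decompose_two_classes(fractions, fluxes)
```

The sketch also shows why footprint variability matters: if the tower always saw the same mixture of the two classes, the system would be singular and the per-class fluxes could not be separated.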

How to cite: Schlutow, M., Chew, R., and Göckede, M.: “Take one, get two!” - Spatial flux decomposition for eddy covariance towers, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-9141, https://doi.org/10.5194/egusphere-egu26-9141, 2026.

15:15–15:25
|
EGU26-1419
|
ECS
|
Virtual presentation
Statistical Upscaling of Point-Based Sediment Observations to Continental-Scale Maps Using 40 Years of Landsat and Explainable Machine Learning
(withdrawn)
Gültekin Erten
15:25–15:35
|
EGU26-14446
|
ECS
|
On-site presentation
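Physics-guided machine learning improves spatial structure and transferability in high-resolution NO2 mapping under sparse observations
Wenfu Sun, Frederik Tack, Lieven Clarisse, and Michel Van Roozendael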

Machine learning has become an important tool for producing high-resolution environmental maps, as traditional chemistry-transport models often face limitations in computational cost and spatial detail at the kilometer scale and hourly resolution. At such high spatiotemporal resolution, target fields become highly dynamic and spatially heterogeneous, while ground observations remain sparse. This raises a key question: how can we improve physical consistency and recover realistic spatial structure (e.g., transport-related spatial patterns) when reconstructing high spatiotemporal resolution fields from sparse stations?

We address this question by systematically comparing three machine-learning models for hourly surface mapping of NO2, a critical air pollutant, at 2 km resolution over Western Europe. All models use the same inputs, including static emission-related fields, satellite remote-sensing products, and meteorological variables, constrained by ground-based measurements from the European Environment Agency’s AirBase network.

Model A is trained using station observations only. Model B extends Model A by introducing wind-driven advection encoding to explicitly consider atmospheric transport. Model C further builds on Model B by incorporating a pretraining stage informed by hourly gridded NO2 fields at a coarser resolution (10 km) from the Copernicus Atmosphere Monitoring Service (CAMS) European reanalysis. Model B and Model C represent two physics-guided machine learning paradigms.

In the study region, Model A and Model B show similar predictive performance at unobserved stations and similar structural similarity to CAMS fields, while Model C performs best. However, compared to Model A, both Model B and Model C can reproduce plume-like structures that respond coherently to wind-field perturbations, such as changes in plume orientation under altered wind directions. We have also conducted a transfer learning experiment in Central Europe and found that Model C achieves the highest transferability in terms of maintaining spatial structure.

Overall, our results demonstrate that, at high spatiotemporal scales, although including simple advection physics can recover the pollutant's transport, training on stations alone is insufficient to capture dynamics and physically plausible patterns. In contrast, pretraining with large-scale simulation data can more significantly improve spatial structure, physical sensitivity, and transferability, as well as station-based metrics. Our study highlights the importance of pretraining with large-scale simulations for improving physically consistent, transferable learning in complex environmental systems with sparse observations.

How to cite: Sun, W., Tack, F., Clarisse, L., and Van Roozendael, M.: Physics-guided machine learning improves spatial structure and transferability in high-resolution NO2 mapping under sparse observations, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-14446, https://doi.org/10.5194/egusphere-egu26-14446, 2026.

15:35–15:45
|
EGU26-20910
|
ECS
|
On-site presentation
Mechanistic Interpretability for Mapping Ecosystem Functioning
Vitus Benson, Martin Jung, Sebastian Hoffmann, Christian Reimers, Alexander J. Winkler, Qi Yang, and Markus Reichstein

Mapping ecosystem properties and functioning from Earth observation data remains fundamentally an extrapolation problem. Ground-based measurements of ecosystem processes, such as carbon fluxes, are sparse and geographically biased towards the Global North. Machine-learning models are therefore trained on limited labeled data and subsequently upscaled globally using environmental covariates derived from satellite remote sensing and reanalysis products. A central challenge is ensuring that such models generalize robustly beyond their training domain, rather than exhibiting spurious confidence or biased predictions in poorly observed regions.

In this contribution, we explore how recent advances in mechanistic interpretability and self-supervised representation learning from AI safety research can help address these challenges. In particular, sparse autoencoders (SAEs), and more specifically Top-K sparse autoencoders, have recently been used to disentangle deep neural representations into interpretable and steerable concepts in large language models. We propose to adapt these methods to Earth system science, with the goal of learning sparse, disentangled, and spatially meaningful latent representations of ecosystem-relevant variables.

We first evaluate this approach on a self-supervised proxy task: compressing and reconstructing mean seasonal cycles derived from MODIS remote sensing products and ERA5 climate reanalysis data. Using a Matryoshka BatchTopK SAE, we obtain latent features that are highly localized in space, with individual features activating only over specific regions of the Earth. In contrast to dense embeddings, e.g. from variational auto-encoders, our approach offers a control on the average sparsity level. In other words, this intrinsic, data-driven partitioning of geographic space can be interpreted as emergent climate regimes or ecosystem types, without relying on predefined biome maps or expert labels. 
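The sparsity mechanism referred to above can be sketched without any deep learning library. The snippet contrasts a per-sample Top-K activation with a batch-level variant that fixes only the average number of active features per sample. It covers the activation step alone, not a trained autoencoder or the Matryoshka structure, and the function names and values are assumptions for illustration.

```python
def top_k(activations, k):
    """Keep the k largest entries of one activation vector and zero the rest."""
    keep = set(sorted(range(len(activations)), key=lambda i: activations[i], reverse=True)[:k])
    return [a if i in keep else 0.0 for i, a in enumerate(activations)]

def batch_top_k(batch, k):
    """Keep the k * len(batch) largest entries across the whole batch.

    Individual samples may keep more or fewer than k entries; only the
    average number of active features per sample is fixed at k.
    """
    flat = [(a, r, c) for r, row in enumerate(batch) for c, a in enumerate(row)]
    keep = {(r, c) for _, r, c in sorted(flat, reverse=True)[: k * len(batch)]}
    return [[a if (r, c) in keep else 0.0 for c, a in enumerate(row)] for r, row in enumerate(batch)]

# Hypothetical pre-activations of 4 latent features for 2 samples (invented).
batch = [[0.9, 0.8, 0.7, 0.1],
         [0.2, 0.05, 0.3, 0.01]]

per_sample = [top_k(row, 2) for row in batch]
across_batch = batch_top_k(batch, 2)
```

Here the per-sample rule keeps exactly two features for each sample, whereas the batch-level rule keeps three for the first sample and one for the second, so the average of two is preserved while samples with stronger activations may use more features.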

Building on these results, we apply the SAE framework to the mapping of ecosystem carbon fluxes, using FluxNet tower observations as ground truth. The sparse and disentangled latent structure provides a transparent link between remote sensing inputs and predicted ecosystem functioning. Simultaneous training on a self-supervised reconstruction task and on predicting net ecosystem exchange provides competitive performance, with the sparsity of the features offering a promising avenue to enhance robustness by controlling the extrapolation behavior of the neural network. Beyond predictive performance, we introduce an interpretability workflow that enables systematic inspection of learned features, supporting model diagnostics and scientific analysis.

Overall, we argue that self-supervised, interpretable representation learning offers a promising pathway toward robust global ecosystem mapping from both labeled and unlabeled satellite data. This approach leverages the full scale of Earth observation archives while improving trust and insight in mapping ecosystem properties and functioning. In addition, it sheds light on geographical partitioning, offering a novel perspective on decade-old maps of plant functional types.

How to cite: Benson, V., Jung, M., Hoffmann, S., Reimers, C., Winkler, A. J., Yang, Q., and Reichstein, M.: Mechanistic Interpretability for Mapping Ecosystem Functioning, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-20910, https://doi.org/10.5194/egusphere-egu26-20910, 2026.

Posters on site: Fri, 8 May, 14:00–15:45 | Hall X1

The posters scheduled for on-site presentation are only visible in the poster hall in Vienna. If authors uploaded their presentation files, these files are linked from the abstracts below.
Display time: Fri, 8 May, 14:00–18:00
Chairpersons: Hanna Meyer, Alvaro Moreno, Jacob Nelson
X1.86
|
EGU26-3585
|
ECS
Continuous Plant Trait Vectors using Generative AI
Gayathri Girish Nair, Camille Abadie, Midori Yajima, Luke Daly, and Silvia Caldararu

Plant morphological and physiological trait combinations exist on an almost continuous spectrum across varying climate conditions and geographic locations. Currently, a dominant but limiting approach to capturing this diversity, for example within Earth System Models (ESMs), is to discretize it into a few largely arbitrary Plant Functional Type (PFT) categories (e.g. tropical broad-leaved deciduous, C3 grass, temperate needle-leaved evergreen, etc.) based on broad functional similarities and responses to the environment, leading to considerable information loss.

Given recent advances in generative Artificial Intelligence (AI), it is now possible to develop Deep Learning (DL) models that can learn the distribution of plant trait vectors conditioned on varying environmental factors. This work explores using generative modelling approaches such as conditional variational autoencoders and flow matching to train a Neural Network (NN) to learn the joint distribution of 26 plant traits from the TRY Plant Trait Database under different environmental conditions across the globe. Generation is conditioned on climate variables from the ERA5-Land reanalysis dataset and the Copernicus Digital Elevation Model, fetched via Google Earth Engine, alongside soil properties obtained from the ISRIC WISE30sec dataset.

Outputs of such a trained model can contribute towards downscaling and gap-filling approaches, as well as studies trying to understand plant responses under changing climate conditions. Furthermore, trained hidden layer output embeddings, being Continuous Plant Trait Vectors (CPTVs), better capture the spectrum of varying trait combinations. Such information-rich CPTVs have the potential to be viable alternatives to PFT classes with respect to the parameterization of Earth System Functions within ESMs. The model itself serves as a tool for furthering understanding of plant functional adaptations through exploration of the learned trait space via cluster analysis, enabling the identification of latent structure, relationships, and patterns, as well as supporting hypothesis generation and comparative analysis across populations or conditions.

How to cite: Girish Nair, G., Abadie, C., Yajima, M., Daly, L., and Caldararu, S.: Continuous Plant Trait Vectors using Generative AI, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-3585, https://doi.org/10.5194/egusphere-egu26-3585, 2026.

X1.87
|
EGU26-16811
|
ECS
High-resolution upscaling of ecosystem carbon fluxes using Sentinel-2 and explainable AI
Theo Glauch, Julia Marshall, and Marcia Kroker

High-resolution estimates of ecosystem carbon dioxide exchange are essential for interpreting atmospheric CO₂ observations and quantifying natural and anthropogenic carbon budgets across spatial scales. Most state-of-the-art data-driven biosphere models rely on MODIS or VIIRS products at 500 m resolution to upscale eddy-covariance flux measurements, despite the strong spatial heterogeneity of many landscapes and the limited representativeness of individual flux towers. Recent advances in satellite remote sensing, particularly the Sentinel-2 constellation, enable data-driven upscaling of ecosystem carbon fluxes at 10 m resolution and offer new opportunities to better align reference measurements with model inputs.

In this contribution, we present a novel explainable machine-learning framework that combines Sentinel-2 observations with meteorological data to predict net ecosystem exchange, gross primary productivity, and ecosystem respiration across a wide range of ecosystem types in Europe, including different crop species. A key methodological aspect is the explicit alignment of eddy-covariance footprint estimates with high-resolution Sentinel-2 data, which improves model training under non-independent and spatially heterogeneous reference data conditions typical of European landscapes.

We demonstrate that this footprint-aware upscaling strategy leads to improved flux estimates and more robust spatial predictions. Using explainable AI techniques, we further analyse feature contributions and extract ecosystem-specific temperature dependencies of photosynthesis and respiration, enhancing process understanding beyond purely predictive performance. Finally, we show how the resulting models can be applied to generate spatially explicit CO₂ flux maps from urban to continental scales while accounting for the representativeness of individual flux towers and reducing extrapolation artefacts.

How to cite: Glauch, T., Marshall, J., and Kroker, M.: High-resolution upscaling of ecosystem carbon fluxes using Sentinel-2 and explainable AI, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-16811, https://doi.org/10.5194/egusphere-egu26-16811, 2026.

X1.88
|
EGU26-19861
Using the pyVPRM framework to estimate biospheric carbon fluxes from city to global scales
Julia Marshall, Theo Glauch, Marcia Kroker, and Philina Voss

The Vegetation Photosynthesis and Respiration Model (VPRM) is a data-driven light-use-efficiency model for estimating biospheric carbon dioxide fluxes based on satellite-derived vegetation indices, such as the Enhanced Vegetation Index (EVI) and the Land Surface Water Index (LSWI), which provide high spatial resolution information on land surface conditions. High temporal resolution is achieved through meteorological driving data, including 2 m air temperature and surface shortwave radiation. The model parameters are calibrated for each vegetation type using regional eddy-covariance flux measurements from previous years. VPRM is a well-established approach that has been widely applied to quantify gross primary productivity and ecosystem respiration and to interpret atmospheric CO₂ concentration measurements in terms of biogenic and anthropogenic flux contributions. In many cases VPRM fluxes are also used as priors for atmospheric inversions.
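For orientation, the general structure of a VPRM-type light-use-efficiency model can be sketched in Python following the commonly cited formulation: a temperature scalar, a water scalar based on LSWI, a saturating light response, and respiration as a linear function of temperature. This is a simplified sketch, not the pyVPRM API; the phenology scalar is taken as 1, no low-temperature floor is applied to respiration, and all parameter values and names are invented.

```python
def t_scale(t, t_min, t_opt, t_max):
    """Temperature response: 1 at t_opt, falling to 0 at t_min and t_max."""
    if t <= t_min or t >= t_max:
        return 0.0
    num = (t - t_min) * (t - t_max)
    return num / (num - (t - t_opt) ** 2)

def w_scale(lswi, lswi_max):
    """Water stress scalar from the Land Surface Water Index."""
    return (1 + lswi) / (1 + lswi_max)

def gross_uptake(evi, lswi, par, t, p):
    """Light-use-efficiency uptake with a saturating PAR response (phenology scalar assumed 1)."""
    return (p["lam"] * t_scale(t, p["t_min"], p["t_opt"], p["t_max"])
            * w_scale(lswi, p["lswi_max"]) * evi * par / (1 + par / p["par0"]))

def respiration(t, p):
    """Ecosystem respiration as a linear function of air temperature."""
    return p["alpha"] * t + p["beta"]

def nee(evi, lswi, par, t, p):
    """Net ecosystem exchange: respiration minus gross uptake (negative means uptake)."""
    return respiration(t, p) - gross_uptake(evi, lswi, par, t, p)

# Hypothetical parameter set for one vegetation class (values invented).
params = {"lam": 0.2, "par0": 500.0, "alpha": 0.1, "beta": 1.0,
          "t_min": 0.0, "t_opt": 20.0, "t_max": 40.0, "lswi_max": 0.4}

night_nee = nee(evi=0.5, lswi=0.2, par=0.0, t=20.0, p=params)
day_nee = nee(evi=0.5, lswi=0.2, par=1000.0, t=20.0, p=params)
```

With zero incoming radiation the uptake term vanishes and the net flux equals respiration, while under daylight the same pixel becomes a net sink; calibrating the parameters per vegetation type against eddy-covariance data is what ties such a model to observations.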

pyVPRM is an open and modular Python-based framework that facilitates the application of VPRM across a wide range of spatial scales, from urban domains to continental and global analyses. Its flexible design allows users to combine different satellite products (e.g. MODIS, VIIRS, Sentinel-2), land-cover classifications (e.g. ESA WorldCover, Copernicus Dynamic Land Cover, MapBiomas), and meteorological data sources (e.g. local observations or reanalysis products such as ERA5).

In this poster, we present recent developments in the pyVPRM framework, demonstrate typical application workflows, and discuss best practices for model configuration and evaluation. A central aim of this contribution is to engage with the user community, gather feedback on current capabilities and limitations, and discuss future directions for collaborative model development and applications.

How to cite: Marshall, J., Glauch, T., Kroker, M., and Voss, P.: Using the pyVPRM framework to estimate biospheric carbon fluxes from city to global scales, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-19861, https://doi.org/10.5194/egusphere-egu26-19861, 2026.

X1.89
|
EGU26-79
|
ECS
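Applying a unidimensional convolutional neural network for accurate land cover mapping in large areas: A case of study of the Guadiana Hydrographic Demarcation (Spain)
Antonio Vidal Llamas, Carolina Acuña-Alonso, Diego Barba-Barragáns, and Xana Álvarez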

 

Land use changes are one of the main drivers of global change and are occurring at an accelerating rate. Therefore, obtaining accurate and up-to-date knowledge of the Earth's surface is essential. This paper aims to produce a land cover map for the Guadiana Hydrographic Demarcation (Spain), a region under diverse environmental pressures and part of one of the largest basins on the Iberian Peninsula. A 1D convolutional neural network (1D-CNN) deep learning method was applied to Sentinel-2 satellite imagery, yielding promising results with high accuracy when compared to other methods. A land cover map for the summer of 2022 was generated with a resolution of 10 x 10 m. Several differences were detected in the coverage of various classes when compared to the previously available data from Spain's Land Occupation Information System (SIOSE) 2014 reference layer. Notably, “agricultural lands”, which cover more than half of the study area, showed a 7.34 % increase, while “broadleaf” areas exhibited a 7.75 % decrease over the total study area. Greater congruences were found in the larger classes between the two maps. The methodology demonstrated a remarkably high accuracy of 0.96. However, only 59.97 % agreement with the SIOSE layer was observed, due to differences in time periods, minimum representation sizes, and classification accuracies. The high accuracy achieved over such a large area underscores the potential of Sentinel imagery and neural networks for land cover classification, addressing some of the limitations of existing land cover products.

How to cite: Vidal Llamas, A., Acuña-Alonso, C., Barba-Barragáns, D., and Álvarez, X.: Applying a unidimensional convolutional neural network for accurate land cover mapping in large areas: A case of study of the Guadiana Hydrographic Demarcation (Spain), EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-79, https://doi.org/10.5194/egusphere-egu26-79, 2026.

X1.90 | EGU26-4160
Jungho Im, Bokyung Son, Taejun Sung, and Sejeong Bae

With the increasing emphasis on climate change and carbon neutrality, accurately quantifying gross primary productivity (GPP) has become a key strategic objective. The spatiotemporal variability of GPP across vegetation types underscores the necessity of high-resolution data for precise estimation. While satellite imagery is a valuable tool for large-scale GPP monitoring, its effectiveness is constrained by trade-offs between spatial and temporal resolution, which particularly affect accuracy in heterogeneous vegetated areas. To address this limitation, we proposed a novel framework named UNified, high-resolution Intelligent carbon QUantification and Estimation (UNIQUE), which generates 30 m GPP maps by learning the spatial relationships between daily 500 m MODIS and 16-day 30 m Landsat imagery. The UNIQUE framework comprises two steps. In the first step, two independent artificial intelligence models were developed to estimate daily GPP using MODIS and Landsat vegetation indices tailored to their respective temporal resolutions, combined with meteorological reanalysis data. These models were trained and validated using 309 eddy-covariance flux observations from the Northern Hemisphere. Hereafter, GPPM denotes the AI-based GPP estimated from MODIS data and GPPL the AI-based GPP estimated from Landsat data. Among the various AI algorithms tested using AutoML packages, the light gradient boosting machine model demonstrated the best performance. For GPPM, it achieved an r of 0.80 and a root mean squared error (RMSE) of 2.47 gC/m2/day in a 20-fold spatial cross-validation. Similarly, for GPPL, the model achieved an r of 0.83 and an RMSE of 2.43 gC/m2/day. In the second step of UNIQUE, we downscaled GPPM to produce GPPL-like daily 30 m GPP maps using a generative AI model, the denoising diffusion probabilistic model (DDPM).
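The idea behind a spatial cross-validation can be sketched as follows: every observation from a given flux site is placed in the same fold, so a model is never validated on a site it has seen during training. The site labels, the function names, and the random site-to-fold assignment are assumptions for illustration; the abstract does not state how the 20 folds were constructed (for example, they could be based on spatial blocks).

```python
import random


def spatial_folds(site_ids, n_folds, seed=0):
    """Assign each unique site to exactly one fold; return {site: fold_index}."""
    sites = sorted(set(site_ids))
    if n_folds > len(sites):
        raise ValueError("more folds than sites")
    rng = random.Random(seed)
    rng.shuffle(sites)
    return {site: i % n_folds for i, site in enumerate(sites)}


def split_indices(site_ids, fold_of_site, held_out_fold):
    """Return (train, test) sample indices for one held-out fold."""
    train = [i for i, s in enumerate(site_ids) if fold_of_site[s] != held_out_fold]
    test = [i for i, s in enumerate(site_ids) if fold_of_site[s] == held_out_fold]
    return train, test


# Hypothetical site label for each daily observation.
sample_sites = ["A", "A", "B", "B", "B", "C", "D", "D", "E", "E"]

folds = spatial_folds(sample_sites, n_folds=5)
train_idx, test_idx = split_indices(sample_sites, folds, held_out_fold=0)

train_sites = {sample_sites[i] for i in train_idx}
test_sites = {sample_sites[i] for i in test_idx}
print(train_sites, test_sites)
```

Compared with a purely random split of observations, this kind of grouping usually gives a more conservative estimate of how well the model transfers to unseen locations.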
This process was applied to South Korea, which is characterized by predominantly mountainous terrain and heterogeneous land cover. To produce reliable 30 m GPP maps corresponding to real-world land cover, two schemes were employed: (1) a DDPM model that uses only GPPM as input (GPPUNIQUE (S1)) and (2) a DDPM model incorporating high-resolution topographic information from 30 m digital elevation models and fractional land cover ratios within each 30 m pixel, derived from 1 m land cover data provided by the Korean Ministry of Environment (GPPUNIQUE (S2)). Training data were randomly extracted from 2020 to 2022 as 150 × 150-pixel patches, each covering 4,500 m × 4,500 m. The test dataset was constructed using data from 2023. GPPUNIQUE (S2) outperformed both GPPUNIQUE (S1) and GPPM, demonstrating the lowest average RMSE (2.24 gC/m2/day). In contrast, GPPUNIQUE (S1) showed an RMSE of 3.36 gC/m2/day, which is higher than that of GPPM (2.85 gC/m2/day). Incorporating auxiliary variables with high spatial detail (here, topography and fractional land cover data) proved essential for producing stable generated images that accurately correspond to real-world land cover. GPPUNIQUE (S2) effectively identified areas of carbon uptake that were previously undetectable with MODIS data alone. Furthermore, this approach enabled the analysis of spatiotemporal characteristics of GPP across different plant functional types, facilitating enhanced high-resolution carbon flux monitoring in diverse ecosystems.

How to cite: Im, J., Son, B., Sung, T., and Bae, S.: Deep learning framework for high spatiotemporal resolution monitoring of carbon uptake using multi-source satellite imagery, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-4160, https://doi.org/10.5194/egusphere-egu26-4160, 2026.

X1.91 | EGU26-6380
Dan-Xia Song, Ziyi Chen, Sixuan Qi, and Tao He

Fractional vegetation cover (FVC) is widely used to characterize vegetation conditions, yet its accuracy in mountainous regions remains highly uncertain due to complex terrain effects. Focusing on the Hi-GLASS FVC product, this study evaluates its performance in mountainous regions and proposes two improvement methods: a terrain-correction method (TC) and a multi-feature fusion method (MF). In the TC method, terrain-corrected surface reflectance is used as input to the Hi-GLASS FVC model. The MF method improves FVC estimation by incorporating multiple additional features, including observation geometry, topographic parameters, and vegetation indices. It is implemented as two models: a full-feature model (MF-ALL) and an optimized model using recursive feature elimination (MF-RFE). Using very high resolution (VHR) reference data, we quantitatively evaluated the accuracy of the two methods (TC and MF) over mountainous regions in China and the United States. The results reveal notable regional differences. In China, the MF-RFE model achieved the best performance, increasing R² by 62% relative to Hi-GLASS, slightly outperforming the MF-ALL model, while the TC method improved overall accuracy but reduced R² on sunny slopes by approximately 14%. In the United States, the MF-ALL model performed best, increasing R² by 42% over Hi-GLASS and slightly surpassing MF-RFE, whereas the TC method led to an overall accuracy decline. Further analysis showed that topography and vegetation type significantly influenced FVC estimation accuracy. Higher accuracy was generally observed on sunny slopes compared with shady slopes, with greater relative improvements on shady slopes; accuracy decreased with increasing slope; and forests exhibited larger improvements than non-forest vegetation types. 
Overall, the MF method substantially enhances the accuracy and robustness of mountainous FVC estimation compared with the TC method, providing a reliable framework for vegetation monitoring, carbon cycle assessment, and ecosystem management under complex terrain conditions.

How to cite: Song, D.-X., Chen, Z., Qi, S., and He, T.: Improving and validating the Hi-GLASS FVC product over mountainous regions in China and the United States using very-high-resolution satellite imagery, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-6380, https://doi.org/10.5194/egusphere-egu26-6380, 2026.

X1.92 | EGU26-6504 | ECS
Seoyeong Ku, Jongjin Baik, Seunghyun Hwang, and Changhyun Jun

Gross Primary Productivity (GPP) plays a central role in regulating terrestrial carbon uptake, yet commonly used satellite-based GPP products are provided at multi-day temporal resolutions, limiting their ability to capture rapid ecosystem responses to short-term environmental variability. This temporal constraint is particularly critical under increasing occurrences of extreme weather events, where sub-daily vegetation dynamics remain poorly understood. In this study, we propose a machine-learning-based framework to generate hourly GPP estimates at moderate spatial resolution across the Korean Peninsula. The approach integrates satellite-derived vegetation indices with reanalysis-based hydrometeorological variables and explicitly accounts for land-cover heterogeneity by constructing independent models for major land-cover classes. To enhance model interpretability and efficiency, a feature selection strategy was applied to identify key environmental drivers of GPP variability for each land-cover type. Model performance was evaluated using temporally independent datasets, demonstrating that hourly GPP estimates aggregated to multi-day scales are consistent with existing satellite GPP products, while additionally capturing realistic diurnal cycles and seasonal patterns. The results indicate that a reduced set of influential variables can preserve predictive skill while improving computational efficiency. The proposed framework provides a practical pathway for temporally downscaling widely available satellite GPP products to sub-daily resolution in regions with limited ground observations. This capability offers new opportunities to investigate vegetation productivity responses to short-term climatic extremes such as heatwaves and droughts, contributing to improved understanding of ecosystem carbon dynamics under a changing climate.
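The consistency check described above, in which hourly estimates are aggregated to a multi-day scale before comparison with an existing product, can be sketched as below. The 8-day window, the synthetic half-sine diurnal cycle, and the function name are assumptions for illustration; the abstract only says "multi-day".

```python
import math


def aggregate_hourly(hourly, hours_per_window):
    """Average consecutive blocks of hourly values; an incomplete final block is dropped."""
    if hours_per_window <= 0:
        raise ValueError("hours_per_window must be positive")
    n_windows = len(hourly) // hours_per_window
    return [
        sum(hourly[w * hours_per_window:(w + 1) * hours_per_window]) / hours_per_window
        for w in range(n_windows)
    ]


# Synthetic diurnal cycle: zero at night, half-sine between 06:00 and 18:00.
one_day = [max(0.0, math.sin(math.pi * (hour - 6) / 12)) for hour in range(24)]
hourly_gpp = one_day * 16  # sixteen identical days

# Hypothetical 8-day compositing window (8 days x 24 hours).
eight_day_means = aggregate_hourly(hourly_gpp, hours_per_window=8 * 24)
print(eight_day_means)
```

The aggregated means can then be compared with a multi-day product, while the hourly series keeps the diurnal peak that the averages smooth out.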

 

Acknowledgement

This work was supported by the Korea Environmental Industry & Technology Institute (KEITI) through the Wetland Ecosystem Value Evaluation and Carbon Absorption Value Promotion Technology Development Project, funded by the Korea Ministry of Climate, Energy and Environment (MCEE) (RS-2022-KE002066), and supported by the National Research Foundation of Korea (NRF) funded by the Ministry of Education (RS-2024-00465925) and by the Korea government (MSIT) (RS-2024-00334564 & RS-2021-NR060085).

How to cite: Ku, S., Baik, J., Hwang, S., and Jun, C.: Ensemble machine learning for sub-daily downscaling of satellite-derived gross primary productivity, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-6504, https://doi.org/10.5194/egusphere-egu26-6504, 2026.

X1.93 | EGU26-12569
Sander Vos, Tegan Blount, Roderik Lindenbergh, José Antolinez, and Marco Marani

Salt marshes worldwide face ongoing climate change, including variations in local marine and meteorological forcing. Their resilience against relative sea level rise is partly dependent on organic soil production driven by vegetation development.

The Leaf Area Index (LAI) is a key indicator for quantifying plant growth and ecosystem productivity and for characterizing local vegetation distribution. However, area-wide LAI mapping from in situ measurements is challenging in inaccessible swampy and silty areas. Aerial and satellite-mounted laser and imaging data have been used to augment in situ LAI measurements, but a general methodology is lacking. Multi-sensor data fusion is an emerging area of research for improving LAI determination. This abstract explores a novel data fusion technique that uses an evolutionary AI model to map both lidar 3D geometrical and multispectral vegetation data to LAI ground measurements.

A combined drone-based survey acquiring both lidar and multispectral imagery (Green, Red, Red Edge and Near Infrared) was conducted in autumn 2025 at San Felice salt marsh in the Venice Lagoon (Italy), a marsh that is shrinking and drowning due to microtide and reduced inorganic sediment input. Both the lidar and multispectral flights were flown at around 30 meters above ground and processed into geo-referenced point clouds and multispectral orthomosaics. The data sources were subsequently merged into a multispectral point cloud by adding the nearest (in X-Y coordinates) multispectral information to each point in the point cloud. Ground-based in situ LAI measurements were obtained in 40 vegetation patches spread over the survey area.

The multispectral point cloud was subsequently divided into adjacent hexagonal cells (0.5 m radius), with the information per cell summarized by 19 parameters. The multispectral color information (4 bands) was reduced to a 4 × 4 averaged covariance matrix, while a light reduction function (based on the Beer-Lambert law, 3 parameters) modeled the attenuation of lidar returns with increasing height.
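To give a feel for a Beer-Lambert-type attenuation fit, the sketch below estimates an extinction coefficient from how quickly a return count decays with depth into the canopy. It uses a simplified two-parameter form, I(d) = I0 · exp(−k · d), fitted by least squares on the logarithm; the three-parameter function used in the study is not specified in the abstract, and the data and names here are synthetic and hypothetical.

```python
import math


def fit_exponential_decay(depths, intensities):
    """Least-squares fit of ln(I) = ln(I0) - k * depth; returns (I0, k).

    All intensities must be strictly positive so the logarithm is defined.
    """
    if len(depths) != len(intensities) or len(depths) < 2:
        raise ValueError("need at least two paired samples")
    logs = [math.log(v) for v in intensities]
    n = len(depths)
    mean_d = sum(depths) / n
    mean_l = sum(logs) / n
    sxx = sum((d - mean_d) ** 2 for d in depths)
    sxy = sum((d - mean_d) * (l - mean_l) for d, l in zip(depths, logs))
    slope = sxy / sxx
    intercept = mean_l - slope * mean_d
    return math.exp(intercept), -slope


# Synthetic, noise-free profile: depth below canopy top (m) and return counts.
depths = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5]
true_i0, true_k = 100.0, 3.0
counts = [true_i0 * math.exp(-true_k * d) for d in depths]

i0, k = fit_exponential_decay(depths, counts)
print(i0, k)
```

With real lidar returns the profile is noisy and may contain empty height bins, so a nonlinear fit on the raw counts could be a more robust choice than the log-linear shortcut used here.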

An Artificial Neural Network (ANN) model was trained using an evolutionary algorithm to find an optimized ANN that couples the multispectral point cloud parameters in the 40 ground patches to local LAI values. The architecture was varied between 1 and 3 hidden layers with 20 to 60 nodes per hidden layer. The data were split 80%/20%, with 80% used for training and the remainder for evaluating predictions. The best model achieved a high prediction accuracy (R² = 0.906, RMSE = 0.11) but showed a tendency to underestimate LAI values, possibly reflecting spectral saturation in denser vegetation. An example of a continuous salt marsh LAI map is shown in figure 1.
The data fusion approach offers a promising technique towards improved LAI mapping, contributing to a better understanding of salt marsh responses to climate change.

 

How to cite: Vos, S., Blount, T., Lindenbergh, R., Antolinez, J., and Marani, M.: Salt marsh Leaf Area Index determination with AI driven aerial lidar and multispectral data fusion, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-12569, https://doi.org/10.5194/egusphere-egu26-12569, 2026.

X1.94 | EGU26-17883 | ECS
Franz Schulze, Johannes Loew, Julia Pöhlitz, and Christopher Conrad

Accurate crop rotation monitoring is essential for sustainable agricultural management, supporting policy compliance, soil health assessment, and climate-resilient farming practices. Earth observation-based crop classification has become operational across Germany, with models producing annual outputs since Sentinel-2's launch in 2017. While these systems report high single-year accuracies, their reliability for multi-temporal applications, particularly rotation pattern detection, remains insufficiently evaluated.

This study assesses the performance of two operational German crop classification models, from the Thünen Institute and the German Aerospace Center (DLR), for rotation analysis in Saxony-Anhalt, testing their applicability beyond their original training regions. We processed multi-year classification outputs (2017–2024) using CropRotViz, an open-source R package specifically designed for handling temporal intersection, change detection and rotation pattern visualization. Model outputs were validated against Land Parcel Identification System (LPIS) reference data, evaluating both spatial accuracy and temporal consistency, the latter being critical for reliable rotation monitoring. Rotation sequences of 3, 4, and 5 years were analyzed.

Preliminary results revealed a significant performance gap between single-year classification accuracy and multi-year rotation detection reliability. The DLR and Thünen models achieved annual accuracies of 0.81–0.90, with variability across years and crop types. However, when comparing overlapping areas with LPIS data across multi-year sequences (3-, 4-, and 5-year rotations), accuracies dropped substantially to 0.36–0.57. These errors compound over time, limiting model utility for applications requiring temporal stability, such as crop diversification monitoring, compliance verification for sustainable farming schemes, or assessing rotation impacts on soil health and carbon sequestration.
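Why sequence-level accuracy falls so far below annual accuracy can be illustrated with a small sketch: a rotation counts as correct only if every year in the sequence is classified correctly. The field sequences below are invented, the function names are hypothetical, and the independence of yearly errors in `expected_sequence_accuracy` is a simplifying assumption rather than a finding of the study.

```python
def annual_accuracy(predicted, reference, year_index):
    """Share of fields whose crop label is correct in one given year."""
    hits = sum(1 for p, r in zip(predicted, reference) if p[year_index] == r[year_index])
    return hits / len(reference)


def sequence_accuracy(predicted, reference):
    """Share of fields whose entire multi-year sequence matches the reference."""
    hits = sum(1 for p, r in zip(predicted, reference) if tuple(p) == tuple(r))
    return hits / len(reference)


def expected_sequence_accuracy(annual_acc, n_years):
    """Expected sequence accuracy if yearly errors were independent (an assumption)."""
    return annual_acc ** n_years


# Invented 3-year sequences for five fields (reference vs. classified).
reference = [
    ("wheat", "rape", "barley"),
    ("maize", "wheat", "wheat"),
    ("beet", "wheat", "barley"),
    ("wheat", "maize", "wheat"),
    ("rape", "wheat", "barley"),
]
predicted = [
    ("wheat", "rape", "barley"),
    ("maize", "barley", "wheat"),   # year 2 misclassified
    ("beet", "wheat", "barley"),
    ("maize", "maize", "wheat"),    # year 1 misclassified
    ("rape", "wheat", "barley"),
]

per_year = [annual_accuracy(predicted, reference, y) for y in range(3)]
whole_sequence = sequence_accuracy(predicted, reference)
print(per_year, whole_sequence)
print(expected_sequence_accuracy(0.85, 3), expected_sequence_accuracy(0.85, 5))
```

Under the independence assumption, an annual accuracy of 0.85 would correspond to roughly 0.61 for 3-year and 0.44 for 5-year sequences; correlated errors on the same fields would change these figures.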

Our findings highlight a critical challenge for operational EO-based agricultural monitoring: current validation frameworks that emphasize annual accuracy may inadequately assess suitability for sustainability-relevant applications requiring field-level temporal consistency. To transition from observation to actionable agricultural management support, classification systems must explicitly optimize for temporal robustness. We recommend incorporating rotation-specific validation metrics and developing approaches that leverage temporal context during classification to enhance consistency.

This work contributes to improving large-scale agroecosystem monitoring capabilities by identifying limitations in current operational systems and providing methodological tools (CropRotViz) for temporal analysis. Enhanced rotation monitoring supports evidence-based sustainable management, from precision agriculture to policy evaluation for climate-resilient farming transitions.

How to cite: Schulze, F., Loew, J., Pöhlitz, J., and Conrad, C.: EO-Based Crop Classification for Rotation Monitoring – Evaluating Temporal Consistency of Operational Models for Sustainable Agricultural Management, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-17883, https://doi.org/10.5194/egusphere-egu26-17883, 2026.

X1.95 | EGU26-21461 | ECS
Maurício Lima, Alexander Winkler, and Christian Reimers

Understanding and predicting the relation between plant productivity and meteorological drivers is central to ecosystem and climate science. Existing approaches fall into two broad categories: process-based models and data-driven models. Process-based models can represent causal relationships and allow users to prescribe and perturb variables, but at global scales they are either computationally expensive or simplified to the point that key processes and ecosystem diversity are lost. Data-driven models (e.g., FluxCom) produce only mean responses and therefore miss internal variability of meteorology, vegetation state, and fluxes. Because these approaches impose a fixed split between inputs and outputs, one must decide in advance which variables can be conditioned on and which will be predicted, which constrains the effect of perturbations and limits experimentation. We address these complementary shortcomings by developing a probabilistic model of vegetation and weather state variables using generative diffusion models trained on FluxNet data. As a consequence, the model can sample plausible trajectories that reflect the full distribution. We demonstrate two key capabilities. First, the model functions as a data-driven emulator that can be conditioned on specified inputs, such as prescribed temperature, radiation, or soil moisture, while producing ensemble outputs that capture uncertainty and internal state variability. This enables users to explore vegetation responses similarly to typical mechanistic models, but at a fraction of the computational cost and with observational grounding. Second, we exploit the stochastic model to analyze vegetation responses to extreme weather events. Unlike approaches predicting the mean, our diffusion-based emulator reveals how extreme meteorological inputs alter the tails of the vegetation response distributions.
By bridging the gap between mechanistic workflows and data-driven models, our diffusion model offers a practical path toward both improved scientific understanding of vegetation–weather interactions and an operational product for future analyses, risk assessment, and scenario exploration.

How to cite: Lima, M., Winkler, A., and Reimers, C.: A Generative Framework for Vegetation–Weather Interactions and Extreme Response Analysis, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-21461, https://doi.org/10.5194/egusphere-egu26-21461, 2026.

Posters virtual: Tue, 5 May, 14:00–18:00 | vPoster spot 2

The posters scheduled for virtual presentation are given in a hybrid format for on-site presentation, followed by virtual discussion on Zoom. Attendees are asked to meet the authors during the scheduled presentation & discussion time for live video chats; onsite attendees are invited to visit the virtual poster sessions at the vPoster spots (equal to PICO spots). If authors uploaded their presentation files, these files are also linked from the abstracts below. The button to access the Zoom meeting appears 15 minutes before the time block starts.
Discussion time: Tue, 5 May, 16:15–18:00
Display time: Tue, 5 May, 14:00–18:00

EGU26-12087 | ECS | Posters virtual | VPS5

Hyperparameter Sensitivity Analysis of Support Vector Machine for Crop Type Classification Using Sentinel-2 NDVI Time Series 

Fatima Ben zhair, Haytam Elyoussfi, Mouad Alami Machichi, Rahma Azamz, Jada El Kasri, Bouchra Boufous, and Salwa Belaqziz
Tue, 05 May, 14:48–14:51 (CEST)   vPoster spot 2

Support Vector Machine (SVM) classifiers are widely used for satellite-based crop mapping, yet hyperparameter tuning is often treated as a black-box process, with limited insight into how individual parameters influence classification performance. This limitation becomes critical when deploying SVM models across heterogeneous agricultural landscapes, where robustness and transferability are required. This study systematically investigates the sensitivity of SVM hyperparameters for crop type discrimination using Sentinel-2 NDVI time series over the Al Haouz plain in central Morocco, a heterogeneous irrigated agricultural region comprising winter cereals and perennial orchards. An exhaustive grid search was conducted across multiple orders of magnitude for the regularization parameter C (0.01–1000) and the RBF kernel coefficient γ (0.001–10). Model performance was evaluated using F1-score, Recall, and Overall Accuracy for six crop classes with contrasting phenological patterns.
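The search space described above can be sketched as a logarithmic grid, together with the RBF kernel whose width γ controls. The decade spacing of the grid, the two NDVI series, and the function names are assumptions for illustration; the abstract gives only the ranges of C and γ, not the step size or the data.

```python
import math


def log_grid(low_exp, high_exp):
    """Powers of ten from 10**low_exp to 10**high_exp, one value per decade."""
    return [10.0 ** e for e in range(low_exp, high_exp + 1)]


def rbf_kernel(x, y, gamma):
    """RBF kernel k(x, y) = exp(-gamma * ||x - y||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


c_values = log_grid(-2, 3)       # 0.01 ... 1000
gamma_values = log_grid(-3, 1)   # 0.001 ... 10
grid = [(c, g) for c in c_values for g in gamma_values]

# Two hypothetical NDVI time series (five dates each) from different fields.
series_a = [0.2, 0.5, 0.8, 0.6, 0.3]
series_b = [0.3, 0.3, 0.4, 0.7, 0.6]

similarities = [rbf_kernel(series_a, series_b, g) for g in gamma_values]
for g, s in zip(gamma_values, similarities):
    print(f"gamma={g:g}  kernel similarity={s:.4f}")
```

As γ grows, the kernel value between distinct series shrinks toward zero, so each training sample influences only its immediate neighbourhood; this is the mechanism by which large γ values tend to produce overly local decision boundaries.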

Results reveal a pronounced asymmetry in hyperparameter influence. The regularization parameter C exhibits a high degree of robustness: once a moderate threshold is reached (C ≥ 1), classification performance stabilizes and remains insensitive to further increases. In contrast, γ shows a narrow optimal range (0.1–1.0), beyond which performance rapidly deteriorates. High γ values induce overfitting, particularly among crops with similar seasonal dynamics, as evidenced by persistent confusion between citrus and olive classes. The optimal configuration (C = 1, γ = 1) achieved an F1-score of 0.80 and an Overall Accuracy of 81%. More importantly, sensitivity analysis demonstrates that γ plays a dominant role in model calibration. These findings provide practical guidance for deploying robust SVM classifiers in data-limited agricultural contexts, where extensive hyperparameter tuning is often impractical.

How to cite: Ben zhair, F., Elyoussfi, H., Alami Machichi, M., Azamz, R., El Kasri, J., Boufous, B., and Belaqziz, S.: Hyperparameter Sensitivity Analysis of Support Vector Machine for Crop Type Classification Using Sentinel-2 NDVI Time Series, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-12087, https://doi.org/10.5194/egusphere-egu26-12087, 2026.

EGU26-20818 | ECS | Posters virtual | VPS5

Agreement measures for continuous, ratio-scale data: cJaccard, cPrecision, cRecall and cF-score 

Katarzyna Krasnodębska, Wojciech Goch, Johannes H. Uhl, Judith A. Verstegen, and Martino Pesaresi
Tue, 05 May, 14:51–14:54 (CEST)   vPoster spot 2

Continuous, spatially explicit estimates of environmental attributes are increasingly provided as gridded data. The accuracy of gridded data, including classifications derived from remotely-sensed data, is typically evaluated using measures based on confusion matrices with site-specific class allocations; however, these measures are defined for categorical variables and are therefore not applicable to ratio-scale attribute estimates representing quantities, such as canopy height or population abundance.

We present an approach that extends commonly used agreement measures, i.e. the Jaccard index, Precision, Recall, and F-score, to non-negative, continuous ratio-scale attributes. The extended measures (cJaccard, cPrecision, cRecall, and cF-score) are viable equivalents to their binary counterparts, invariant to data imbalance and suitable for evaluating the agreement of various types of data representing ratio-scale attribute estimates. The cJaccard measure has proven useful for a range of applications in the geospatial domain, illustrating the broader potential of these measures for evaluating large-scale environmental gridded data products and beyond.

The aim of this contribution is to showcase and discuss the practical application of these continuous agreement measures to real-world gridded datasets representing spatial-environmental variables. Through applied examples, we demonstrate how cPrecision and cRecall enable a directional interpretation of disagreement, disentangling commission and omission errors in the total proportion of misallocated magnitudes. We further illustrate how cJaccard provides a bounded, scale-independent measure of agreement that complements typically used error-based measures (such as Mean Absolute Error or Root Mean Square Error) in the data comparison process.
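One way to picture such continuous measures is to replace the set intersection with the cell-wise minimum and the set union with the cell-wise maximum. The sketch below follows that construction (often called the weighted or Ružička form of the Jaccard index); the abstract gives no formulas, so treating this as the definition of cJaccard, cPrecision, cRecall and cF-score is an assumption, and the authors' definitions may differ. The function name and the example values are hypothetical.

```python
def continuous_agreement(estimate, reference):
    """Min/max-based agreement for non-negative, ratio-scale values.

    Returns a dict with Jaccard-, precision-, recall- and F-score-like measures.
    """
    if len(estimate) != len(reference):
        raise ValueError("inputs must have the same length")
    if any(v < 0 for v in estimate) or any(v < 0 for v in reference):
        raise ValueError("values must be non-negative")
    total_est = sum(estimate)
    total_ref = sum(reference)
    if total_est == 0 or total_ref == 0:
        raise ValueError("both inputs need a positive total")

    overlap = sum(min(e, r) for e, r in zip(estimate, reference))  # agreed magnitude
    union = sum(max(e, r) for e, r in zip(estimate, reference))

    precision = overlap / total_est   # 1 - share of estimated magnitude in excess
    recall = overlap / total_ref      # 1 - share of reference magnitude missed
    f_score = 2 * overlap / (total_est + total_ref)  # harmonic mean of the two
    return {
        "jaccard": overlap / union,
        "precision": precision,
        "recall": recall,
        "f_score": f_score,
    }


# Hypothetical gridded values, e.g. canopy height per cell.
estimate = [0.0, 2.0, 4.0, 6.0]
reference = [1.0, 2.0, 3.0, 0.0]
scores = continuous_agreement(estimate, reference)
print(scores)

# With 0/1 inputs the measures reduce to their binary counterparts.
binary = continuous_agreement([1, 1, 0, 0], [1, 0, 1, 0])
print(binary)
```

In this construction, a low precision-like score with a high recall-like score points to overestimated magnitudes (commission), and the reverse pattern to missed magnitudes (omission).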

How to cite: Krasnodębska, K., Goch, W., Uhl, J. H., Verstegen, J. A., and Pesaresi, M.: Agreement measures for continuous, ratio-scale data: cJaccard, cPrecision, cRecall and cF-score, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-20818, https://doi.org/10.5194/egusphere-egu26-20818, 2026.
