ESSI1.7 | Strategies and Applications of AI and ML in a Spatiotemporal Context
EDI PICO
Co-organized by GI2
Convener: Jens Klump | Co-conveners: Hanna Meyer, Christopher Kadow, Ge Peng, Jeremy Rohmer
PICO | Wed, 06 May, 16:15–18:00 (CEST) | PICO spot 2
Modern challenges in climate risk management, disaster response, public health, resource management, and logistics demand robust spatiotemporal analysis of increasingly complex geospatial datasets. Recent studies, however, highlight significant challenges when applying ML and AI to spatial and spatiotemporal data along the entire modelling pipeline, including reliable accuracy assessment, model interpretation, transferability, and uncertainty assessment. In response, new spatiotemporally aware strategies and methods have been developed that promise to improve spatiotemporal predictions, the treatment of cascading uncertainties, decision-making, and the communication of results.

This session focuses on the strategic integration and application of artificial intelligence (AI) and machine learning (ML) to address these challenges. We welcome contributions that explore novel methods, software tools, and infrastructures designed to improve spatiotemporal predictions, manage cascading uncertainties, and support decision-making. Emphasis will be placed on interpretability, transferability, and reliability across the modelling pipeline, as well as on the communication of results to diverse stakeholders. Case studies, theoretical advances, and cross-disciplinary approaches are encouraged.


PICO presentations are given in a hybrid format supported by a Zoom meeting featuring on-site and virtual presentations. The button to access the Zoom meeting appears 15 minutes before the time block starts.
16:15–16:20
16:20–16:22 | PICO2.1 | EGU26-2950 | ECS | On-site presentation
Yu-Yun Hsu, WeiCheng Lo, Jhe-Wei Lee, and Chih-Tsung Huang

Land subsidence has long been a critical environmental hazard along the southwestern coast of Taiwan, with Yunlin County being one of the most severely affected areas. In this study, Long Short-Term Memory (LSTM) neural networks are employed to develop predictive models for land subsidence. Cumulative land subsidence, groundwater-level variations, and lithological layering are considered as input features to investigate the predictive performance of the models from both temporal and spatial perspectives.

As long-term groundwater monitoring data often suffer from missing values, this study further introduces a Conditional Wasserstein Generative Adversarial Imputation Network with Gradient Penalty (CWGAIN-GP) to impute missing groundwater-level data, thereby improving the stability and completeness of subsequent prediction models. Artificial masking experiments were conducted, including continuous missing periods ranging from one month to one year and random removal of 10%–50% of the data. The results show that the average Nash–Sutcliffe efficiency (NSE) achieved by the imputation model reaches 0.897.
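For reference, the Nash–Sutcliffe efficiency (NSE) used to score the imputation can be computed as below (a minimal sketch; the function name and example values are ours, not the authors'):

```python
import numpy as np

# NSE = 1 - sum((obs - sim)^2) / sum((obs - mean(obs))^2);
# 1.0 means a perfect match, 0.0 means no better than the observed mean.
def nse(obs, sim):
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    return 1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2)
```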

For temporal prediction, the land subsidence model is trained using different training lengths (one year and seven years) and variable combinations to forecast cumulative land subsidence over the following one to two years. The most recent six months of observations are used as input to predict the monthly land subsidence increment. The results indicate that longer training periods and more comprehensive input variables lead to improved model performance. The coefficient of determination (R²) for the first prediction year reaches 0.945, while for the second year—under conditions of three consecutive months of missing data—the R² remains as high as 0.923.

For spatial prediction, a multi-station training and single-station validation strategy is adopted. When predicting a target station, the three nearest neighboring stations are selected, and their observations from the most recent three months are used as inputs to predict the monthly land subsidence increment at the target station. This increment is then combined with the known cumulative subsidence from the previous month to estimate the current cumulative subsidence. The results show that the average R² for single-month predictions reaches 0.966. Even when cumulative subsidence is estimated iteratively by adding predicted monthly increments over six consecutive months, the average R² remains around 0.90, demonstrating strong spatial generalization capability of the proposed model.
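The increment-accumulation scheme described above can be sketched as follows (function name and the illustrative values are ours):

```python
# Iterative cumulative-subsidence estimation: each predicted monthly
# increment is added to the running cumulative value, starting from the
# last known observation.
def accumulate(last_known_cumulative, predicted_increments):
    cumulative = []
    total = last_known_cumulative
    for inc in predicted_increments:
        total += inc
        cumulative.append(total)
    return cumulative

# e.g. six months of predicted increments (cm) from a known starting value
six_months = accumulate(-12.0, [-0.4, -0.5, -0.3, -0.6, -0.4, -0.5])
```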

Fig. 1 Monthly vertical profiles of cumulative land subsidence at different depths for the Huwei (MW_HWES) station in 2021.

Overall, this study demonstrates that cumulative land subsidence can be effectively predicted by integrating temporally and spatially informed LSTM models with vertically stratified hydrogeological information. Although cumulative subsidence is used as the primary prediction target, the inclusion of groundwater-level variations and lithological layering enables the model to capture the vertical characteristics of aquifer systems and their influence on subsidence processes. The results highlight the importance of incorporating stratified subsurface information when modeling land subsidence and provide a robust framework for spatiotemporal subsidence prediction under realistic data availability constraints.

How to cite: Hsu, Y.-Y., Lo, W., Lee, J.-W., and Huang, C.-T.: Predicting Cumulative Land Subsidence and Its Spatiotemporal Relationship Using Machine Learning, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-2950, https://doi.org/10.5194/egusphere-egu26-2950, 2026.

16:22–16:24 | PICO2.2 | EGU26-5255 | On-site presentation
Xiaohui Yu, Linshu Hu, Cheng Su, Yiming Yan, Sensen Wu, and Zhenhong Du

Soil moisture downscaling is a challenging geospatial regression task that requires accurately capturing complex spatiotemporal relationships across scales. In this study, we conduct a preliminary applicability assessment of denoising diffusion probabilistic models (DDPMs) for continuous-value geospatial regression, exploring the potential of generative modeling frameworks for soil moisture downscaling. The model learns the relationships between coarse-resolution soil moisture observations and multi-source auxiliary features, enabling the generation of high-resolution soil moisture estimates.

During training, the model uses 36 km resolution satellite soil moisture data and conditions on auxiliary variables, including normalized difference vegetation index (NDVI), land surface temperature, surface albedo, precipitation, and digital elevation model (DEM). A conditional embedding strategy is introduced to incorporate temporal information, spatial location information, and in-situ statistics into the diffusion network via feature-wise linear modulation (FiLM), enhancing the model’s ability to capture complex spatiotemporal structures while maintaining stability. During inference, a two-stage “generation–correction” pipeline is employed: high-resolution (1 km) auxiliary features are first used to generate initial predictions through the diffusion model, which are subsequently bias-corrected using in-situ station data.

The applicability assessment combines quantitative and qualitative evaluation. Quantitative metrics include unbiased mean squared error (UMSE), root mean square error (RMSE), mean absolute error (MAE), and R², while qualitative evaluation focuses on spatial pattern consistency and temporal trend representation. Experimental results indicate that the diffusion-based generative model produces reasonable, spatially coherent, high-resolution soil moisture results and successfully captures major temporal variations. These findings demonstrate the applicability of generative frameworks for geospatial regression and their potential as a geospatial regression modeling paradigm, providing a foundation for further refinement and evaluation.
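A minimal sketch of the "correction" stage of the pipeline, assuming a simple mean-bias adjustment against in-situ stations (the actual pipeline may use a spatially varying correction; names are ours):

```python
import numpy as np

# Subtract the mean bias between the generated predictions at station
# locations and the in-situ observations from the whole predicted grid.
def bias_correct(pred_grid, station_pred, station_obs):
    bias = np.mean(station_pred - station_obs)  # mean over stations
    return pred_grid - bias

grid = np.array([[0.30, 0.32], [0.28, 0.35]])          # generated 1 km field
corrected = bias_correct(grid, np.array([0.31, 0.29]),  # model at stations
                         np.array([0.28, 0.27]))        # in-situ observations
```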

How to cite: Yu, X., Hu, L., Su, C., Yan, Y., Wu, S., and Du, Z.: Long-term Soil Moisture Downscaling Based on Diffusion Models: Applicability Assessment of Generative Models for Geospatial Regression Tasks, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-5255, https://doi.org/10.5194/egusphere-egu26-5255, 2026.

16:24–16:26 | PICO2.3 | EGU26-8056 | ECS | On-site presentation
Corinna Perchtold, Jeremy Rohmer, Augustin Thomas, Julie Lions, and Martin Wieskotten

This study presents a comprehensive sensitivity analysis framework to disentangle the drivers of predictive uncertainty in spatial interpolation and how they ultimately affect spatial predictions. Developed within a Global Sensitivity Analysis context, the proposed approach is model-independent and generic, allowing for broad application across diverse spatial interpolation workflows.

The framework is demonstrated using groundwater sulfate concentrations in the Paris Basin, a dataset characterised by sparse and highly clustered sampling across six distinct aquifers according to the French "BD LISA" hydrogeological system (https://bdlisa.eaufrance.fr/). We represent the underlying spatial process as a Gaussian Random Field, leveraging Integrated Nested Laplace Approximations for computationally efficient Bayesian inference. This allows for a probabilistic treatment of uncertainty even within complex spatial structures.

We systematically evaluate the impact of several key uncertainty factors related to both data and model configuration: (1) the number of monitoring stations and their spatial distribution; (2) the selection of the environmental covariates and the functional form of their effects (linear vs. non-linear); (3) the treatment of censored data (values below detection limits); and (4) structural assumptions regarding the spatial covariance function, specifically the estimation of variogram hyperparameters such as range, sill, and nugget effects and their prior specification. By propagating these uncertainty sources through our framework, we derive domain-wide aggregated sensitivity measures. These metrics quantify how specific data topologies—including sampling density, clustering effects, and censoring rates—govern the stability and accuracy of the resulting spatial interpolations.
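For orientation, a standard aggregated sensitivity measure of the kind used in Global Sensitivity Analysis, the first-order index Var(E[Y|X])/Var(Y), can be estimated from a balanced design over one uncertainty factor (a toy sketch; function name and setup are ours, not the authors' implementation):

```python
import numpy as np

# First-order sensitivity index for one factor: the share of output
# variance explained by varying that factor alone, estimated from the
# conditional means over a balanced design.
def first_order_index(levels, outputs):
    levels = np.asarray(levels)
    y = np.asarray(outputs, dtype=float)
    cond_means = [y[levels == lv].mean() for lv in np.unique(levels)]
    return float(np.var(cond_means) / np.var(y))
```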

Finally, the results facilitate an in-depth discussion on the limitations of purely probabilistic methods in data-poor scenarios. We provide an outlook on the potential of extra-probabilistic approaches, such as imprecise or interval-based kriging, to more robustly address the wide range of epistemic uncertainties inherent in environmental monitoring.

We acknowledge financial support of the French National Research Agency within the HOUSES project (grant N°ANR-22-CE56-0006).

How to cite: Perchtold, C., Rohmer, J., Thomas, A., Lions, J., and Wieskotten, M.: Global Sensitivity Analysis of Spatial Interpolation for Sparse, Clustered, and Censored Data: A Case Study of Groundwater Sulfate in the Paris Basin, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-8056, https://doi.org/10.5194/egusphere-egu26-8056, 2026.

16:26–16:28 | PICO2.4 | EGU26-8402 | ECS | On-site presentation
Chih-Chi Wang and Peng Luo

Predicting building attributes—such as functional classification, socioeconomic status, and energy efficiency—is a fundamental task in urban science. The current paradigm involves leveraging domain knowledge to extract attribute-specific morphological or topological features for supervised modeling. However, this heavy reliance on manual feature engineering often leads to task-specific models where features must be redefined for each attribute. Consequently, the field lacks a unified, generalizable framework capable of multi-attribute building prediction.

Inspired by recent advances in Regression Language Models (RLMs), which cast continuous prediction as a text-to-text task, we propose Buildings as Text (BaT). BaT serializes structured building representations (e.g., GeoJSON) into raw text and enables end-to-end text-to-text regression. To mitigate the spatial sensitivity of building data, we introduce a Topology-Preserved Coordinate (TPC) strategy that removes each building text’s absolute positional information. Specifically, TPC applies a global coordinate shift to the serialized geometry, suppressing absolute-location bias while preserving local shape and topology. By operating directly on raw text, BaT eliminates manual feature engineering and allows the model to learn a “spatial syntax” from the underlying geometric descriptions.
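A minimal sketch of the TPC idea as described: a global coordinate shift removes absolute position while local shape and topology survive serialization (the exact shift and text format are not specified in the abstract; the ones here are ours):

```python
# Shift a polygon so its minimum corner sits at (0, 0), suppressing
# absolute-location information while preserving shape and topology.
def tpc_shift(polygon):
    """polygon: list of (x, y) tuples."""
    min_x = min(x for x, _ in polygon)
    min_y = min(y for _, y in polygon)
    return [(round(x - min_x, 6), round(y - min_y, 6)) for x, y in polygon]

# Serialize the shifted coordinates as raw text for a text-to-text model.
def serialize(polygon):
    return " ".join(f"{x},{y}" for x, y in tpc_shift(polygon))
```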

We validated the BaT framework through a case study on informal settlement (slum) classification. The results demonstrate that our model achieves superior performance and higher adaptability compared to traditional morphology-based methods. While validated on slum detection, this research offers a universal and scalable paradigm for urban building analysis, suggesting that Large Language Models can effectively "read" urban forms for diverse prediction tasks beyond specific domains.

How to cite: Wang, C.-C. and Luo, P.: Buildings as Text: A Universal Regression Paradigm for Building Attribute Prediction, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-8402, https://doi.org/10.5194/egusphere-egu26-8402, 2026.

16:28–16:30 | PICO2.5 | EGU26-11193 | ECS | On-site presentation
Ana Sofia Meneses Pineda, Marco Solinas, Marco Ramazzotti, Massimo Musacchio, and Maria Fabrizia Buongiorno

The archaeological landscapes of northern Oman host thousands of funerary monuments of different periods and morphologies, forming one of the densest and least explored burial regions of the Arabian Peninsula. Within the framework of LAA&AAS (Laboratorio di Archeologia Analitica e Sistemi Artificiali Adattivi) and MASPAG (Missione Archeologica della Sapienza nella Penisola Arabica e nel Golfo), a multidisciplinary project supported by Sapienza University of Rome and the Italian Ministry of Foreign Affairs, we developed a reproducible geo-AI workflow to classify and analyse funerary structures based on remote-sensing and spatial-context information.

The first dataset, encompassing 185 tombs mapped in the Southwestern Cemetery near the village of Muslimat, in the region of Wadi al-Maʿawil (ca. 70 km southwest of Muscat), was used to test a machine-learning pipeline designed to discriminate between morphological classes (“tombs” vs “non-tombs”, and within-type subclasses) from high-resolution satellite imagery and derived spatial metrics. Two Random Forest models were compared: a geometry-only baseline using shape descriptors (area, compactness, circularity, elongation), and an extended model incorporating spatial-context features such as kernel density, nearest-neighbour distances, local Moran’s I autocorrelation, and cluster membership. The integration of these contextual descriptors increased overall accuracy from 59% to 76%, improving model reliability and reducing false positives in morphologically ambiguous contexts. The workflow includes systematic feature-importance analysis and confusion-matrix evaluation to assess interpretability and class-imbalance effects.
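The geometry-only descriptors named above can be computed with standard formulas, sketched here (these are common definitions and may differ from the authors' exact ones):

```python
import math

# Shape descriptors for a mapped structure; compactness is 1.0 for a
# perfect circle, elongation is 1.0 when both axes are equal.
def shape_descriptors(area, perimeter, major_axis, minor_axis):
    compactness = 4 * math.pi * area / perimeter ** 2
    elongation = minor_axis / major_axis
    return {"area": area, "compactness": compactness, "elongation": elongation}
```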

Beyond the single-site test case, this approach aims to address a broader spatiotemporal challenge: learning and transferring morphological–contextual patterns across different archaeological regions. During the 2025 field campaign (20 October – 20 December 2025), more than 500 new tombs were surveyed and georeferenced in the area of the Western Cemetery, expanding the available dataset and enabling large-scale testing of model scalability and transferability. This new phase will assess whether models trained in Wadi al-Maʿawil can generalize to nearby valleys with comparable geomorphological and cultural settings, supporting semi-automated mapping and predictive modelling of funerary features.

The presented pipeline, implemented in an open-source environment (Python, QGIS, and scikit-learn), is designed for reproducibility and transparent parameter tracking. All processing steps—from data preparation and feature extraction to model training and evaluation—are logged and versioned, facilitating cross-project reuse. The workflow thus bridges archaeological and geospatial domains, demonstrating how spatially aware machine learning can improve the detection, classification, and interpretation of complex cultural landscapes.

This contribution highlights the potential of AI and ML in managing spatiotemporal archaeological data and in advancing reproducible analytical frameworks. The methodological approach developed for the Omani funerary landscapes can be generalized to other MASPAG regions, supporting comparative analysis of desert landscapes and long-term dynamics of human–environment interaction across the Arabian Peninsula.

How to cite: Meneses Pineda, A. S., Solinas, M., Ramazzotti, M., Musacchio, M., and Buongiorno, M. F.: A preliminary study of the morphology and spatial distribution of funerary elements in Oman, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-11193, https://doi.org/10.5194/egusphere-egu26-11193, 2026.

16:30–16:32 | PICO2.6 | EGU26-11342 | On-site presentation
Jakub Nowosad, Hanna Meyer, and Jonas Schmidinger

Understanding the spatial dependence of residuals is important for interpreting and diagnosing spatial machine learning models. Spatial autocorrelation in the residuals suggests that the model has not fully captured the data's spatial structure. This may imply that the model is missing crucial spatial context or interactions, and that, in effect, it is spatially biased, leading to underestimation in some areas and overestimation in others.

Moran's I is a commonly used statistic for the diagnosis of spatial autocorrelation in spatial predictions, providing a single-value quantitative measure with a straightforward interpretation. This measure quantifies the degree of spatial autocorrelation, indicating whether similar values are clustered together or dispersed across space. The information provided by Moran's I has been used in various ways in studies applying machine learning: to evaluate model performance, interpret results, understand model limitations, and compare different modeling approaches.

Unlike standard model performance metrics, such as R² or RMSE, Moran's I depends not only on the values of residuals but also on the spatial context, especially the study area's extent, the sampling strategy used, and the specification of spatial weights. However, there is a lack of a comprehensive understanding of how these factors influence the results of Moran's I calculation in the context of spatial machine learning, and of how to best use this measure for model evaluation and comparison.

Using simulated data with controlled spatial properties, we investigated how testing-set size, sampling strategy, and the specification of spatial weights influence Moran's I computed on model residuals. Our results show that Moran's I, calculated based on a k-nearest neighbors approach, primarily reflects the spatial structure of values in the testing set rather than the residual autocorrelation across the full prediction domain, often underestimating fine-scale spatial patterns. These findings have various implications: weight-matrix definitions must be clearly reported, calculations on sparsely distributed or clustered samples should be avoided, Moran's I is generally not directly comparable across studies due to differences in spatial extents and sampling, and its values are inherently scale-dependent.
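For reference, Moran's I of residuals with a row-standardized k-nearest-neighbour weight matrix, the setting examined here, can be computed as follows (a minimal NumPy sketch, not the authors' code):

```python
import numpy as np

# Moran's I = (n / sum(W)) * (z' W z) / (z' z), with z the centred
# residuals and W a row-standardized kNN weight matrix.
def morans_i(coords, residuals, k=3):
    coords = np.asarray(coords, dtype=float)
    z = np.asarray(residuals, dtype=float)
    z = z - z.mean()
    n = len(z)
    # pairwise distances; +inf on the diagonal excludes self-neighbours
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)
    w = np.zeros((n, n))
    for i in range(n):
        w[i, np.argsort(d[i])[:k]] = 1.0 / k  # row-standardized kNN weights
    return float((n / w.sum()) * (z @ w @ z) / (z @ z))
```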

With this contribution, we aim to present the behavior of Moran's I calculated from residuals of spatial machine learning models under different conditions, outline best practices for selecting and reporting spatial weights, and discuss how to interpret Moran’s I. 

How to cite: Nowosad, J., Meyer, H., and Schmidinger, J.: Using Moran's I for assessing residual spatial autocorrelation in machine learning models, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-11342, https://doi.org/10.5194/egusphere-egu26-11342, 2026.

16:32–16:34 | PICO2.7 | EGU26-12275 | ECS | On-site presentation
Darius A. Görgen, Simon Heilig, Lara Meyn-Grünhagen, Asja Fischer, Johannes Lederer, and Hanna Meyer

Machine learning methods are used ubiquitously within the Earth Sciences to model spatio-temporal phenomena. These methods scale very well to big data sets and are used to model complex non-linear relationships between the predictor and outcome variables. Yet, most methods might silently fail when used in extrapolation scenarios, e.g. when combinations of predictor variables are encountered that have not been seen during training. This might be the case when the model is applied to new geographic areas that differ from the areas the model was trained on. For traditional machine learning models, estimating the area of applicability based on distances in the predictor space has been proposed. New inputs with distances above a certain threshold are rejected from prediction since our confidence in the model's output is low and we do not expect the estimated performance to hold.

Inspired by the success of deep architectures in the field of computer vision, the use of deep neural networks has been steadily increasing, especially in Earth Observation. Translating the concept of the area of applicability to deep architectures, however, remains an open research challenge. For the safe deployment of such models in the real world, it is necessary to flag inputs for which we expect the model to extrapolate and thus to operate outside the estimated performance measure.

In this work, we extend the concept of the area of applicability to deep neural network architectures. As an application rooted in current practices for Earth Observation, we use networks trained end-to-end for scene classification. We use these models as feature extractors to obtain representations of input samples in embedding space. We derive the area of applicability of the model within this space based on distances between training and calibration samples. For this purpose, we test different distance measures (Euclidean, Mahalanobis), leverage the concept of kNN distances, which also takes local point densities into account, and test whether principal components of the embeddings improve the delineation of the area of applicability.
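The distance-based delineation described above can be sketched as follows (an assumed form with Euclidean kNN distances and a quantile threshold; function names and the calibration quantile are ours, not the authors'):

```python
import numpy as np

# Mean Euclidean distance from one embedding to its k nearest training
# embeddings.
def knn_distance(x, train, k=5):
    d = np.sort(np.linalg.norm(np.asarray(train, float) - np.asarray(x, float), axis=1))
    return float(d[:k].mean())

# Calibrate a rejection threshold as a quantile of kNN distances over
# held-out calibration samples.
def aoa_threshold(calib, train, k=5, q=0.95):
    return float(np.quantile([knn_distance(c, train, k) for c in calib], q))

# Flag inputs whose embedding lies too far from the training manifold.
def inside_aoa(x, train, threshold, k=5):
    return knn_distance(x, train, k) <= threshold
```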

Our results highlight practically relevant trade-offs between different distance metrics operating in high-dimensional embedding spaces for deriving the area of applicability of deep neural networks. The methodology presented can serve as a baseline for ensuring the reliability of deployed models in safety-critical applications.

How to cite: Görgen, D. A., Heilig, S., Meyn-Grünhagen, L., Fischer, A., Lederer, J., and Meyer, H.: Area of Applicability for Deep Learning: Exploring Latent Space Geometry of Earth Observation Models, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-12275, https://doi.org/10.5194/egusphere-egu26-12275, 2026.

16:34–16:36 | EGU26-16001 | ECS | Virtual presentation
Enhancing Near-Real-Time Forest Monitoring: Foundation Models and Harmonized Landsat-Sentinel (HLS) Time Series for Selective Logging Detection (withdrawn)
Evandro Taquary and Luiz Aragão
16:36–16:38 | PICO2.9 | EGU26-19025 | ECS | On-site presentation
Florencio Campomanes V, Monika Kuffer, Alfred Stein, Anne M. Dijkstra, Lorraine Trento Oliveira, and Mariana Belgiu

The integration of Earth Observation (EO) data with machine learning (ML) has transformed the mapping of Deprived Urban Areas (DUA). Despite these technical advances, a persistent disconnect remains between research outputs and their operational uptake by local stakeholders. In parallel, advances in ML and deep learning (DL), together with new satellite missions, have improved the extraction of building footprints and urban morphology. Nevertheless, DUA mapping studies, which largely depend on these physical indicators, often prioritize benchmark performance over the robustness, transparency, or usability required in real-world decision-making contexts. One of the main reasons for this gap is spatial data quality (SDQ), which fundamentally limits model performance and generalization. When data quality is poor, due to inaccuracies, incompleteness, or inadequate provenance, models become unreliable, regardless of architectural complexity. Furthermore, many studies rely on validation strategies that ignore spatial autocorrelation, thereby yielding overoptimistic accuracy estimates that mask poor generalization to new local contexts.
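Validation that ignores spatial autocorrelation is commonly addressed with spatial block cross-validation, in which held-out folds are formed from spatially contiguous blocks rather than random points. A minimal sketch of the block assignment (our own illustration, not taken from the reviewed studies):

```python
import numpy as np

# Assign each point to a square spatial block; cross-validation folds can
# then be drawn per block so that nearby (autocorrelated) points do not
# land in both train and test sets.
def block_ids(coords, block_size):
    coords = np.asarray(coords, float)
    bx = np.floor(coords[:, 0] / block_size).astype(int)
    by = np.floor(coords[:, 1] / block_size).astype(int)
    return bx * 100000 + by  # unique id per block (assumes |by| < 100000)
```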

To address these challenges, this paper argues for a shift toward a systematic assessment of spatial data quality. We first conduct a scoping review of 50 state-of-the-art DUA mapping studies published between 2017 and 2025. Our analysis reveals a high dependence on very-high-resolution imagery (72%), a widespread lack of publicly accessible data and code (92%), and a critical deficiency in operationalizing semantic definitions of DUAs, with 90% of studies failing to provide mapping rules (for visual interpretation) or ground rules (for in-situ collection). Most studies also fail to assess user needs (90%) or do not consider the ethical implications of using DUA data (88%), which is highly sensitive due to risks such as forced evictions. Building on these findings and established international standards from ISO and the OGC, we propose a comprehensive Spatial Data Quality (SDQ) framework tailored to transparently document supervised image classification in DUA mapping. This framework integrates established practices, such as adherence to the Findable, Accessible, Interoperable, Reusable (FAIR) principles and assessment of acquisition, measurement, and spatial-temporal quality, with novel dimensions addressing semantic consistency, sampling representativeness, human factors in annotation, learning-shortcut risk, user-needs validity, ethical considerations, and transparent reporting of the dataset’s potential failure modes or uncertainties. By operationalizing SDQ as a living, extensible framework, this work aims to better align advances in ML and DL with sustained societal impact, ensuring that DUA mapping products, and those of related application domains, are fit for use by local communities and decision-makers.

How to cite: Campomanes V, F., Kuffer, M., Stein, A., Dijkstra, A. M., Trento Oliveira, L., and Belgiu, M.: A framework for assessing the quality of spatial data applied in supervised image classification of deprived urban areas, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-19025, https://doi.org/10.5194/egusphere-egu26-19025, 2026.

16:38–16:40 | PICO2.10 | EGU26-19057 | ECS | On-site presentation
Said El hachemy, Chaima Aglagal, Hamza Ait-Ichou, Ilham Elhaid, Jawad Zlaiga, Mohammed Hssaisoune, Lhoussaine Bouchaou, and Salwa Belaqziz

Greenhouse agriculture has become a crucial element of agricultural practice in Morocco, yet its spatial and temporal evolution remains insufficiently quantified. This study aims to map greenhouse structures at the scale of the Souss-Massa region in order to assess the progress of covered agriculture and examine its relationship with socio-economic development in Morocco. Using hand-annotated greenhouse data from the Chtouka region as ground truth, we develop a deep learning–based detection framework relying exclusively on open-source tools. Multispectral Sentinel-2 satellite imagery at 10 m spatial resolution is used as input to a U-Net convolutional neural network, which is trained, validated, and tested for greenhouse segmentation. The proposed model achieves an overall accuracy of up to 94%, demonstrating strong generalization capability. The resulting plug-and-play methodology enables scalable, cost-effective, and open-source greenhouse mapping, and provides valuable insights into the dynamics of covered agriculture and its role in Morocco’s agricultural and socio-economic development.

How to cite: El hachemy, S., Aglagal, C., Ait-Ichou, H., Elhaid, I., Zlaiga, J., Hssaisoune, M., Bouchaou, L., and Belaqziz, S.: Satellite imagery for greenhouse mapping in Morocco using U-net model, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-19057, https://doi.org/10.5194/egusphere-egu26-19057, 2026.

16:40–16:42 | PICO2.11 | EGU26-19452 | ECS | On-site presentation
Negar Siabi, Rackhun Son, Maik Thomas, Christopher Irrgang, and Jan Saynisch Wagner

Accurate forecasting of vector-borne diseases such as dengue is often challenged by limited and noisy spatiotemporal data. This study evaluates the effectiveness of data augmentation techniques in enhancing the robustness and predictive accuracy of machine learning models. We assess multiple augmentation strategies applied to weekly dengue case data across countries in South and Central America (2014–2022). Results show that augmentation substantially improves short-term forecasting performance, particularly in regions with sparse or irregular observations, yielding higher R² values and lower relative errors compared to non-augmented baselines. These findings demonstrate that well‑designed augmentation can mitigate data scarcity and strengthen the generalization of graph‑based deep learning frameworks for epidemiological forecasting. Overall, the study highlights augmentation as a practical and scalable approach for improving spatiotemporal ML applications in disease surveillance.
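Two common time-series augmentation strategies of the kind evaluated in such studies can be sketched as follows (the authors' exact strategies are not specified in the abstract; this is an illustration with our own names):

```python
import numpy as np

rng = np.random.default_rng(0)

# Jittering: add small Gaussian noise to a weekly case-count series,
# producing additional plausible training samples.
def jitter(series, sigma=0.05):
    return series + rng.normal(0.0, sigma, size=len(series))

# Window slicing: train on random contiguous sub-windows of the series,
# which multiplies the effective number of training sequences.
def window_slice(series, length):
    start = rng.integers(0, len(series) - length + 1)
    return series[start:start + length]
```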

How to cite: Siabi, N., Son, R., Thomas, M., Irrgang, C., and Saynisch Wagner, J.: Improving Dengue Forecasting with Spatiotemporal Data Augmentation and Machine Learning, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-19452, https://doi.org/10.5194/egusphere-egu26-19452, 2026.

16:42–16:44 | PICO2.12 | EGU26-19669 | ECS | On-site presentation
João Gabriel Vinholi, Rim Sleimi, Florian Werner, and Albert Abelló

At continental scale, crop classification needs models that capture phenology through temporal analysis without degrading field boundaries. We introduce a decoupled architecture that uses static foundation-model features across multi-sensor time series and fuses them with high-resolution spatial features. The temporal stream ingests paired multispectral and SAR sequences plus a static DEM and metadata, extracts foundation-model token features per timestep, and compresses them with a Perceiver-style bottleneck that cross-attends from a fixed latent bank to the full foundation-model token volume. This heavy compression collapses sequence length by orders of magnitude, allowing longer temporal windows and larger batches to fit within consumer-grade GPU memory while preserving the temporal signatures needed to separate crops with similar single-date appearance.
The spatial stream stays purely static: it selects a single high-quality multispectral reference frame and passes it through a high-resolution backbone to retain fine geometry and crisp boundaries. The two streams are joined in a query-based decoder, where dynamic queries generated from the compressed temporal latents attend to multi-scale spatial features, aligning phenological signatures with precise field edges. This fusion mechanism prevents coarse temporal features from blurring geometry and makes delineation robust to shifts in timing or crop management practice. Temporal queries encode crop-specific growth signatures, the spatial stream supplies the pixel-level evidence for boundary localization, and the decoder enforces instance-aware segmentation through iterative cross-attention and masked refinement.
We evaluate on EuroCrops crop-class labels, achieving a Micro Recall of 84.1% and a Segmentation Quality of 84.2%. Transferability is tested with a spatial holdout protocol using geographically disjoint train/test regions; reliability is summarized by aggregate metrics on these strict splits; and uncertainty is communicated through per-class performance variability and label-noise sensitivity analyses that bound achievable scores.
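The Perceiver-style bottleneck described above, a small fixed latent bank cross-attending to a much larger token volume, can be sketched in plain NumPy (dimensions are illustrative; a single attention head without learned projections, unlike the real architecture):

```python
import numpy as np

# Cross-attention from a fixed latent bank to the full token volume:
# each latent forms a softmax-weighted summary of all tokens, so the
# output keeps the latent-bank size regardless of sequence length.
def cross_attend(latents, tokens):
    """latents: (L, d); tokens: (N, d) with N >> L. Returns (L, d)."""
    scores = latents @ tokens.T / np.sqrt(latents.shape[1])
    scores -= scores.max(axis=1, keepdims=True)  # numerical stability
    attn = np.exp(scores)
    attn /= attn.sum(axis=1, keepdims=True)      # softmax over tokens
    return attn @ tokens

tokens = np.random.default_rng(1).normal(size=(4096, 64))  # token volume
latents = np.zeros((16, 64))                               # fixed latent bank
compressed = cross_attend(latents, tokens)                 # (16, 64)
```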

How to cite: Vinholi, J. G., Sleimi, R., Werner, F., and Abelló, A.: A Two‑Stream Spatiotemporal Architecture with Foundation‑Model Features Applied to Crop Classification, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-19669, https://doi.org/10.5194/egusphere-egu26-19669, 2026.

16:44–16:46
|
PICO2.13
|
EGU26-20221
|
ECS
|
On-site presentation
James Okemwa Ondieki, Matthes Rieke, and Simon Jirka

Spatial Data Infrastructures (SDIs) aggregate large volumes of spatial data from many organizations and data producers. Metadata is intended to enable the discovery of these data, yet finding the relevant datasets can be challenging. The challenges include rigid keyword search, complex search interfaces in geoportals, map-based search that requires prior geographic knowledge, and language differences between user queries and the metadata.

The development of Large Language Models (LLMs) offers new opportunities to improve spatial data discovery. LLMs demonstrate strong language understanding and generation capabilities and have been used in information retrieval tasks. They can overcome semantic differences and language barriers between user queries and the needed information. However, their internal knowledge is limited and they are prone to hallucinations. Unless the datasets in SDIs, or the web pages describing them, are indexed by search engines, LLMs with internet search tools cannot find them.

Retrieval-Augmented Generation (RAG) offers a solution to these knowledge limitations by connecting an LLM to an external, up-to-date knowledge base. However, RAG mainly operates in the textual domain and excels at retrieving external information that is semantically relevant to a user query. Queries for geographic data have a spatial aspect, yet the spatial reasoning capabilities of LLMs are limited. For a query like “forest data for Vienna”, RAG can identify the relevant forest data from a pool of metadata, regardless of the language or words used to describe the data. Identifying datasets that meet the spatial intent, however, is a problem. DCAT, the most popular metadata standard, defines the spatial extent of datasets using bounding-box coordinates or links to gazetteers. Naive RAG relies on semantic similarity alone: an LLM can identify “Vienna” as a location but struggles to identify datasets relevant to that location, as there is little semantic similarity between a place name and coordinate values or gazetteer links. There is thus a need to incorporate spatial indexing techniques for improved spatial reasoning.

In this contribution, we present an approach that combines LLMs, RAG, and spatial indexing techniques to overcome existing challenges in discovering spatial data in SDIs and to improve spatial data discovery through natural-language queries.
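The combination of semantic retrieval and a spatial test can be sketched as follows. This is a toy illustration under stated assumptions, not the authors' system: the dataset records, placeholder embeddings, the `bboxes_intersect` helper, and the resolved Vienna bounding box are all hypothetical, and a real system would use learned text embeddings and a proper spatial index rather than a linear bbox scan.

```python
import numpy as np

# Toy DCAT-style metadata: bounding boxes as (min_lon, min_lat, max_lon, max_lat)
datasets = [
    {"title": "Forest inventory Austria", "bbox": (9.5, 46.4, 17.2, 49.0)},
    {"title": "Forest cover Finland",     "bbox": (20.5, 59.8, 31.6, 70.1)},
]
vienna_bbox = (16.18, 48.12, 16.58, 48.32)  # e.g. resolved via a gazetteer lookup

def bboxes_intersect(a, b):
    """Axis-aligned bounding boxes overlap iff they overlap on both axes."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

def retrieve(query_vec, doc_vecs, region_bbox):
    """Semantic similarity ranks candidates; the spatial test then filters
    out datasets whose extent does not intersect the query region."""
    sims = doc_vecs @ query_vec / (
        np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(query_vec))
    order = np.argsort(-sims)  # best semantic match first
    return [i for i in order if bboxes_intersect(datasets[i]["bbox"], region_bbox)]

rng = np.random.default_rng(1)
doc_vecs = rng.standard_normal((len(datasets), 8))       # placeholder embeddings
query_vec = doc_vecs[0] + 0.1 * rng.standard_normal(8)   # query close to dataset 0
hits = retrieve(query_vec, doc_vecs, vienna_bbox)
print([datasets[i]["title"] for i in hits])
```

The key point is that the spatial filter operates on coordinates directly, so it succeeds exactly where embedding similarity between “Vienna” and raw bounding-box numbers fails.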

How to cite: Ondieki, J. O., Rieke, M., and Jirka, S.: Using Large Language Models to Enhance Spatial Data Discovery in Spatial Data Infrastructures, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-20221, https://doi.org/10.5194/egusphere-egu26-20221, 2026.

16:46–16:48
|
PICO2.14
|
EGU26-20813
|
ECS
|
On-site presentation
Yesong Wei

The recent rise of foundation models in Earth Observation (EO) has reshaped how remote sensing tasks are approached, particularly by allowing strong downstream performance with comparatively limited labeled data. These models have reported impressive results in applications such as land cover classification and semantic segmentation. However, performance gains alone do not resolve a central concern: whether the resulting predictions can be trusted. In practical EO scenarios—including disaster response and environmental monitoring—miscalibrated confidence estimates may lead to incorrect decisions even when overall accuracy appears high.

Motivated by this gap between accuracy and reliability, this study focuses on the uncertainty calibration behaviour of fine-tuned EO foundation models. Using TorchGeo for consistent data handling and the Lightning-UQ-Box framework for uncertainty quantification, we construct an evaluation pipeline that contrasts Vision Transformer–based pretrained models with conventional convolutional neural networks trained from scratch. Experiments are conducted across both image classification tasks (e.g., EuroSAT) and dense prediction settings such as semantic segmentation.

Rather than assuming superior representations automatically yield better-calibrated predictions, we explicitly examine how calibration properties change after fine-tuning large pretrained models. In addition, we evaluate a spectrum of uncertainty quantification approaches, from lightweight post-hoc methods like temperature scaling to more computationally demanding techniques, including Monte Carlo Dropout, deep ensembles, and Laplace approximation. Calibration quality is assessed using expected calibration error and reliability diagrams, alongside predictive accuracy.
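Two of the ingredients named above, temperature scaling and the expected calibration error (ECE), are simple enough to sketch in plain numpy. This is an illustrative toy under synthetic data, not the study's pipeline (which uses TorchGeo and Lightning-UQ-Box); the grid search over temperatures and the simulated overconfident logits are assumptions made for the example.

```python
import numpy as np

def softmax(z, T=1.0):
    s = z / T
    e = np.exp(s - s.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def expected_calibration_error(probs, labels, n_bins=10):
    """Bin predictions by confidence; ECE is the population-weighted
    average gap between accuracy and mean confidence per bin."""
    conf = probs.max(axis=1)
    pred = probs.argmax(axis=1)
    bins = np.minimum((conf * n_bins).astype(int), n_bins - 1)
    ece = 0.0
    for b in range(n_bins):
        m = bins == b
        if m.any():
            ece += m.mean() * abs((pred[m] == labels[m]).mean() - conf[m].mean())
    return ece

def fit_temperature(logits, labels, grid=np.linspace(0.5, 5.0, 46)):
    """Post-hoc temperature scaling: pick T minimising NLL on held-out data."""
    def nll(T):
        p = softmax(logits, T)
        return -np.log(p[np.arange(len(labels)), labels] + 1e-12).mean()
    return min(grid, key=nll)

rng = np.random.default_rng(0)
n, k = 2000, 5
labels = rng.integers(0, k, n)
logits = rng.standard_normal((n, k))
logits[np.arange(n), labels] += 1.5   # informative logits...
logits *= 3.0                          # ...made artificially overconfident
T = fit_temperature(logits, labels)
ece_before = expected_calibration_error(softmax(logits), labels)
ece_after = expected_calibration_error(softmax(logits, T), labels)
print(T, ece_before, ece_after)
```

Because the synthetic logits were inflated by a factor of three, the fitted temperature comes out above one and scaling the logits back down shrinks the calibration error, which is exactly the behaviour temperature scaling is meant to provide after fine-tuning.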

By analysing the trade-offs between computational cost, accuracy, and calibration, this work provides practical insight into which UQ strategies are most effective for EO foundation models. Our findings aim to support the deployment of remote sensing systems in operational settings where reliable uncertainty estimates are as critical as raw predictive performance.

How to cite: Wei, Y.: Uncertainty Quantification for Earth Observation Foundation Models, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-20813, https://doi.org/10.5194/egusphere-egu26-20813, 2026.

16:48–16:50
|
PICO2.15
|
EGU26-21002
|
ECS
|
On-site presentation
Loc Nguyen and David Daou

Localized rapid urban expansion and global climate change drive dynamic changes in land use and land cover (LULC), which in turn alter land surface temperatures (LST). This study proposes an integrated machine learning (ML) approach for assessing decadal LULC changes and predicting future change in a city in the Mekong region. To produce accurate LULC maps, object-based classification strategies were implemented with several ML techniques across the observed years, using four main land cover categories (built-up areas, water bodies, paddy fields/shrubs, and orchards) together with LST extraction. The Random Forest classifier outperformed the other classifiers, achieving the best overall accuracy of 81%. Land use changed substantially, with the share of built-up areas rising from 8% in 2014 to approximately 12% in 2024. Urbanization is correlated with rising temperatures, whereas vegetation alleviates this heat through shading and cooling. Using the patch-generating land use simulation (PLUS) model, which reached an overall accuracy of 85%, we project that by 2030, under both natural and socio-economic drivers, the proportion of built-up areas will increase to 15%, with only slight variation in the other categories, in line with planning objectives. Urban expansion is most pronounced in the densely populated districts, rising from merely 27% in 2014 to 42% by 2030. The main projected LULC conversions are vegetated lands transforming into construction areas, while agricultural practices are maintained for food security. The integrated approach has proven suitable for evaluating and optimizing intricate land-use patterns.

How to cite: Nguyen, L. and Daou, D.: Harnessing an Integrated Machine Learning based Approach in Monitoring and Predicting Dynamic Spatiotemporal Land Use and Land Cover Changes. A case study in a Mekong city, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-21002, https://doi.org/10.5194/egusphere-egu26-21002, 2026.

16:50–16:52
|
PICO2.16
|
EGU26-21265
|
On-site presentation
Polina Tregubova, Sylvie Clappe, Ida Marielle Mienna, Bruno Smets, Marcel Buchhorn, Ruben Remelgado, and Carsten Meyer

Ecosystems are a key component of biodiversity, providing vital services to humans and the economy. Anthropogenic pressures driving environmental change result in widespread ecosystem degradation and loss. The area and spatial distribution of ecosystem types, referred to as ecosystem extent, provide a critical entry point for assessing ecosystem condition, functioning, and associated services, and therefore require detailed and spatially explicit monitoring.

Despite advances in geospatial analysis, consistent mapping and delineation of ecosystem extent remain challenging. Map products on ecosystem extent should therefore be supported by uncertainty assessments, ideally in a spatially explicit manner. Following best practices in related fields, the minimum requirement for uncertainty quantification of thematic maps is an aggregated estimate of per-class accuracy and per-class area uncertainty, derived from a validation procedure based on independent reference data. This standard practice, however, remains spatially implicit: to date, there is no established procedure for spatially explicit uncertainty quantification.

This study presents a spatial solution for estimating the uncertainty of maps produced using machine-learning algorithms. The approach builds on the standard map-validation procedure and extends it to pixel-wise assessments using conformal prediction. While conformal prediction can be applied to any machine learning algorithm, ecosystem extent mapping poses domain-specific challenges, including a high-dimensional multi-class setting and hierarchical class structures. This study, therefore, focuses on developing solutions to ensure robust class-specific coverage, exploring different conformal prediction implementation variants, and adapting them from flat to hierarchical mapping scenarios.
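The class-conditional (Mondrian) variant of split conformal prediction mentioned above can be sketched in a few lines. This is a generic illustration on synthetic probabilities, not the study's implementation: the nonconformity score (one minus the softmax probability of the true class), the simulated data, and the function names are assumptions, and the study additionally adapts such schemes to hierarchical class structures, which this flat sketch does not cover.

```python
import numpy as np

def classwise_conformal(cal_probs, cal_labels, test_probs, alpha=0.1):
    """Split conformal prediction with per-class calibration: each class
    gets its own score threshold, targeting class-conditional coverage
    of 1 - alpha rather than only marginal coverage."""
    n_classes = cal_probs.shape[1]
    thresholds = np.empty(n_classes)
    for c in range(n_classes):
        # nonconformity score: 1 - predicted probability of the true class
        scores = 1.0 - cal_probs[cal_labels == c, c]
        n = len(scores)
        q = min(1.0, np.ceil((n + 1) * (1 - alpha)) / n)  # finite-sample correction
        thresholds[c] = np.quantile(scores, q)
    # a class enters the prediction set if its score is within its own threshold
    return (1.0 - test_probs) <= thresholds   # boolean (n_test, n_classes)

rng = np.random.default_rng(0)
n_cal, n_test, k = 3000, 1000, 4

def simulate(n):
    y = rng.integers(0, k, n)
    logits = rng.standard_normal((n, k))
    logits[np.arange(n), y] += 2.0
    e = np.exp(logits)
    return e / e.sum(axis=1, keepdims=True), y

cal_p, cal_y = simulate(n_cal)
test_p, test_y = simulate(n_test)
sets = classwise_conformal(cal_p, cal_y, test_p, alpha=0.1)
coverage = sets[np.arange(n_test), test_y].mean()
print(round(coverage, 2))  # close to the 0.90 target
```

The per-pixel output is a prediction *set* rather than a single label; set size then serves as a spatially explicit, mappable uncertainty indicator, which is what makes the approach attractive for pixel-wise ecosystem extent maps.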

To assess the feasibility and applicability of our approach, we tested it on the Oslo-Viken municipality in Norway. In this case study, we developed an ecosystem extent map for 2024 and quantified and mapped its uncertainty at pixel-level. This analysis helped to evaluate the practical application and performance of the approach on real-world cases.


How to cite: Tregubova, P., Clappe, S., Marielle Mienna, I., Smets, B., Buchhorn, M., Remelgado, R., and Meyer, C.: Spatially-explicit uncertainty assessment of ecosystem extent mapping, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-21265, https://doi.org/10.5194/egusphere-egu26-21265, 2026.

16:52–18:00