ESSI3.1 | From FAIR Data to Collaborative Science: Research Data Infrastructures and Virtual Research Environments for Earth System Science
EDI
Co-sponsored by AGU and JpGU
Convener: Alessandro Rizzo | Co-conveners: Vasco Mantas, Kirsten Elger, Maria-Luisa Chiusano, Marie Jossé (ECS), Heinrich Widmann, Jérôme Détoc
Orals
| Thu, 07 May, 08:30–10:15 (CEST)
 
Room -2.33
Posters on site
| Attendance Fri, 08 May, 16:15–18:00 (CEST) | Display Fri, 08 May, 14:00–18:00
 
Hall X4
Posters virtual
| Mon, 04 May, 14:06–15:45 (CEST)
 
vPoster spot 1b, Mon, 04 May, 16:15–18:00 (CEST)
 
vPoster Discussion, Mon, 04 May, 14:06–15:45 (CEST)
 
Addressing global environmental and societal challenges—ranging from climate change and natural hazards to biodiversity loss—requires interdisciplinary Earth System Science based on transparent, reproducible, and collaborative research. With the rapidly growing volume and diversity of data, coupled with increasing demands for interoperability and societal relevance, the need for robust and user-oriented Research Data Infrastructures (RDIs) and Virtual Research Environments (VREs) as essential components of modern Earth system research is more pressing than ever.

This session explores how data infrastructures and platforms can enhance interdisciplinary and transdisciplinary research by integrating perspectives from Open Science, the FAIR and CARE data principles, sustainable software development, and virtual research environments. Our session focuses on bridging the gap between user needs and sustainable, interoperable solutions by combining technical innovation with cultural change, stakeholder engagement, and capacity building. Scientific unions, research infrastructures, and international frameworks play a pivotal role in facilitating this transformation by incentivising open and collaborative research practices.

We invite contributions that demonstrate practical and scalable solutions for integrating, discovering, analysing, and reusing heterogeneous Earth system data across disciplines and scales. We seek contributions showcasing operational platforms, standards, and concrete use cases that turn open data into actionable knowledge.
Topics include user-driven research infrastructures and virtual research environments (VREs); semantic technologies, ontologies, and machine-actionable metadata for interoperability; cross-domain data fusion and stakeholder engagement; sustainable, reusable software components; and robust operational and sustainability models for data centres and infrastructures. We particularly encourage contributions addressing training, documentation, and co-design, as well as innovative approaches to FAIR data practices, collaboration, public engagement, and citizen science.

By highlighting success stories, lessons learned, and mature tools—from metadata and standards to fully operational platforms—this session aims to accelerate the shift from open to truly collaborative science and empower the next generation of Earth and environmental data scientists.

Orals: Thu, 7 May, 08:30–10:15 | Room -2.33

The oral presentations are given in a hybrid format supported by a Zoom meeting featuring on-site and virtual presentations. The button to access the Zoom meeting appears just before the time block starts.
Chairpersons: Alessandro Rizzo, Kirsten Elger
08:30–08:35
Open Science Platforms
08:35–08:45
|
EGU26-18847
|
On-site presentation
Lesley Wyborn, Rebecca Farrington, Tim Rawling, Angus Nixon, Bryant Ware, Jo Croucher, Hannes Hollmann, Nigel Rees, Andrew Robinson, Jens Klump, Alex Hunt, and Sara Polanco

Open Science mandates set a high bar for reproducibility and transparency, requiring open knowledge of all input and output sample and data artefacts. They also require identification of any actors and tools used to process and model these along the full Research Workflow, starting from acquisition of the Primary Observation Datasets (PODs) and samples, through initial data calibration, to subsequent generations of subsamples and digital outputs. At the same time, compliance with the FAIR and CARE principles and demands for AI-Ready and Decision-Ready data are ubiquitous: meeting all of these requirements across multiple levels of processing adds further complexity to the Open Science paradigm.

AuScope is Australia’s national geoscience Research Infrastructure (RI) funded through the National Collaborative Research Infrastructure Strategy (NCRIS). AuScope facilities enable the collection of multiple data types, ranging from drone, geophysical and satellite data collections that can be TBs in volume, down to small-scale (MB) long tail collections in geochemistry and geochronology. As articulated in the AuScope Research Data Systems Strategy 2025-2030 (https://doi.org/10.5281/zenodo.15825498), AuScope is committed to Open Science and ensuring compliance with FAIR and CARE. 

Using examples from two AuScope Opportunity Fund projects in geophysics and geochemistry, this paper demonstrates how AuScope is developing a Blueprint for a methodology for Open Science that also establishes compliance with FAIR, CARE, AI-Ready and Decision-Ready requirements at each level of processing. Where possible, all research input and output artifacts are Findable, Accessible, Interoperable and Reusable by machines and CARE is being implemented to complement FAIR and document Indigenous interests and governance. It is planned for AI-Ready data to build on FAIR and CARE with additional metadata on quality, documentation, access and preparation, whilst Decision-Ready guidelines will be implemented through a chain-of-custody approach that allows tracking of any activity, any actor involved and any transformation undertaken.

The first stage in the Blueprint is that, for each data type, definitions are agreed for the various levels of processing, starting with PODs through to derivative data products, models and visualisations. Where possible, the NASA processing levels are followed; however, more specific definitions have been created for geochemistry, hyperspectral and magnetotelluric data, with additional definitions planned for other data types.

Identifiers such as ORCIDs, RORs, RAiD, IGSNs and DOIs are essential at each level of processing to uniquely identify the contributing researchers, research infrastructures, funders, software developers, software etc., and allow connections across each successive level of processing. Identifiers will 1) enhance ways for credit to be given to researchers and funders at each processing level and 2) ensure Indigenous metadata are recorded for each POD and then carried downstream to any derivative product. 
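As a minimal illustration of the chain-of-custody idea described above, the sketch below links a derivative product back to its POD through persistent identifiers. All identifier values are placeholders and the record structure is an assumption made for illustration, not the AuScope Blueprint itself.

  # Illustrative Python sketch only: placeholder identifiers, hypothetical structure.
  pod = {
      "igsn": "10.58052/XXXXXXXX",                          # placeholder IGSN for the sample
      "doi": "10.5281/zenodo.0000000",                       # placeholder DOI for the Level-0 dataset
      "creators": ["https://orcid.org/0000-0000-0000-0000"], # placeholder ORCID
      "funder_ror": "https://ror.org/000000000",             # placeholder ROR
      "processing_level": "L0",
      "indigenous_metadata": {"notice": "placeholder Indigenous metadata record"},
  }

  derivative = {
      "doi": "10.5281/zenodo.0000001",                       # placeholder DOI for the derived product
      "processing_level": "L2",
      "derived_from": [pod["doi"]],                          # link back to the POD
      "software": ["https://doi.org/10.5281/zenodo.0000002"],# placeholder software DOI
      "indigenous_metadata": pod["indigenous_metadata"],     # carried downstream unchanged
  }

  print(derivative["derived_from"])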

Preliminary results from these AuScope Opportunity Fund projects show that although implementing transparent Open Science is complex, already in geophysics it is allowing researchers to access data at the processing levels most suitable to their research objectives. Ultimately, it is hoped that the transparency enabled in each processing level can contribute to greater trust in solutions proposed for global environmental and social challenges as outlined in the UN Sustainable Development Goals.

How to cite: Wyborn, L., Farrington, R., Rawling, T., Nixon, A., Ware, B., Croucher, J., Hollmann, H., Rees, N., Robinson, A., Klump, J., Hunt, A., and Polanco, S.: Developing a Blueprint for Open Science in the AuScope Research Infrastructure by Enabling Vertical Integration Across Multiple Levels of Processing., EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-18847, https://doi.org/10.5194/egusphere-egu26-18847, 2026.

08:45–08:55
|
EGU26-16308
|
On-site presentation
Luca Guerrieri, Maria Grazia Badas, Pietro Battistoni, Valentina Campo, Carlo Cipolloni, Maria Pia Congi, Chiara D'Ambrogi, Claudia Delfini, Claudio De Luca, Barbara Dessì, Fausto Ferraccioli, Fiorenzo Fumanti, Marco Gerardi, Maurizio Guerra, Gabriele Leoni, Alessandro Maria Michetti, Roberto Passaquieti, Marzia Rizzo, Alessandro Trigila, and Roberta Vigni

In Italy, geological tasks such as the monitoring of geological hazards and the management of georesources are entrusted to the Regional Geological Surveys, each responsible for its own territory. In order to improve and strengthen coordination among the various Regional Geological Services in these activities, the Italian Network of Regional Geological Services (RISG) was established, coordinated by ISPRA – the Geological Survey of Italy.

Within the framework of these activities, the Regional Geological Surveys have highlighted the need for a research infrastructure designed to bridge the gap between academic institutions and operational bodies, in terms of data, services, tools and transfer of knowledge.

Funded by the Italian Ministry of Research through the Recovery Funds programme, GeoSciences IR has been built with this goal by a partnership of 13 universities and 3 research institutions, coordinated by ISPRA. Twelve priority themes in the geological domain were identified, including geological mapping, 3D modelling, marine geology, geoheritage, landslides, sinkholes, design of structural works for risk mitigation, satellite monitoring, active tectonics, georesources, and land consumption.

The GeoSciences IR data infrastructure is now open, providing access to more than 300 products, including datasets, services, customized viewers, tools, vocabularies, documents, open APIs, and training modules. The training modules are available on an e-learning platform built within the research infrastructure.

All these products have been developed based on the needs and expectations of the Regional Geological Surveys, identified as the target users of the infrastructure. Periodic collection of their feedback during the three-year implementation phase has made it possible to release products well aligned with the identified needs. This shared pathway between the partnership and the target users will extend across the ten-year operational phase of the infrastructure, with a focus on maintaining the data infrastructure and updating the products.

How to cite: Guerrieri, L., Badas, M. G., Battistoni, P., Campo, V., Cipolloni, C., Congi, M. P., D'Ambrogi, C., Delfini, C., De Luca, C., Dessì, B., Ferraccioli, F., Fumanti, F., Gerardi, M., Guerra, M., Leoni, G., Michetti, A. M., Passaquieti, R., Rizzo, M., Trigila, A., and Vigni, R.: GeoSciences IR: a geological data infrastructure for the Italian Network of Regional Geological Surveys, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-16308, https://doi.org/10.5194/egusphere-egu26-16308, 2026.

08:55–09:05
|
EGU26-20419
|
On-site presentation
Sylvain Grellet, Hervé Squividant, Mario Adam, Fanny Arnaud, Isabelle Braud, Hélène Bressan, Stéphane Debard, Jérôme Fozzani, Véronique Chaffard, Charly Coussot, Kim Anh Trinh, Yvan Le Bras, Eric Lecaudé, Kenneth Maussang, Frédéric Moine, Stéphane Ollagnier, Anne Puissant, Joël Sudre, Lucas Valarcher, and Alexia Vourch and the other OneWater Data project members and associated Water4All partners

In France, the national research and innovation programme ‘OneWater - Eau Bien Commun’ (2022-2032) addresses a wide range of key scientific issues to help protect and manage water as a common good. It is made up of several projects that will generate a large amount of highly heterogeneous data, ranging from sensor data to social science data, samples, and model-based data. It will also draw on data describing the state of water resources produced by observatories, living labs, research infrastructures, and national public monitoring services.

Not all of these data are yet available according to the FAIR principles. To process, share, and re-use these heterogeneous datasets and ultimately generate new knowledge, the ‘OneWater FAIR Water Platform’ aims to go beyond a simple data catalogue by fostering a FAIR Water ecosystem based on international standards and implementing semantic web interoperability, producing FAIR-compliant data by DNA.

The OneWater FAIR platform is fully integrated in the national research data ecosystem on earth and its environment, which relies on the DataTerra Research Infrastructure and its data hubs such as Theia/OZCAR. Collaboration with the services supporting national public policy data and associated monitoring networks is being organized. At the international level, connection with the community is established so that the OneWater initiative can contribute to and benefit from the FAIR Water community. This includes the OGC Hydrology Domain Working Group (OGC HydroDWG), WMO, UN bodies (UNEP, UNESCO IGRAC), DANUBIUS, eLTER RIs, TERENO and the Water4all partnership amongst others.

The OneWater Data project does not only address data and information technology needs; it is also committed to supporting the water community through an ecosystem built on open international standards, their open-source implementations, and resource people. It also builds on a national overview of water data use practices, tools, and obstacles encountered by both researchers and operational stakeholders in reaching FAIR. This will make it possible to train and support the community so that the tools traditionally used evolve towards FAIR practices.

This communication will present the approach being implemented in the OneWater Data Platform project and its first results.

1/ The definition of how to reach a high FAIRness level within the water community in the light of existing international standards and best practices (OGC, W3C, INSPIRE, RDA), with the target of producing FAIR Implementation Profiles (FIPs).

2/ The FAIRness analysis templates produced for various numerical resources and their application to datasets from the community, progressively enriching the FAIRness of the overall ecosystem.

3/ OneWater's FAIR platform, mixing

- well-known and recent interoperability standards and best practices, together with their open-source implementations,

- with the community-proven Virtual Research Environments (VREs) Galaxy and Jupyter Notebook

4/ End-to-end use cases that are already implemented and the methodology to include new ones

5/ The training programme being set up

How to cite: Grellet, S., Squividant, H., Adam, M., Arnaud, F., Braud, I., Bressan, H., Debard, S., Fozzani, J., Chaffard, V., Coussot, C., Trinh, K. A., Le Bras, Y., Lecaudé, E., Maussang, K., Moine, F., Ollagnier, S., Puissant, A., Sudre, J., Valarcher, L., and Vourch, A. and the other OneWater Data project members and associated Water4All partners: OneWater4all platform and ecosystem: where VRE meets FAIR international standards, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-20419, https://doi.org/10.5194/egusphere-egu26-20419, 2026.

Open Science Visions and Implementations
09:05–09:15
|
EGU26-19120
|
On-site presentation
Helen Glaves, Joan Maso, Leo Chiloane, Paola de Salvo, Kalamkas Yessimkhanova, Felipe Carlos, and Jean-Philippe Aurambout

In-situ data are part of a complementary suite of Earth observations that are vital for monitoring and understanding our planetary system. In contrast to space-based observations, in-situ data are usually direct, ground-based measurements made in specific and often fixed locations. As such, in-situ measurements are likely to be more precise and are widely considered the “ground truth”.

While satellite-based observing systems provide larger-scale systematic coverage of the Earth’s surface, in-situ data is derived from a diverse range of sources that include observing networks, individual sensors, and even citizen scientists, resulting in a highly heterogeneous data landscape. The diverse nature of in-situ data necessitates a considerable investment of resources in its curation and archiving to ensure its usability and accuracy for specific user applications. It also demands significant data management efforts in terms of standardization, harmonization, and interoperability to effectively consolidate different datasets to fulfill the needs of users.

In an effort to address this highly varied landscape of ground-based observations, the Group on Earth Observations (GEO) is launching its In-Situ Data Strategy. Its key objectives are to better understand the in-situ data landscape, including identifying and addressing the barriers to making in-situ data open and accessible for wider reuse. The strategy also aims to foster coordination and sustainability of existing observing networks across different geographical areas and domains, which includes identifying critical gaps in these observing systems and advocating for the development of new monitoring networks where necessary.

The GEO In-Situ Data Strategy emphasizes the need for collaboration on a global scale alongside the adoption of common approaches, standards, and best practices for data management, which are essential for integration, interoperability, and reuse of in-situ data. Through its In-Situ Data Strategy, GEO aims to foster a coordinated approach to in-situ data management that makes the data open and accessible, with the ultimate goal of delivering “Earth Intelligence for All”.

This work has been supported by the GEO-IDEA project funded by the European Environment Agency (EEA).

 

How to cite: Glaves, H., Maso, J., Chiloane, L., de Salvo, P., Yessimkhanova, K., Carlos, F., and Aurambout, J.-P.: GEO In-Situ Data Strategy: understanding and reducing the barriers to re-use of Earth observation data and knowledge, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-19120, https://doi.org/10.5194/egusphere-egu26-19120, 2026.

09:15–09:25
|
EGU26-18443
|
On-site presentation
Melanie Lorenz, Kirsten Elger, Karl Heyer, and Malte Semmler

Open Science is increasingly dependent on collaborative infrastructures that are both discipline-specific and can interoperate across institutional and national boundaries. These infrastructures require coordination mechanisms to balance the characteristics of disciplinary specificity with cross-domain interoperability. In Germany, the Working Group of Specialized Information Services (Arbeitsgemeinschaft der Fachinformationsdienste, AG FID) provides such a collaborative framework by linking discipline-oriented Specialized Information Services (Fachinformationsdienste, FID) into a structured network for exchange, coordination, as well as joint development. In the geosciences, the Specialized Information Service for Geosciences (FID GEO) has supported the research community for almost a decade by providing publication services and consultancy, helping researchers navigate a complex and constantly evolving infrastructure landscape.

FID GEO delivers sustainable publication and data services via established domain repositories. At the same time, FID GEO fosters cultural change through training, community engagement, and active participation in policy and infrastructure development. Collaboration is therefore a cornerstone of FID GEO’s work. It operates in close partnership with geoscientific societies, national infrastructures and initiatives such as the German National Research Data Infrastructure (NFDI). Acknowledging the inherently global nature of the geosciences, FID GEO also aligns its activities with international developments, aiming to synchronize national progress with global standards and best practices for data management and distribution. Acting as an interface between scientists, libraries, repositories and the world of digital data management, FID GEO supports the transformation of the publication culture in the geosciences at national and international levels. These activities, embedded within the AG FID network, clearly benefit from cross-disciplinary exchange, the development of shared standards, and coordinated advocacy. Consequently, their impact is amplified beyond a single community. Specific successes include the increased adoption of FAIR-aligned metadata practices, stronger integration with national infrastructures such as the NFDI, and greater visibility and reusability of geoscientific research outputs.

This contribution provides a critical reflection on the structural challenges shared across the FID system. The ongoing need to adjust to competing funding programs, overlapping infrastructure mandates, and the continuing expectation of “one-stop” platforms/systems means that discipline-specific services must continuously realign their portfolios as responsibilities shift to complementary funding instruments, such as dedicated digitization programs and the NFDI. While this differentiation strengthens the overall research infrastructure ecosystem, it increases the demand for coordination and complicates the long-term maintenance of established services. Rather than striving for monolithic solutions, the FID system demonstrates how distributed services based on the close integration of domain-specific communities attempt to collaborate in finding solutions for interoperable services. These solutions are based on persistent identifiers, shared (metadata) standards, and close stakeholder engagement. This contribution discusses these developments and shares the FID GEO project's experiences with regard to the potentials and challenges of operating open science infrastructures in practice.

How to cite: Lorenz, M., Elger, K., Heyer, K., and Semmler, M.: Operating Open Science Services in Practice - Lessons from the German Specialized Information Service-Infrastructure. , EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-18443, https://doi.org/10.5194/egusphere-egu26-18443, 2026.

09:25–09:35
|
EGU26-20634
|
ECS
|
On-site presentation
Dafina Kikaj, Matt Rigby, Joe Pitt, Grant Forster, Kieran Stanley, Ed Chung, Chris Rennick, Dickon Young, Angelina Wenger, Penelope Pickers, Emmal Safi, Karina Adcock, Tom Gardiner, and Simon O’Doherty

Achieving global climate goals requires more than scientific insight; it needs trusted, operational evidence on greenhouse gas (GHG) emissions and how they change as mitigation is put in place. That evidence increasingly comes from combining measurements from many sites and networks, often together with models and inventories. The challenge is not only measuring well, but delivering data that stay consistent over time, are comparable between sites, and are ready for routine use by different communities. Climate action therefore needs an operational level of data: regular releases with clear metadata and uncertainty information.

A key part of this is traceability. Traceability means being able to answer simple questions about every value in a dataset: How was it measured? How was it calibrated? What corrections and quality checks were applied? Which software produced it? What does the uncertainty mean? This becomes especially important over time, because instruments, calibrations, and processing methods evolve, and users need to understand what changed and why.

A practical blueprint will be presented for running traceable GHG and related tracer datasets at scale, based on the day-to-day experience of a large team of measurement scientists, data specialists, and modellers. The blueprint is built around tiered data releases, where products are published at different data levels (raw → quality controlled → derived products), each with uncertainty information appropriate to that level and clear links between levels. A recorded history of processing and version changes is maintained for every release, together with harmonised metadata and uncertainty fields so both people and machines can interpret the data in the same way. Practical operational tools are discussed, such as automated checks, written decision rules, routine reprocessing, and release practices that support stable identifiers and proper credit.
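A minimal sketch of how such a tiered release record could be expressed, assuming a simple key-value representation; the field names below are illustrative and do not reflect the authors' actual metadata schema.

  # Illustrative only: hypothetical field names, not the blueprint's real schema.
  release = {
      "dataset": "ch4_site_a",                        # placeholder dataset identifier
      "level": "L2",                                   # raw -> quality controlled -> derived
      "version": "2.1.0",
      "derived_from": "ch4_site_a L1 v2.0.3",          # link to the lower-level release
      "uncertainty": {"type": "expanded", "coverage_factor": 2, "unit": "ppb"},
      "history": [
          {"step": "calibration", "reference_scale": "placeholder scale", "software": "proc-tool 1.4"},
          {"step": "quality_control", "rule_set": "QC rules v7"},
      ],
  }
  print(release["version"], release["history"][0]["step"])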

Examples using tracer-based diagnostics, with radon as one example, show how good traceability enables routine, reproducible products that can be used directly in modelling and emissions workflows. The contribution closes with lessons learned on how to keep this working in practice, including coordination, shared standards, and training across teams.

How to cite: Kikaj, D., Rigby, M., Pitt, J., Forster, G., Stanley, K., Chung, E., Rennick, C., Young, D., Wenger, A., Pickers, P., Safi, E., Adcock, K., Gardiner, T., and O’Doherty, S.: Why traceability matters for operational GHG and tracer datasets: lessons for collaborative platforms, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-20634, https://doi.org/10.5194/egusphere-egu26-20634, 2026.

Operational Workflows
09:35–09:45
|
EGU26-2808
|
On-site presentation
Clémence Cotten, Mickaël Treguer, Erwan Bodéré, Amandine Thomas, Julien Meillon, and Erwann Quimbert

Earth system science relies on the integration of heterogeneous observations, models and methods across atmosphere, ocean, land and biosphere. In ocean science in particular, the analysis of complex and multidisciplinary datasets strongly depends on scripts, software, virtual research environments (VREs) and scientific services developed within laboratories and research projects. Yet these digital resources often remain scattered, poorly documented and difficult to discover, limiting their reuse, citation and contribution to FAIR and interoperable ocean science.

Within the ODATIS Ocean hub of the French research infrastructure Data Terra, acting as an EOSC node, we developed a concrete and operational solution to address this gap by extending a long-standing national ocean data catalogue towards a FAIR catalogue of ocean-related scientific resources. ODATIS has historically relied on GeoNetwork, the Sextant platform and the ISO 19115 standard to catalogue ocean datasets. Building on this foundation, we adapted the ISO 19115 metadata model and the Sextant catalogue to describe a wider range of resources relevant to ocean science, including scripts, software, applications, VREs, scientific support services and training materials, while remaining fully interoperable with existing data catalogues.

This development was carried out within the Gaia Data project of Data Terra, in close collaboration across thematic poles, leading to the co-design of controlled vocabularies dedicated to development languages, resource sub-types and data life-cycle stages. These shared semantic artefacts enrich metadata with machine-actionable information, enable faceted discovery, and strengthen semantic interoperability within and beyond the ocean community.

The resulting ODATIS resource catalogue is now operational and used in several national initiatives, including the PEPR BRIDGES programme and Gaia Data activities such as the “support to oceanographic campaigns” task, which develops a portfolio of services for cruise principal investigators. The catalogue provides a national entry point to assign DOIs to scripts and software required for scientific publications, while offering a visible showcase for ocean-related tools, services and VREs developed by laboratories, support teams and projects.

Beyond discovery and citation, the catalogue is designed as a foundation for a knowledge graph linking ocean datasets, processing tools, computational environments, services and training resources. This supports interdisciplinary and end-to-end use cases, connecting data to the methods and VREs required for their analysis. Community engagement is ensured through a coordinated outreach to ODATIS laboratory correspondents across France, fostering co-design, adoption and the creation of resource records. Altogether, this success story demonstrates how metadata standards, semantic interoperability and VRE-oriented cataloguing can effectively support FAIR and collaborative ocean science.

How to cite: Cotten, C., Treguer, M., Bodéré, E., Thomas, A., Meillon, J., and Quimbert, E.: Extending Data Catalogues to FAIRly Describe and Cite Ocean Science Resources: The ODATIS Experience, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-2808, https://doi.org/10.5194/egusphere-egu26-2808, 2026.

09:45–09:55
|
EGU26-20056
|
On-site presentation
Jonas Sølvsteen, Aimee Barciauskas, Alex Mandel, Anthony Boyd, Brianna Corremonte, Emmanuel Mathot, Felix Delattre, Kyle Barron, and Pete Gadomski

Development Seed and our partners' vision is that FAIR (Findable, Accessible, Interoperable, and Reusable) data, a key ingredient of open science, is not an afterthought but how scientists handle their data in the first place.

Our experience is that readily available, appealing, powerful, and free tools for data discovery and access incentivise data science practitioners to organise their data in interoperable formats and to catalogue it as part of their workflows.

This talk gives an overview of recent advances in open source tools that Development Seed and our partners are supporting and our experience with their application in platforms such as those powered by the ESA-funded EOEPCA+ software, sister projects NASA/ESA MAAP and NASA VEDA, and the data platform source.coop.

After more than a decade of success with cloud-optimised data formats such as Cloud Optimised GeoTIFF (and recently also Zarr-based variants) and with dynamic server-side rendering of data, the latest advances focus on client-side access to STAC catalogues in geoparquet format and on GPU-powered rendering of web-optimised datasets in the browser. This makes the benefits available even in environments where centralised services are not present and reduces infrastructure maintenance costs.
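As a small, hedged illustration of what client-side access to a geoparquet STAC catalogue can look like: the file name and the "datetime"/"geometry" columns below are assumptions based on common stac-geoparquet layouts, not a description of any specific platform.

  # Illustrative sketch: filter a STAC catalogue stored as geoparquet entirely
  # on the client side, without any server-side search API.
  import geopandas as gpd

  items = gpd.read_parquet("items.parquet")            # hypothetical catalogue snapshot
  recent = items[items["datetime"] >= "2025-01-01"]    # attribute filter on acquisition time
  aoi = recent.cx[10:20, 45:55]                        # bounding-box filter (lon 10-20, lat 45-55)
  print(len(aoi), "matching items")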

How to cite: Sølvsteen, J., Barciauskas, A., Mandel, A., Boyd, A., Corremonte, B., Mathot, E., Delattre, F., Barron, K., and Gadomski, P.: Incentivising open science through powerful free and open tooling, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-20056, https://doi.org/10.5194/egusphere-egu26-20056, 2026.

09:55–10:05
|
EGU26-17776
|
Virtual presentation
Caroline Ball, Riana Rasoldier, Caroline Vateau, Charlotte Garrel, and Nicolas Estival

The Copernicus Data Space Ecosystem (CDSE) has transformed Earth Observation data access through cloud-based, multi-access services, replacing traditional download-heavy approaches. By 2024, CDSE provided over 78 PB of data to more than 290,000 users spanning scientific, institutional, and commercial sectors. This rapid expansion, however, raises sustainability questions due to the environmental impact of data transmission, storage, and processing. This study evaluates end-user contributions to the CDSE's environmental impact and explores strategies to reduce emissions across the value chain.

The approach combines qualitative and quantitative analysis across 3 phases:

Phase 1 – Preparation and Validation

The first phase involves defining the scope of the study, validating user typologies (scientists, industry, institutions, start-ups), confirming methodological standards, and developing tools for surveys and interviews.

Phase 2 – Data collection and user insights: Data will be gathered through 3 complementary channels:

  • A GDPR-compliant questionnaire targeting diverse typologies
  • In-depth discussions to capture decision-making processes and sustainability trade-offs.
  • Existing data sources: Bibliographic research, statistical reports, and operational data from CDSE platforms to quantify download volumes, compute operations, and storage patterns.

Phase 3 – Impact calculation and modelling

Impacts will be evaluated for each CDSE end-user type by examining 3 main areas: data transfer (network traffic), data processing (computing tasks), and data storage (including backup and retention). Both cloud hosting and on-premises systems will be analyzed.

The calculation process commences with the collection of key inputs, which include:

  • Cloud resource consumption: Data from CDSE operations and cloud providers (compute, storage, data transfer).
  • Technical specification of instances used: CPU, GPU, memory, storage type, PUE, etc.
  • User behavior data: Collected via surveys and interviews, considering the amount of data, the types of processing performed, how long data is stored, redundancy measures, and involvement of third parties.

Our methodology is adapted from the Cloud Carbon Footprint approach and established standards for assessing the environmental impact of cloud services and data centers, tailored specifically for Copernicus. This evaluation covers 2 categories of emissions:

  • Embodied emissions, related to manufacturing and maintaining servers and storage devices:

Manufacturing emissions × (usage time ÷ lifespan) × (reserved resources ÷ available resources)

  • Operational emissions, caused by resource use during data processing:

Time × total component power × efficiency × electricity carbon intensity
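The sketch below works through the two formulas with placeholder numbers only; none of the values are CDSE figures, and reading "efficiency" as a PUE-style data-centre factor is an interpretation made for illustration.

  # Worked example of the two formulas above (placeholder values throughout).
  usage_hours = 720.0                     # e.g. one month of a reserved instance
  lifespan_hours = 4 * 365 * 24           # assumed 4-year hardware lifespan
  manufacturing_kgco2e = 1200.0           # embodied emissions of the server (placeholder)
  reserved_share = 8 / 64                 # reserved resources / available resources

  embodied = manufacturing_kgco2e * (usage_hours / lifespan_hours) * reserved_share

  power_kw = 0.25                         # total component power (placeholder)
  efficiency = 1.4                        # assumed PUE-style efficiency factor
  grid_kgco2e_per_kwh = 0.3               # electricity carbon intensity (placeholder)

  operational = usage_hours * power_kw * efficiency * grid_kgco2e_per_kwh

  print(f"embodied: {embodied:.2f} kgCO2e, operational: {operational:.2f} kgCO2e")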

The calculation uses quantitative data, standard guidelines, established methodologies, and databases such as EcoInvent, EEA, Negaoctet, and Resilio.

Complementing this, a multicriteria LCA approach, following European guidelines and standards, offers a comprehensive view. Key indicators considered are electricity consumption, GHG emissions, primary energy consumption (to capture total energy demand), water consumption (linked to cooling and infrastructure) and abiotic resource depletion (impact of raw material extraction for hardware).

To achieve representative results, sampled typologies are extrapolated to all users using factors such as average footprint, CDSE registration data, infrastructure location, and storage and processing scenarios.

The study will provide:

  • Carbon footprint for each end-user type
  • Hotspot identification in user types and infrastructure
  • Emission reduction recommendations

How to cite: Ball, C., Rasoldier, R., Vateau, C., Garrel, C., and Estival, N.: Methodology for the Qualitative and Quantitative Analysis of the Environmental Impact of Copernicus Data Space Ecosystem (CDSE) End-Users , EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-17776, https://doi.org/10.5194/egusphere-egu26-17776, 2026.

FAIR Perspectives
10:05–10:15
|
EGU26-22886
|
Highlight
|
On-site presentation
Shelley Stall, Danielle Kinkade, and Natalie Raia

Innovation within the scientific enterprise is maximized when researchers are supported with tools, guidance, and infrastructure, including data that are as open and FAIR as possible. In recognition of this fact, funders, institutions, and scholarly publishers are imposing increasing expectations for sharing research data and software. Researchers responding to these requirements face conflicting workflows, timing, and a myriad of data archiving choices; they are unknowingly caught in a “FAIR data crisis”. Additionally, researchers don’t yet have trust that these same archive choices could be their first stop in finding new and interesting datasets. To unleash transformative research of the future, vetted discipline-specific science support frameworks are needed both for archive deposition and for the discovery of new datasets.
 

This presentation will introduce a newly funded project aimed at building a sustainable community resource for three disciplines to drive their community toward a shared vision of common research data resources, methods and tools that are grounded in Open Science and FAIR data principles. Each discipline-specific framework will coalesce existing resources and efforts, and through adoption, deliver: 1) Consolidated, vetted community resources identified in partnership with respective members; 2) Interoperable data that are machine-actionable supporting discovery, trust, and reuse; 3) A discipline-specific leadership and sustainable governance that intentionally fosters development of data management skills, Open Science, and FAIR data. Thus, this work will realize the value of Open Science practices by putting community-vetted resources at the heart of where researchers share their research and connect to their colleagues - society communities and meetings.

How to cite: Stall, S., Kinkade, D., and Raia, N.: GeoFAIR - All are Welcome!, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-22886, https://doi.org/10.5194/egusphere-egu26-22886, 2026.

Posters on site: Fri, 8 May, 16:15–18:00 | Hall X4

The posters scheduled for on-site presentation are only visible in the poster hall in Vienna. If authors uploaded their presentation files, these files are linked from the abstracts below.
Display time: Fri, 8 May, 14:00–18:00
Chairpersons: Marie Jossé, Jérôme Détoc
X4.71
|
EGU26-1682
Eileen Hertwig, Andrea Lammert, and Andrej Fast

Efficient long-term data archiving is essential for advancing climate research, where increasingly complex simulations generate vast and heterogeneous datasets that must remain accessible, traceable, and reusable across disciplines and timescales. To meet the diverse, user-specific needs of Earth System Sciences (ESS) researchers, the German Climate Computing Center (DKRZ) provides two complementary archival systems: WDCC and DOKU. 

The World Data Center for Climate (WDCC) serves as a formal, FAIR-aligned repository for climate model outputs and related datasets. It assigns persistent identifiers via DataCite DOIs, preserves rich and standardized metadata, and ensures interoperability, thereby promoting data sharing and supporting long-term scientific reuse. Mature datasets intended for public dissemination and sustained reuse therefore fall in the scope of WDCC. DOKU, by contrast, is a lightweight, flexible solution tailored to project- and user-specific requirements within DKRZ. It offers structured long-term storage - typically guaranteed for ten years - for data that are not (yet) ready for formal publication but remain important for internal reference, validation, or project continuity.

A central question for researchers, however, is not only where data should be kept, but also which data and for how long. While DOKU and WDCC each provide a baseline retention period of ten years, it is worth considering whether time alone is really the decisive criterion for preservation. Instead, appraisal could consider scientific value, future potential for reuse, the cost of regeneration, and broader considerations of good scientific practice.

Starting with DOKU, appraisal criteria and workflows are currently being developed to determine the fate of archived data once the guaranteed retention period has expired. Data that continue to play a significant role within their project or show clear potential for reuse might stay longer in DOKU while others might need to go. Once this workflow has been established and tested it could also serve as a blueprint for WDCC. 

Together, WDCC and DOKU form a coherent strategy for sustainable data management, providing scalable infrastructure, FAIR principles, and software solutions that meet specific user needs while supporting responsible, long-term stewardship of climate data across the ESS community.

How to cite: Hertwig, E., Lammert, A., and Fast, A.: Beyond Retention Periods: Appraising Climate Data Across Complementary Archives, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-1682, https://doi.org/10.5194/egusphere-egu26-1682, 2026.

X4.72
|
EGU26-2668
|
ECS
Myungjun Lee, Jaeung Han, Gapho Jeun, and Min-A Kim

To support academic and public applications—particularly those related to disaster monitoring, urban development, and environmental studies in specific regions and events—the Korea Aerospace Research Institute (KARI) has operated the Korea Satellite Information Database (KSATDB) web service since 2017. The platform is designed to enhance the usability of Korea Multi-Purpose Satellite (KOMPSAT) imagery across various domains by selectively releasing high-resolution, tile-formatted images.
To further improve public accessibility and foster scientific research in remote sensing and geophysical applications, KARI is developing an enhanced KSATDB platform over a four-year period (2025–2028). This initiative aims to broaden open access to high-resolution satellite data and to establish an integrated infrastructure for data acquisition, processing, and dissemination.
The upgraded system consists of two principal components: the Order Request and Distribution Subsystem (ORDS), which enables orbit-based image acquisition and rapid data delivery, and the Satellite Public Service Subsystem (SPSS), which provides free access to high-resolution imagery for academic and research purposes. The SPSS is composed of two functional modules: a base processing module and a web-based visualization module.
The base processing module generates multi-channel, high-resolution tile datasets for disaster-affected regions, representative natural scenery, and major cities of interest, incorporating geometric and radiometric corrections to ensure data accuracy and consistency. The web-based visualization module supports real-time rendering of processed tile data in response to user interactions. It integrates advanced visualization tools, including Curtain-view and Geo-linkage comparison functions, which facilitate intuitive analysis of temporal variations.
Furthermore, the platform incorporates a spectral synthesis function to support environmental and geophysical analyses through the web interface. Through this enhancement of KSATDB, KARI aims to advance both academic research and practical applications of satellite imagery in diverse disaster-related studies and to contribute to the broader scientific understanding of remote sensing and geophysical phenomena.

How to cite: Lee, M., Han, J., Jeun, G., and Kim, M.-A.: Design of a new KARI Satellite Image Service Platform, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-2668, https://doi.org/10.5194/egusphere-egu26-2668, 2026.

X4.73
|
EGU26-3613
Chris Hagerbaumer, Colleen Rosales, and Russell Biggs

OpenAQ is the world's largest open-source, open-access air quality data platform. It provides over 2 billion measurements from more than 22,500 sources across 142 countries. OpenAQ offers a standardized and harmonized approach to accessing diverse air quality data from a wide variety of air sensors and reference-grade monitors. The platform enhances data findability, accessibility, interoperability, and reusability for various research and application needs.

By openly sharing air quality data, OpenAQ maximizes the value of the data collected, leveraging the skills of interested parties in and beyond the community to produce the scientific research, communications, and evidence-based solutions needed to reduce air pollution.

OpenAQ trains groups on how to use the platform, prioritizing those working to reduce air pollution in vulnerable communities, and OpenAQ provides leadership training for emerging air quality leaders in low- and middle-income countries through its Clean Air Community Ambassador Program.

This session provides an overview of OpenAQ’s work to democratize access to air quality data, including such tools, resources and programs as: 

  • OpenAQ Explorer (https://explore.openaq.org): A user-friendly tool for visualizing and downloading harmonized air quality data from a global map
  • Programmatic Access: Advanced options for data integration that are available through the OpenAQ API and accessible via an OpenAQ R client and Python SDK (see the request sketch after this list)
  • AQI Hub (https://aqihub.info/): A resource for understanding and comparing how different countries report Air Quality Indices (AQIs)
  • Clean Air Community Ambassador Program (https://ambassadors.openaq.org/): a program to empower the next generation of changemakers to use air quality data in support of community action
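A hedged sketch of what programmatic access might look like from Python; the endpoint path, parameter, and authentication header below are assumptions and may not match the current OpenAQ API version, so the API documentation should be consulted before use.

  # Illustrative request only: endpoint, parameter, and header are assumptions.
  import requests

  resp = requests.get(
      "https://api.openaq.org/v3/locations",     # assumed v3 endpoint
      params={"limit": 5},                       # assumed parameter name
      headers={"X-API-Key": "YOUR_API_KEY"},     # key-based access is assumed
      timeout=30,
  )
  resp.raise_for_status()
  for location in resp.json().get("results", []):
      print(location.get("name"))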

 

How to cite: Hagerbaumer, C., Rosales, C., and Biggs, R.: Open-Source Data and Digital Platforms: Catalyzing Scalable & Collaborative Solutions for Clean Air through OpenAQ, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-3613, https://doi.org/10.5194/egusphere-egu26-3613, 2026.

X4.74
|
EGU26-6042
|
ECS
Andrey Santos, Camyla Santos, Eliana Bahia, Isaias Bianchi, and Rafael Moré

The field of Geosciences has made significant contributions to the advancement of the Open Science movement. As a data-intensive domain, its engagement with this movement has fostered the expansion and use of FAIR data, open data, open and reproducible research practices, as well as open access to scientific publications and to the databases produced within the field. This study aims to visualize and analyze the contribution of Open Science research in Geosciences to other fields of knowledge, based on the citations received by these publications. To this end, a bibliometric analysis was conducted on documents indexed in the Scopus database, selected for its consistent classification of Subject Areas, particularly those limited to Environmental Science, Earth and Planetary Sciences, and Agricultural and Biological Sciences. The search strategy yielded 2,892 publications related to Open Science in the context of Geosciences. The analysis of citing documents revealed that 27 fields of knowledge refer to these publications, with particular prominence given to Social Sciences, Computer Science, Engineering, and Energy. Additionally, keyword co-occurrence and co-authorship analyses were performed in order to identify the main research themes, patterns of scientific collaboration, and core clusters of intellectual production associated with Open Science in Geosciences. These procedures made it possible to highlight the interdisciplinary nature of the field and the role of Geosciences as a vector for the diffusion of open practices across other areas of knowledge. It is concluded that Open Science research developed within Geosciences exerts a significant influence on multiple scientific domains, contributing to the consolidation of collaborative, transparent, and data-sharing-oriented research practices.

How to cite: Santos, A., Santos, C., Bahia, E., Bianchi, I., and Moré, R.: The Contribution of Open Science Research in Geosciences to Other Fields of Knowledge: A Bibliometric Analysis, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-6042, https://doi.org/10.5194/egusphere-egu26-6042, 2026.

X4.75
|
EGU26-7484
|
ECS
Pablo López-Díaz, Luca D'Auria, Sergio de Armas-Rillo, Aarón Álvarez-Hernández, David M. van Dorth, Rubén García-Hernández, Manuel Calderón-Delgado, Víctor Ortega-Ramos, and Nemesio M. Pérez

Modern volcanic monitoring requires managing multidisciplinary, multiparametric, large-volume datasets. A robust digital framework is therefore essential for integrating, managing, storing, processing, and visualizing these data streams consistently. Here, we present such a framework, comprising a SQL-based database, a Flask web application, and an automated scheduler that ensures continuous data ingestion and updating. 

Within the DIGIVOLCAN project, led by the Instituto Volcanológico de Canarias (INVOLCAN), we developed a multiparametric database to support volcano monitoring in the Canary Islands. The database integrates data from permanent monitoring networks, discrete field surveys, and remote sensing, and is implemented using PostgreSQL. Serving as the core of the framework, the database is optimized with indexed tables that enable rapid querying, even for datasets with millions of records. Spatial data are handled with PostGIS, a PostgreSQL extension that provides efficient spatial data storage and operations. In contrast, time-series data are managed with TimescaleDB, which significantly accelerates time-series queries. Together, these technologies ensure secure storage, high performance, and seamless interaction with the DIGIVOLCAN web interface, enabling rapid visualization of large, complex datasets. 
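As a minimal sketch of how such a setup can combine PostGIS geometries with a TimescaleDB hypertable; the table, column names, and connection string are hypothetical and do not reflect the actual DIGIVOLCAN schema.

  # Illustrative only: hypothetical schema, not the DIGIVOLCAN database design.
  import psycopg2

  ddl = """
  CREATE TABLE IF NOT EXISTS co2_flux (
      time       TIMESTAMPTZ NOT NULL,
      station_id TEXT        NOT NULL,
      flux       DOUBLE PRECISION,         -- diffuse CO2 flux value
      geom       geometry(Point, 4326)     -- PostGIS station location
  );
  SELECT create_hypertable('co2_flux', 'time', if_not_exists => TRUE);
  CREATE INDEX IF NOT EXISTS co2_flux_geom_idx ON co2_flux USING GIST (geom);
  """

  with psycopg2.connect("dbname=monitoring user=postgres") as conn:   # placeholder DSN
      with conn.cursor() as cur:
          cur.execute(ddl)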

This digital infrastructure is designed to serve multiple user communities and operational needs. It provides accessible, high-level information to the general public, more detailed datasets to civil protection authorities, and comprehensive, multiparametric analyses to scientific committees during seismo-volcanic crises. The system functions as both an operational tool for routine daily monitoring and a rapid-response platform during volcanic emergencies, delivering advanced maps and time-series visualizations. In addition, it serves as a scientific research tool by facilitating the integrated analysis and comparison of geophysical and geochemical datasets, including compatibility with advanced AI-based data analysis workflows. 

Access to the database is provided through a web portal that implements role-based access control. At the basic level, intended for the general public, users can explore an interactive, real-time earthquake map of the Canary Islands with customizable filters for intuitive visualization. Higher access levels unlock additional functionality, allowing advanced users to visualize thematic maps and time-series plots of key volcano-monitoring parameters. These include, for example, gravimetry, ground deformation from GNSS and satellite interferometry, self-potential data, discrete and continuous diffuse soil gas fluxes (CO₂, H₂S), and numerous other raw and processed geophysical and geochemical variables. 

Finally, the modular architecture of the infrastructure enables straightforward expansion and long-term evolution, supporting the integration of new monitoring parameters as well as the development of additional map types and graphical representations within the web interface. 

How to cite: López-Díaz, P., D'Auria, L., de Armas-Rillo, S., Álvarez-Hernández, A., M. van Dorth, D., García-Hernández, R., Calderón-Delgado, M., Ortega-Ramos, V., and M. Pérez, N.: DIGIVOLCAN: a multi-parametric database for the volcano monitoring of the Canary Islands (Spain) , EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-7484, https://doi.org/10.5194/egusphere-egu26-7484, 2026.

X4.76
|
EGU26-9081
Christin Henzen, Nadia Aouadi, Anna Brauer, Robert Brylka, Auriol Degbelo, Jonas Grieb, Ralf Klammer, Markus Konkol, Roland Koppe, Kemeng Liu, Tom Niers, Daniel Nüst, and Alexander Wellmann

Research data management (RDM) in the Earth system sciences is complex and can be frustrating. Data come in many shapes and formats—observations, model outputs, samples, derived products—and are spread across a wide range of repositories, services, and software ecosystems. These infrastructures differ greatly in metadata quality, interoperability, and FAIR maturity. For researchers, this often means spending too much time figuring out where to publish data, how to describe it properly, or how to reuse existing datasets and software. At the same time, expectations from funders and journals on data outputs continue to rise with respect to openness, curation, and long-term stewardship.  

Within the German National Research Data Infrastructure (NFDI), the NFDI4Earth consortium tackles these challenges by building practical, community-driven solutions for the Earth system sciences. As developers and designers, our focus is on lowering barriers and making FAIR data and software practices easier to understand and apply in everyday research. In this contribution, we introduce two key building blocks of this effort: the Knowledge Hub and the OneStop4All. 

The Knowledge Hub (https://knowledgehub.nfdi4earth.de) is a knowledge graph that connects heterogeneous Earth system resources, e.g., datasets, repositories, services, software, and educational materials, using a harmonized metadata model. It harvests metadata from multiple providers—ranging from global aggregators to national and domain-specific services (see, for instance, the Helmholtz DataHub: https://earth-data.de/) - and exposes them through a well-defined SPARQL API. This allows both humans and machines to query, explore, and reuse metadata consistently, and enables developers to build custom applications on top of it. 
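As an illustration of machine queries against such a graph, the sketch below issues a simple SPARQL query from Python; the exact endpoint path and the use of DCAT/Dublin Core terms are assumptions about the harmonized metadata model rather than a documented Knowledge Hub contract.

  # Illustrative query only: endpoint path and vocabulary are assumptions.
  from SPARQLWrapper import SPARQLWrapper, JSON

  sparql = SPARQLWrapper("https://knowledgehub.nfdi4earth.de/sparql")   # assumed endpoint
  sparql.setQuery("""
      PREFIX dcat: <http://www.w3.org/ns/dcat#>
      PREFIX dct:  <http://purl.org/dc/terms/>
      SELECT ?dataset ?title WHERE {
          ?dataset a dcat:Dataset ;
                   dct:title ?title .
      } LIMIT 10
  """)
  sparql.setReturnFormat(JSON)
  for row in sparql.query().convert()["results"]["bindings"]:
      print(row["title"]["value"])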

The OneStop4All (https://onestop4all.nfdi4earth.de) builds on the Knowledge Hub to offer a discovery and guidance portal for researchers. It brings together the resources in a single, coherent interface. Cross-domain search and guided navigation help users move along the research data lifecycle without having to know all standards and infrastructures upfront. A central feature is the Repository Wizard, an interactive decision-support tool that helps researchers find suitable repositories for publishing their data based on data type, discipline, and policy constraints. In addition, the NFDI4Earth Label provides a transparent, community-oriented way to communicate repository quality with respect to FAIR principles, sustainability, and relevance for Earth system sciences. 

Beyond discovery, the OneStop4All puts a strong emphasis on learning and cultural change. It provides integrated access to open educational resources, good-practice guides, and showcases that demonstrate the concrete benefits of FAIR and open data and software. A domain-specific chatbot complements these resources by answering practical questions on metadata, licensing, data publication, and software citation. 

We will showcase these services, share lessons learned from development, and highlight opportunities for community contributions. From our perspective as a distributed team of designers and software developers, combining harmonized metadata, user-centered services, and hands-on training is key to making FAIR and open research practices work at scale in the Earth system sciences. 

How to cite: Henzen, C., Aouadi, N., Brauer, A., Brylka, R., Degbelo, A., Grieb, J., Klammer, R., Konkol, M., Koppe, R., Liu, K., Niers, T., Nüst, D., and Wellmann, A.: From Data Rocks to FAIR Peaks: With NFDI4Earth’s services towards Harmonized Metadata and User-Centered Tools for Earth System Research , EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-9081, https://doi.org/10.5194/egusphere-egu26-9081, 2026.

X4.77
|
EGU26-9280
Dorian Ginane and Quentin Bialota

During the FAIR-EASE project, which aims to deliver integrated and FAIR-compliant services for Earth and environmental sciences, a key challenge emerged: enabling interoperable and transparent data processing across disciplines and Virtual Research Environments (VREs). Earth system science increasingly relies on complex workflows combining heterogeneous data, models, and tools that often remain confined within  technical silos and domain-specific environments, limiting cross-disciplinary reuse and collaboration.

Galaxy, a widely adopted open-source platform for FAIR data analysis, plays a central role within FAIR-EASE. It provides strong capabilities for sharing, executing, and reproducing scientific workflows. However, while it excels at sharing and executing scientific processes, Galaxy remains difficult to integrate into the broader geospatial ecosystem. Its native API is not aligned with the standards commonly used by geospatial and Earth Observation communities, creating a significant barrier to interoperability with external tools and platforms.

This limitation directly affects the "I" in FAIR (Interoperability) when connecting Galaxy-based VREs to other environments. While the geospatial community primarily relies on Open Geospatial Consortium (OGC) standards—such as Web Processing Service (WPS) and OGC API Processes—or on community-driven standards like OpenEO, Galaxy exposes its processing through a specific API that is not natively understood outside its ecosystem. As a result, Galaxy workflows, although FAIR within their own environment, remain partially isolated from standard-based geospatial infrastructures.

To address this gap, Geomatys focused during the FAIR-EASE project on enabling Galaxy workflows to be exposed through widely adopted geospatial standards used by the entire community. Our approach relies on the open-source geospatial platform Examind Community, which acts as a standards-compliant gateway between Galaxy and external clients. By mapping Galaxy workflows to OGC API Processes and WPS, users can discover, configure, and execute Galaxy workflows using familiar, widely used geospatial interfaces. This interoperability layer was subsequently extended to support the OpenEO standard, enabling users to access Galaxy workflows through an API increasingly adopted across the Earth Observation community.

To further simplify this ecosystem, we initiated the development of a lightweight bridge based on FastAPI (Python). This micro-service provides a transparent translation layer between the Galaxy API and the OGC API Processes or OpenEO APIs. Designed to be modular and easy to deploy, it offers a pragmatic solution for institutions wishing to expose their Galaxy instances to the geospatial ecosystem without the overhead of a full-scale infrastructure.
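To make the translation idea concrete, the sketch below shows a minimal FastAPI service exposing OGC API Processes style routes and delegating to a placeholder function. It is an assumption-laden outline of the bridging pattern, not the Geomatys or FAIR-EASE implementation; the placeholder stands in for real calls to the Galaxy API, and the process and job identifiers are hypothetical.

  # Minimal sketch of the bridging pattern only (hypothetical process id and job id).
  from fastapi import FastAPI
  from pydantic import BaseModel

  app = FastAPI(title="Galaxy-to-OGC-API-Processes bridge (sketch)")

  class Execute(BaseModel):
      inputs: dict = {}

  def invoke_galaxy_workflow(workflow_id: str, inputs: dict) -> str:
      """Placeholder: a real bridge would call the Galaxy REST API here."""
      return f"job-for-{workflow_id}"

  @app.get("/processes")
  def list_processes():
      # A real bridge would list Galaxy workflows and map them to process summaries.
      return {"processes": [{"id": "example-workflow", "version": "1.0"}]}

  @app.post("/processes/{process_id}/execution", status_code=201)
  def execute(process_id: str, body: Execute):
      job_id = invoke_galaxy_workflow(process_id, body.inputs)
      return {"jobID": job_id, "status": "accepted"}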

By normalizing access to Galaxy workflows through standard interfaces, FAIR-EASE demonstrates how VRE-powered access can be achieved in practice by leveraging existing standards to remove technical barriers. This work significantly broadens Galaxy's user community and enables researchers to integrate Galaxy-based processing into external tools and workflows. Our experience highlights that advancing interoperable Earth system science does not require creating new platforms, but rather building robust bridges between mature existing tools, such as Galaxy, and the ecosystem of established geospatial standards.

How to cite: Ginane, D. and Bialota, Q.: Bridging Galaxy and Geospatial Standards: Enabling Interoperable VRE Workflows for Earth System Science, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-9280, https://doi.org/10.5194/egusphere-egu26-9280, 2026.

X4.78
|
EGU26-9951
Christopher Kadow, Martin Bergemann, Mostafa Hadizadeh, Manuel Reis, and Etor Lucio Eceiza

The discoverability of and access to climate and Earth-system datasets are foundational for effective scientific analysis workflows, yet these datasets are often hosted across diverse storage systems and follow a variety of organisational conventions. Researchers and infrastructure engineers face challenges in ingesting distributed metadata into unified, searchable catalogues without sacrificing interoperability or scalability. Efficient metadata harvesting, normalisation, and ingestion at scale are therefore critical enablers for data discovery and FAIR (Findable, Accessible, Interoperable, and Reusable) data practices.

To address this need, we present the Metadata Crawler, a metadata ingestion tool designed to automate the collection and indexing of climate dataset metadata across heterogeneous storage backends. The Metadata Crawler supports multi-backend discovery, including POSIX file systems, S3/MinIO object stores, and OpenStack Swift, enabling infrastructure administrators to aggregate metadata from local archives, cloud object storage, and institutional repositories.

At its core, the Metadata Crawler implements a two-stage pipeline: harvested metadata are first collected into a temporary catalogue and then indexed into downstream systems such as Apache Solr or MongoDB. Dataset definitions, directory structures, and extraction logic are governed by a flexible TOML configuration that encodes Data Reference Syntax (DRS) dialects for different standards. Users can rely on pre-defined standards or define their own, making the tool highly adaptable. This schema-driven approach, combined with path and data specifications, conditional rules, and computed fields, ensures consistent representation of key facets such as temporal coverage, geospatial bounds, variables, and other metadata fields.

The tool provides both a command-line interface (CLI) and a Python API, supporting synchronous and asynchronous execution as well as multi-threaded crawling, facilitating integration into operational workflows. By normalising and indexing previously siloed metadata into searchable catalogues, the Metadata Crawler enhances data findability and empowers portals and analysis platforms to deliver efficient discovery services. Its modular design also allows deployment in diverse environments and easy extension to additional backends or indexing targets.
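
To make the two-stage idea concrete, the following is a hypothetical sketch rather than the Metadata Crawler's actual API or configuration: facets are extracted from a DRS-style directory layout into a temporary catalogue and then pushed to an Apache Solr index; the path pattern, field names, and Solr URL are assumptions for illustration.

import re
from pathlib import Path

import pysolr

# Assumed DRS dialect: /archive/<project>/<model>/<experiment>/<variable>/<file>.nc
DRS_PATTERN = re.compile(
    r"/archive/(?P<project>[^/]+)/(?P<model>[^/]+)/"
    r"(?P<experiment>[^/]+)/(?P<variable>[^/]+)/[^/]+\.nc$"
)

def harvest(root: str) -> list[dict]:
    # Stage 1: collect facet records into a temporary in-memory catalogue.
    catalogue = []
    for path in Path(root).rglob("*.nc"):
        match = DRS_PATTERN.search(path.as_posix())
        if match:
            catalogue.append({"id": path.as_posix(), **match.groupdict()})
    return catalogue

def index(catalogue: list[dict], solr_url: str) -> None:
    # Stage 2: push the temporary catalogue into a downstream Solr index.
    pysolr.Solr(solr_url, always_commit=True).add(catalogue)

if __name__ == "__main__":
    index(harvest("/archive"), "http://localhost:8983/solr/climate")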

How to cite: Kadow, C., Bergemann, M., Hadizadeh, M., Reis, M., and Lucio Eceiza, E.: Accelerating Data Discovery: Automated, Scalable Harvesting and Indexing of Metadata Across Heterogeneous Storage Backends, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-9951, https://doi.org/10.5194/egusphere-egu26-9951, 2026.

X4.79
|
EGU26-12439
Ivonne Anders, Klaus Getzlaff, Sören Lorenz, and Hela Mehrtens

Interdisciplinary research in the Earth system sciences increasingly raises complex questions related to research data management (RDM) that go beyond the scope of local, discipline-specific support services. While institutional RDM support remains the first point of contact for many researchers, cross-cutting issues often require coordinated expertise across institutions and domains.

Within NFDI4Earth, a User Support Network (USN) is being established to address these challenges. At its core, the USN currently consists of a team of support experts from ten partner institutions, all part of the NFDI4Earth consortium. User requests are handled via a central ticket system, allowing coordinated responses and transparent workflows. For more specialised or domain-specific questions, the core team draws on an emerging expert network within NFDI4Earth.

Rather than aiming for a fully distributed support structure from the outset, the USN follows an incremental approach. Existing expertise is consolidated in a central entry point, while the expert network is gradually expanded. A key objective is to strengthen links to local RDM services at partner institutions and to establish closer connections with user support and helpdesk initiatives across other NFDI consortia. In this way, the USN aims to evolve into a well-connected component of a broader NFDI-wide support landscape.

In this contribution, we present the current structure, workflows and first experiences of the NFDI4Earth User Support Network. We also invite researchers to make use of the service for their RDM-related questions and encourage RDM support teams and NFDI initiatives to engage with us, share expertise and help shape a connected, sustainable support landscape for Earth system sciences.

How to cite: Anders, I., Getzlaff, K., Lorenz, S., and Mehrtens, H.: From questions to expertise: the cross-disciplinary user support network in NFDI4Earth, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-12439, https://doi.org/10.5194/egusphere-egu26-12439, 2026.

X4.80
|
EGU26-12693
Bethan Perkins, Brian Matthews, Tom Kirkham, and Sarah Byrne

The DAFNI Platform (Data and Analytics Facility for National Infrastructure) is a Virtual Research Environment which stores data and software models, provides an execution environment for those models, and supports data visualisation. Formally launched in July 2021, the DAFNI Platform was built to support national infrastructure research. 

Infrastructure systems supplying water and energy, transport networks, communication networks, and waste management provide the backbone of modern societies and play a key role in the development of nations and communities. The infrastructure research community is diverse, and collaboration between domains is complex, with different development teams using different data and programming standards. Further, differences in data formats, in spatial and temporal resolution, and in data semantics make it difficult to work together and to combine models for integrated impact assessments.

Infrastructure systems cannot, however, be considered in isolation. The interactions between them need to be considered to determine their most effective design and operation. Furthermore, the need for resilience and adaptation to climate change must be examined by all domains. 

To support these heterogeneous research communities working together, the DAFNI Platform was built with flexibility at its core. Containerisation, object stores and high-level metadata vocabularies are some of the key technical aspects of this flexibility, along with a domain-agnostic user interface. When uploading data to the platform, users may upload any file format, which is then stored without transformation in an object store. Software models are containerised and uploaded to the DAFNI Platform as Docker images, where they can then be executed on DAFNI as a workflow. Using containerisation, it is possible to combine models in sequence irrespective of the language in which they were written or the OS on which they were created, allowing researchers to continue using their established practices.
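
As an illustration of this pattern (not DAFNI's actual workflow engine), two containerised models written in different languages could be chained through a shared volume using, for example, the Docker SDK for Python; the image names and paths below are placeholders.

import docker

client = docker.from_env()
# Placeholder host directory shared between the two model containers
shared = {"/data/workflow-run-42": {"bind": "/data", "mode": "rw"}}

# Step 1: a rainfall model (e.g. written in R) writes /data/rainfall.nc
client.containers.run("registry.example.org/rainfall-model:1.0",
                      volumes=shared, remove=True)

# Step 2: a flood model (e.g. written in Python) consumes the rainfall output
client.containers.run("registry.example.org/flood-model:2.3",
                      command=["--input", "/data/rainfall.nc", "--output", "/data/flood.nc"],
                      volumes=shared, remove=True)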

The decision to create domain-agnostic services has, however, necessitated certain trade-offs that may not apply to more specialised platforms. For example, while a high-level metadata schema can be applied to rail timetables as readily as to flood extent data, it does not support the accepted standards or ontologies of either rail or flooding research. Object stores, Docker containers, and neutral user interfaces also come with their own challenges.

Despite these challenges, the DAFNI Platform offers a unique capability and has successfully supported many projects using complex model and data interactions. One prominent example is the OpenCLIM project, which used the DAFNI Platform to develop workflows linking different human and infrastructure systems to environmental data - such as linking urban development with climate-driven rainfall changes and flooding - and continues to showcase this research on DAFNI beyond the project lifetime.

This presentation will showcase the DAFNI Platform’s functionality, explain key design decisions, and illustrate its impact through examples of research enabled by the platform. We will also reflect on lessons learned in building a VRE for a multidisciplinary domain and discuss implications for future infrastructure research environments. 

How to cite: Perkins, B., Matthews, B., Kirkham, T., and Byrne, S.: DAFNI: Building a VRE for National Infrastructure, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-12693, https://doi.org/10.5194/egusphere-egu26-12693, 2026.

X4.81
|
EGU26-17347
David Schäfer, Nils Brinckmann, Florian Gransee, Tobias Kuhnert, Ralf Kunkel, Christof Lorenz, Peter Lünenschloß, Bert Palm, Thomas Schnicke, and Jan Bumberger

Research Data Infrastructures (RDIs) in Earth System Science must balance FAIR-compliant data management, operational requirements, and the practical needs of researchers operating heterogeneous sensor networks at scale. These design goals are not always fully aligned and may even conflict in operational environments. At the EGU General Assembly 2025, we introduced a modular digital ecosystem for time series data management designed to address these challenges. One year later, we report on the transition from prototype deployment to sustained operational use and reflect on how user feedback and operational constraints shaped the system’s evolution.

The ecosystem has since been deployed as a production infrastructure at the Helmholtz Centre for Environmental Research - UFZ, where it currently supports approximately 20 research projects. The system manages around three billion observations from diverse sensor networks, with temporal resolutions as fine as 5 seconds. This operational setting exposed challenges that were not fully apparent at the design and implementation stages, particularly regarding the scalability of data integration workflows, robustness under continuous load, and the interaction between metadata management, data ingestion, and automated quality control.

The ecosystem comprises three modular components: the Sensor Management System – SMS [1] for standardized metadata registration, the time.IO [2] platform for storage, transfer, and visualization of time series data, and the System for Automated Quality Control – SaQC [3] for automated data analysis and quality assurance. While the modular design enabled reuse and interoperability, early operational phases revealed scaling bottlenecks that led to service outages, necessitating substantial refinements of ingestion pipelines, deployment strategies, and monitoring mechanisms.
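
As a brief illustration of the automated quality control provided by SaQC, the following minimal sketch follows the basic usage pattern from the SaQC documentation; the variable name, thresholds, and synthetic data are illustrative only, and method names may differ between SaQC versions.

import numpy as np
import pandas as pd
from saqc import SaQC

# Synthetic 5-second series with one implausible spike (illustrative data only)
index = pd.date_range("2026-01-01", periods=1000, freq="5s")
data = pd.DataFrame({"soil_moisture": np.random.uniform(20, 40, size=1000)}, index=index)
data.iloc[500, 0] = 999.0  # out-of-range value that should be flagged

qc = SaQC(data=data)
qc = qc.flagRange("soil_moisture", min=0, max=100)  # simple plausibility test
print(qc.flags["soil_moisture"].value_counts())     # inspect the resulting flags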

User-centric development also played a central role in stabilizing and extending the infrastructure. Continuous feedback from active projects influenced interface design, automation levels, and operational workflows, highlighting the importance of iterative co-design in bridging the gap between conceptual design goals and sustainable, user-accepted operation. We summarize key lessons learned from one year of operational use and discuss implications for building and operating sustainable, interoperable RDIs that effectively support Earth system science across disciplines and scales.

[1] Lorenz, C., Brinckmann, N., Bumberger, J., Hanisch, M., Kuhnert, T., Loup, U., Moorthy, R., Obersteiner, F., Schäfer, D., Schnicke, T. (2025). Sensor Management System (SMS): Open-source software for FAIR sensor metadata management in Earth system sciences. SoftwareX (submitted), https://arxiv.org/abs/2512.17280

[2] Bumberger, J., Abbrent, M., Brinckmann N., Hemmen, J., Kunkel, R., Lorenz, C., Lünenschloß, P., Palm, B., Schnicke, T., Schulz, C., van der Schaaf, H., and Schäfer, D. (2025). Digital Ecosystem for FAIR Time Series Data Management in Environmental System Science. SoftwareX, 102038, https://doi.org/10.1016/j.softx.2025.102038

[3] Schmidt, L., Schäfer, D., Geller, J., Lünenschloss, P., Palm, B., Rinke, K., Rebmann, C., Rode, M., & Bumberger, J. (2023). System for automated Quality Control (SaQC) to enable traceable and reproducible data streams in environmental science. Environmental Modelling & Software, 105809. https://doi.org/10.1016/j.envsoft.2023.105809

How to cite: Schäfer, D., Brinckmann, N., Gransee, F., Kuhnert, T., Kunkel, R., Lorenz, C., Lünenschloß, P., Palm, B., Schnicke, T., and Bumberger, J.: From Architecture to Operation: time.IO a User-Centric Digital Ecosystem for Time Series Data Management in Earth System Science, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-17347, https://doi.org/10.5194/egusphere-egu26-17347, 2026.

X4.82
|
EGU26-21771
Valeriu Predoi, Bouwe Andela, and Birgit Hassler

ESMValTool is a software tool for analyzing data produced by Earth System Models (ESMs) in a reliable and reproducible way. It provides a large and diverse collection of “recipes” that reproduce standard as well as state-of-the-art analyses. ESMValTool can be used for tasks ranging from monitoring continuously running ESM simulations to analyses for scientific publications such as the IPCC reports, including reproducing results from previously published scientific articles and producing new analysis results.

To make ESMValTool a user-friendly community tool suitable for doing open science, it adheres to the FAIR principles for research software. It is:

- Findable: it is published in community registries, such as https://research-software-directory.org/software/esmvaltool.
- Accessible: it can be installed from Python package community distribution channels such as conda-forge, and the open-source code is available on Zenodo with a DOI and on GitHub.
- Interoperable: it is based on standards. It works with data that follow the CF Conventions and the Coupled Model Intercomparison Project (CMIP) Data Request, its reusable recipes are written in YAML, and provenance is recorded in the W3C PROV format. It supports diagnostics written in a number of programming languages, with Python and R being best supported, and its source code follows the standards and best practices of the respective languages.
- Reusable: it provides a well-documented recipe format and a Python API that allow reusing previous analyses and building new analyses from previously developed components. The software can be installed from conda-forge and DockerHub, and can be tailored by installing from source from GitHub.

In terms of input data, ESMValTool integrates well with the Earth System Grid Federation (ESGF) infrastructure: it can find, download, and access data from across the federation and has access to large pools of observational datasets.

ESMValTool is built around two key scientific software qualities: scalability and user friendliness, an important aspect of the latter being reliability. For scalability, ESMValTool is built on top of the Dask library to allow scalable and distributed computing, and it also uses parallelism at a higher level in the stack so that jobs can be distributed on any standard High Performance Computing (HPC) facility. For reliability and reproducibility, our main strategy is a modular, integrated, and tested design, which is reflected at various levels of the tool. We separate commonly used functionality from “one-off” code and make sure that commonly used functionality is covered by unit and integration tests, while relying on regression testing for everything else. We also run comprehensive end-to-end tests of all our recipes before releasing new versions. Our testing infrastructure ranges from basic unit tests to tools that handle various file formats and use image-comparison algorithms to compare figures. This greatly reduces the need for ‘human testing’, provides built-in robustness through modularity, and gives us a testing strategy tailored to the technical skills of our contributors.
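
As a brief illustration of running a recipe programmatically, the snippet below is a hedged sketch based on the ESMValCore experimental Python interface (the `esmvaltool run <recipe>` command line is the more common entry point); the recipe name and output directory are examples only and the API may differ between versions.

from esmvalcore.experimental import CFG, get_recipe

CFG["output_dir"] = "~/esmvaltool_output"          # where results and provenance are written
recipe = get_recipe("examples/recipe_python.yml")  # an example recipe shipped with ESMValTool
results = recipe.run()                             # runs preprocessing and diagnostics
print(results)                                     # lists the produced datasets and figures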

How to cite: Predoi, V., Andela, B., and Hassler, B.: Reliable and reproducible Earth System Model data analysis with ESMValTool, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-21771, https://doi.org/10.5194/egusphere-egu26-21771, 2026.

X4.84
|
EGU26-22238
Suze Kundu, Anna Kelbert, Jennifer Lynn Bartlett, and Alberto Accomazzi

Earth and space scientists increasingly work across disciplinary, institutional, and national boundaries, drawing on diverse data sources, tools, and communities. While research data infrastructures (RDIs) have made significant progress in enabling access to data, researchers still face fragmentation across platforms, uneven user experiences, and barriers to interdisciplinary discovery and collaboration.

SciX is NASA’s evolving research discovery and collaboration platform, building on the long-established success of the Astrophysics Data System (ADS) to serve the full breadth of NASA’s Science Mission Directorate. In this contribution, we describe SciX as a user-centred research infrastructure that connects people, research outputs, funding opportunities, and Open Science resources across Earth, planetary, heliophysics, and astrophysics communities. Rather than replacing domain-specific data centres, SciX aims to complement existing infrastructures by improving discoverability, interoperability at the metadata and knowledge level, and cross-disciplinary navigation of the research landscape.

We reflect on the opportunities and challenges of scaling an infrastructure with a strong disciplinary identity (ADS) into a broader, transdisciplinary platform. This includes balancing community-specific needs with shared services, supporting FAIR and Open Science practices in ways that are meaningful to researchers, and fostering cultural change through community engagement rather than top-down mandates. Drawing on early use cases and community feedback, we discuss how SciX is addressing user needs, sustainability, and governance while enabling new forms of interdisciplinary connection.

We conclude by outlining lessons learned for the design of sustainable RDIs and invite dialogue with the Earth System Science community on how infrastructures like SciX can better support collaborative, open, and societally relevant research across domains.

How to cite: Kundu, S., Kelbert, A., Bartlett, J. L., and Accomazzi, A.: SciX: Scaling Research Discovery and Collaboration Across Earth and Space Science Infrastructures, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-22238, https://doi.org/10.5194/egusphere-egu26-22238, 2026.

Posters virtual: Mon, 4 May, 14:00–18:00 | vPoster spot 1b

The posters scheduled for virtual presentation are given in a hybrid format for on-site presentation, followed by virtual discussions on Zoom. Attendees are asked to meet the authors during the scheduled presentation & discussion time for live video chats; onsite attendees are invited to visit the virtual poster sessions at the vPoster spots (equal to PICO spots). If authors uploaded their presentation files, these files are also linked from the abstracts below. The button to access the Zoom meeting appears just before the time block starts.
Discussion time: Mon, 4 May, 16:15–18:00
Display time: Mon, 4 May, 14:00–18:00
Chairperson: Filippo Accomando

EGU26-13270 | Posters virtual | VPS21

Integrating Participatory Perception-Mapping Data and Stochastic Image Analysis for Urban Landscape Assessment 

Stavroula Kopelia, Nikos Tepetidis, Julia Nerantzia Tzortzi, G.-Fivos Sargentis, and Romanos Ioannidis
Mon, 04 May, 14:06–14:09 (CEST)   vPoster spot 1b

Modern digital technologies and geoinformatics have experienced rapid growth, offering powerful tools to bridge the gap between scientific communities and society in landscape assessment and mapping. This research details the application of a crowdsourcing scheme that utilizes a dedicated mobile application to facilitate direct public participation in quantifying perceptions of urban landscapes and architecture. Initially developed as an educational tool, the methodology has been tested by university students across Italy, Greece, and France, providing a foundational phase for assessing landscape quality and urban typologies. Building upon these educational pilot studies, the work explores the evolution of this methodology into a broader, multicultural citizen science initiative designed to improve the quality and quantity of available landscape perception data.

A significant technical advancement in this research involves the integration of automated image analysis to process the novel data generated by participants from any location. The photographic material was examined using stochastic image analysis based on climacograms, in which images are treated as two-dimensional grayscale intensity fields and analyzed across multiple spatial scales. The method enables the comparison of image patterns based on the visual complexity of the uploaded photographs. A primary challenge addressed was the algorithm's performance when processing real-world, non-curated smartphone images. The analysis began with an assessment of how the methodology handles environmental noise, such as sky, trees, and unconventional capture angles, which are inherent to bottom-up crowdsourcing schemes.
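
For illustration, a two-dimensional climacogram of a grayscale image can be approximated by block-averaging the intensity field at increasing spatial scales and recording the variance of the block means; the following is an assumed minimal implementation, not the authors' code.

import numpy as np

def climacogram(image: np.ndarray, scales=(1, 2, 4, 8, 16, 32)) -> dict[int, float]:
    # Return the variance of block-averaged intensities for each block size.
    out = {}
    for k in scales:
        h, w = (image.shape[0] // k) * k, (image.shape[1] // k) * k
        cropped = image[:h, :w]
        # Average over non-overlapping k x k blocks.
        blocks = cropped.reshape(h // k, k, w // k, k).mean(axis=(1, 3))
        out[k] = float(blocks.var())
    return out

# Example with a synthetic grayscale "photograph"
rng = np.random.default_rng(0)
img = rng.random((256, 256))
print(climacogram(img))  # variance decays with scale; the decay rate reflects visual complexity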

The early results indicate that the method can reveal group-level tendencies associated with differing architectural characteristics, particularly in relation to visual complexity, while not supporting reliable classification at the level of individual images. In detail, the findings indicate a trend towards two categories: modernist-type movements, characterized by minimal elements, and eclectic or decorative movements, which exhibited higher measured complexity; however, this behaviour was not observed universally across all analyzed movements. The stochastic analysis also indicated theoretical overlaps between certain movements, such as Postmodernism and Eclecticism, based on shared decorative patterns. While the results highlight that environmental factors can influence the analysis of individual photographs, the method presents potential for distinguishing movement-level trends with logical consistency even from unfiltered data.

Scientifically, this yield of quantitative data lays the groundwork for improved research in the humanities and culture, showing a strong correlation with established landscape quality indices. Socially, the project provides a scalable model for participatory mapping that fosters critical thinking about urban quality, creating new conditions for communication between universities and the broader public. Overall, the presented work reports on the early-stage results of this methodological exploration and aims to evaluate the combined use of participatory mobile data collection and exploratory image-based analysis for landscape and architectural studies, while identifying key challenges related to data quality, interpretation, and future methodological refinement.

How to cite: Kopelia, S., Tepetidis, N., Tzortzi, J. N., Sargentis, G.-F., and Ioannidis, R.: Integrating Participatory Perception-Mapping Data and Stochastic Image Analysis for Urban Landscape Assessment, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-13270, https://doi.org/10.5194/egusphere-egu26-13270, 2026.

Posters virtual: Wed, 6 May, 14:00–18:00 | vPoster spot 1b

The posters scheduled for virtual presentation are given in a hybrid format for on-site presentation, followed by virtual discussions on Zoom. Attendees are asked to meet the authors during the scheduled presentation & discussion time for live video chats; onsite attendees are invited to visit the virtual poster sessions at the vPoster spots (equal to PICO spots). If authors uploaded their presentation files, these files are also linked from the abstracts below. The button to access the Zoom meeting appears just before the time block starts.
Discussion time: Wed, 6 May, 16:15–18:00
Display time: Wed, 6 May, 14:00–18:00
Chairperson: Andrea Barone

EGU26-22048 | Posters virtual | VPS22

Virtual Research Environment initiatives as part of ODATIS, the French Ocean data cluster 

Cyril Germineaud, Gwenael Caer, and Jean-François Piollé
Wed, 06 May, 14:27–14:30 (CEST)   vPoster spot 1b

As part of the French Ocean data cluster ODATIS (from the Data Terra Research Infrastructure), we will showcase the Virtual Research Environment (VRE) tools and services offered by CNES and Ifremer. In particular, we will present the CNES JupyterHub platform for hosting projects (high computing power with CPU and GPU capacities, very fast and optimized remote access to data products, etc.) together with altimetry-specific Pangeo-based libraries, powerful tools, dedicated tutorials illustrating simple use cases (intercomparison of different satellite data, cyclone monitoring, coastal water quality applications, etc.), and technical support (helpdesk) for smooth sailing on the platform. In addition, the synergy between satellite and in-situ data will also be illustrated for several applications, such as surface currents and comparisons between (BGC-)Argo profiling float observations and satellite matchups.

How to cite: Germineaud, C., Caer, G., and Piollé, J.-F.: Virtual Research Environment initiatives as part of ODATIS, the French Ocean data cluster, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-22048, https://doi.org/10.5194/egusphere-egu26-22048, 2026.
