SC2.11 | Supporting the understanding and reuse of reproducible analysis workflows
Co-organized by ESSI6/HS11/NP9
Convener: Markus Konkol (ECS) | Co-conveners: Sadra Matmir, Merret Buurman

The EU-funded project AquaINFRA (https://aquainfra.eu/) aims to help marine and freshwater researchers restore healthy oceans, seas, coastal and inland waters. To achieve this goal, a large part of the work is dedicated to designing and implementing a research data infrastructure composed of the AquaINFRA Interaction Platform (AIP) and the Virtual Research Environment (VRE). This effort is part of the ongoing development of the European Open Science Cloud (EOSC) as an overarching research infrastructure, the EU flagship initiative to enable Open Science practices in Europe.

The AIP is the central gateway for scientific communities to find, access, and reuse aquatic digital resources such as FAIR multi-disciplinary data and analysis workflows. The basis for this is the Data Discovery and Access service, which sends live queries to several data providers from the aquatic realm, for instance Copernicus Marine and HELCOM. The data found can then be used in the VRE, which consists of a web API service hosting a number of OGC API Processes, a virtual lab based on the tool MyBinder, and the Galaxy platform as a workflow management system.
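For readers unfamiliar with OGC API Processes, the following minimal Python sketch shows how an execute request for such a web API service might be assembled. The base URL, process identifier, and input name are hypothetical placeholders, not actual AquaINFRA endpoints. Only the `/processes/{processId}/execution` path and the top-level `inputs` object follow the OGC API Processes Part 1: Core standard. The sketch builds the request without sending it.

```python
import json
from urllib.parse import quote


def build_execute_request(base_url, process_id, inputs):
    """Return (url, body) for an OGC API Processes execute call.

    The request is only assembled here, not sent; it would be
    submitted as an HTTP POST with Content-Type application/json.
    """
    # Execute endpoint defined by OGC API Processes Part 1: Core
    url = f"{base_url.rstrip('/')}/processes/{quote(process_id, safe='')}/execution"
    # Process inputs are passed under the top-level "inputs" key
    body = json.dumps({"inputs": inputs})
    return url, body


# Hypothetical server, process name, and input (assumptions for illustration)
url, body = build_execute_request(
    "https://example.org/ogcapi/",
    "example-process",
    {"input_url": "https://example.org/data.csv"},
)
print(url)
print(body)
```

Which inputs a process expects is described by the server under `/processes/{processId}`, so the input names above would need to be replaced with those of a real process.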

In this short course, we will start by providing an overview of the research data infrastructure. Then, we will show how the AIP and VRE can help to find data and use it in the Galaxy platform to create reproducible and readily shareable analysis workflows. For this purpose, we will use a hydrological demonstrator in the form of a Data-to-Knowledge Package (D2K-Package) [1]. A D2K-Package is a collection of links to digital research assets, including data, containerized code enriched by the computational environment, virtual labs, OGC API Processes, and computational workflows.
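To make the idea of a D2K-Package as a collection of links more concrete, here is a minimal Python sketch that models one as a simple record. The class name, field names, and example URL are assumptions chosen for illustration, mirroring the asset types listed above. This is not the actual D2K-Package schema described in [1].

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class D2KPackage:
    """Illustrative record: one list of links per asset type."""
    title: str
    data: List[str] = field(default_factory=list)
    containerized_code: List[str] = field(default_factory=list)
    virtual_labs: List[str] = field(default_factory=list)
    ogc_api_processes: List[str] = field(default_factory=list)
    workflows: List[str] = field(default_factory=list)

    def all_links(self) -> Dict[str, List[str]]:
        """Return only the asset types that have at least one link."""
        return {
            name: links
            for name, links in vars(self).items()
            if name != "title" and links
        }


# Hypothetical package with placeholder links
pkg = D2KPackage(
    title="Hydrological demonstrator",
    data=["https://example.org/dataset"],
    workflows=["https://example.org/workflow"],
)
print(pkg.all_links())
```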

Although we will use a hydrological demonstrator, the course is not limited to hydrologists but is open to everyone interested in making computational research more reusable. To follow this course, attendees will need to register on Galaxy (https://usegalaxy.eu/login/start). We kindly ask attendees to do so in advance to avoid delays. No prior knowledge of hydrology or Galaxy is required. Some understanding of scripting languages (e.g., R) can be helpful, but the basic concepts do not depend on a particular technology.

[1] Konkol, M. et al. (2025). Encouraging reusability of computational research through Data-to-Knowledge Packages - A hydrological use case. https://doi.org/10.12688/openreseurope.20221.2
