ESSI2.2 | High-performance computation with big data in the geosciences
EDI
Co-organized by HS13
Convener: Kor de Jong (ECS) | Co-conveners: Juniper Tyree (ECS), Clément Bouvier (ECS), Daniel Caviedes-Voullième, Arnau Folch, Corentin Carton de Wiart
Orals | Fri, 08 May, 16:15–18:00 (CEST) | Room -2.33
Posters on site | Attendance Mon, 04 May, 14:00–15:45 (CEST) | Display Mon, 04 May, 14:00–18:00 | Hall X4
Posters virtual | Wed, 06 May, 14:09–15:45 (CEST) | vPoster Discussion: vPoster spot 1b, Wed, 06 May, 16:15–18:00 (CEST)
Spatio-temporal Earth System Science (ESS) datasets are constantly growing in size, particularly those generated by high-resolution numerical models, due to increases in both extent and resolution. As a result, existing software for reading, storing, writing, and translating these datasets may no longer complete its work in a timely manner, while future investment in hardware is likely to remain constrained. This limits the potential of, for example, numerical simulation models and machine learning models. Yet these models and the larger datasets they produce are essential for advancing ESS, supporting critical activities such as climate change policymaking and weather forecasting in the face of increasingly frequent natural disasters.

In this session we bring together researchers working on novel software for processing and compressing large spatio-temporal datasets. By presenting their work to their colleagues, we aim to further strengthen the field of high-performance computation with big data in the geosciences.

We invite everyone who recognizes this problem and is working on ways to solve it to participate in this session. Possible topics include, but are not limited to:

- High-performance computing, parallel computing, distributed computing, cloud computing, asynchronous computing, accelerated computing, green computing
- Algorithms, libraries, frameworks
- Parallel I/O, data models, data formats, data cubes, HDF5, netCDF, Zarr, COG
- Data compression, including methods that provide guarantees for lossy compression
- Containerization, Docker, Kubernetes, Singularity, Apptainer
- Physically based modelling, physics informed machine learning, surrogate modelling
- Model coupling, model workflow management
- Large scale hydrology, remote sensing, climate modelling
- Lessons learned from case-studies

We recommend that authors highlight those (generic) aspects of their work that may be of particular interest to their colleagues.

Solicited authors:
Langwen Huang

To learn more about data compression and try out different compressors in practice, please also join the SC2.5 short course.

Orals: Fri, 8 May, 16:15–18:00 | Room -2.33

The oral presentations are given in a hybrid format supported by a Zoom meeting featuring on-site and virtual presentations. The button to access the Zoom meeting appears just before the time block starts.
Chairpersons: Daniel Caviedes-Voullième, Juniper Tyree
16:15–16:20
Part I: High Performance Computation
16:20–16:30
|
EGU26-19114
|
On-site presentation
Roc Salvador Andreazini, Xavier Yepes Arbós, Oriol Tintó Prims, Stella Paronuzzi Ticco, and Mario Acosta Cobos

The continuous increase in spatial and temporal resolution of Earth System Models (ESMs) is essential to better represent physical processes and extreme events. However, these advances come at a rapidly growing computational cost, pushing simulations towards unprecedented levels of parallelism on modern High Performance Computing (HPC) architectures. As a result, inefficiencies in load balance, communication, I/O, and memory usage increasingly limit scalability and scientific throughput.

Identifying and addressing parallel performance bottlenecks in large, multi-component climate models remains a complex and time-consuming task, often requiring specialized HPC expertise and manual profiling workflows. This represents a significant barrier for model developers aiming to efficiently exploit current and future exascale systems.

We present the Automatic Performance Profiling (APP) framework, an automated and extensible workflow designed to provide performance analysis of high-resolution ESMs. APP runs end-to-end profiling experiments and generates a comprehensive, multi-level performance report that combines high-level metrics (e.g., simulated years per day (SYPD) and scalability curves) with detailed insights into MPI communication patterns, cache behavior, and function profiling. This approach enables systematic identification of bottlenecks arising from extreme concurrency and fine spatial/temporal resolution demands.
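SYPD, one of the high-level metrics mentioned above, is simply simulated time divided by wallclock time, expressed in simulated years per wallclock day. A minimal sketch of the arithmetic (not part of the APP framework; the function name is ours):

```python
def sypd(simulated_years: float, wallclock_seconds: float) -> float:
    """Simulated years per wallclock day, a standard ESM throughput metric."""
    return simulated_years * 86400.0 / wallclock_seconds

# e.g. one simulated year completed in 4 hours of wallclock time
print(sypd(1.0, 4 * 3600.0))  # -> 6.0
```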

Integrated with the Autosubmit workflow manager, APP facilitates reproducible performance studies and comparisons across platforms, model configurations, and resolutions. Its modular design supports multiple climate models (NEMO and ECE4) and HPC systems (BSC’s MN5 and ECMWF’s HPC2020) and allows straightforward extension to new HPC platforms and models.

By lowering the barrier to parallel performance analysis, APP empowers the climate modelling community to improve scalability and resource efficiency, supporting the sustainable development of next-generation high-resolution ESMs.

How to cite: Salvador Andreazini, R., Yepes Arbós, X., Tintó Prims, O., Paronuzzi Ticco, S., and Acosta Cobos, M.: Enhancing Earth system models efficiency: Leveraging the Automatic Performance Profiling framework, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-19114, https://doi.org/10.5194/egusphere-egu26-19114, 2026.

16:30–16:40
|
EGU26-16512
|
On-site presentation
Edwin Sutanudjaja, Saeb Faraji Gargari, and Oliver Schmitz

For environmental scientists such as hydrologists or ecologists, the performance of a model mostly refers to how well a simulation run mimics the modelled phenomenon, often evaluated by a broad range of measures comparing the simulated output to observed data. Increasing model performance is then an ongoing process of incorporating new environmental processes or refining the implementation of existing ones, possibly combined with using improved datasets at higher spatial and temporal resolutions. This, however, increases the computational burden of the simulations. Improving the computational performance of a model so that it runs efficiently on anything from stand-alone computers to HPC systems is typically not in the scope of an environmental scientist, yet a reduced runtime would benefit the entire modelling cycle.


The LUE (https://zenodo.org/records/16792016) environmental modelling framework is a software package for building HPC-ready simulation models. The Python bindings provide domain scientists with a large set of spatial operations for model building. All LUE operations are implemented in C++ using HPX (https://doi.org/10.5281/zenodo.598202), a library and runtime environment providing optimal asynchronous execution of interdependent tasks on both shared-memory and distributed computing systems. Models constructed with LUE can therefore run on HPC systems without further modification of the Python code and without explicit knowledge of programming HPC systems. In addition, the lue.pcraster Python sub-package provides an almost effortless transformation of existing PCRaster Python based models to LUE. In our presentation we showcase PCR-GLOBWB (https://doi.org/10.5194/gmd-11-2429-2018), a model simulating hydrology and water resources at a global scale, as an example of transforming an existing large scientific code base to LUE. We also demonstrate how efficiently the model now uses hardware ranging from one to thousands of CPUs, and is therefore prepared for global modelling studies at resolutions finer than 1 km.

How to cite: Sutanudjaja, E., Faraji Gargari, S., and Schmitz, O.: Good model performance?, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-16512, https://doi.org/10.5194/egusphere-egu26-16512, 2026.

16:40–16:50
|
EGU26-20436
|
On-site presentation
Jenny Wong, Vojtech Tuma, Harrison Cook, Corentin Carton de Wiart, Olivier Iffrig, James Hawkes, and Tiago Quintino

In-memory HPC workflows promise significant performance gains by reducing I/O, but achieving these gains requires precise scheduling of data-dependent task graphs on heterogeneous computing platforms. While existing Python frameworks such as Dask provide abstractions for parallel execution, they are not designed to fully exploit advanced topology-aware scheduling, natively support tightly coupled CPU-GPU task graphs in complex HPC environments, or utilise captured profiling information during scheduling. 

Earthkit-workflows is a Python library with a declarative API for constructing task graphs, and the capability to schedule and execute them on local or remote resources. It targets heterogeneous environments, enabling task-based parallelism across CPUs, GPUs, and distributed HPC or cloud systems. Expensive I/O operations and intermediate storage are minimised via shared memory and high-speed interconnects, allowing intermediate results to be exchanged efficiently during task-graph execution. Streaming outputs from tasks, such as stepwise forecasting, are given first-class support, allowing downstream tasks to start without delay. The library also offers an extensible graph-building interface with a plugin mechanism, allowing users to define custom operations, and it interoperates seamlessly with the wider earthkit ecosystem.
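To make the declarative task-graph idea concrete, here is a generic standard-library sketch (this is not the earthkit-workflows API; all names and the graph representation are hypothetical): tasks declare their dependencies, and each task runs as soon as its upstream results are available, with results shared in memory rather than via files.

```python
from concurrent.futures import ThreadPoolExecutor

# A task graph: node name -> (function, list of upstream node names).
graph = {
    "load_a": (lambda: 2, []),
    "load_b": (lambda: 3, []),
    "combine": (lambda a, b: a + b, ["load_a", "load_b"]),
    "scale": (lambda c: 10 * c, ["combine"]),
}

def execute(graph):
    """Run each task once its dependencies resolve, passing results in memory."""
    futures = {}
    with ThreadPoolExecutor() as pool:
        def submit(name):
            if name in futures:
                return futures[name]
            fn, deps = graph[name]
            dep_futs = [submit(d) for d in deps]   # ensure upstream tasks exist
            # The task waits for its inputs, then runs in the pool.
            futures[name] = pool.submit(lambda: fn(*[f.result() for f in dep_futs]))
            return futures[name]
        for name in graph:
            submit(name)
        return {name: fut.result() for name, fut in futures.items()}

print(execute(graph)["scale"])  # -> 50
```

A real scheduler would add topology awareness, GPU placement, and profiling-informed decisions, which this sketch deliberately omits.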

The task-graph construction and execution capabilities of earthkit-workflows are being applied in ECMWF’s next generation of data processing frameworks. Individual data processing functions are published as modular and reusable graphs, enriched with profiling measurements, and then combined together to form operational workflows. Two operational workflows which happen to have a subgraph in common, for example two subgraphs retrieving the same data as input, can be automatically merged for efficient resource utilisation. For operational robustness, checkpointing capability is also provided. 

Earthkit-workflows additionally serves as the core of Forecast-in-a-Box, ECMWF’s offering that combines data-driven weather forecasting models with meteorological product generation, in a manner portable to a personal workstation, a high-powered local device, or cloud computing, and aimed at non-technical users. GPU support is particularly critical, enabling efficient inference for data-driven weather forecasting models beyond HPC environments.

How to cite: Wong, J., Tuma, V., Cook, H., Carton de Wiart, C., Iffrig, O., Hawkes, J., and Quintino, T.: Minimising I/O, maximising throughput: earthkit-workflows, a task-graph engine for heterogeneous systems , EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-20436, https://doi.org/10.5194/egusphere-egu26-20436, 2026.

16:50–17:00
|
EGU26-9822
|
ECS
|
On-site presentation
Siddhant Tibrewal and Nils-Arne Dreier

Kilometer-scale Earth System Model (ESM) simulations increasingly generate petabyte-scale datasets. The scientific return from such datasets remains constrained by their accessibility and heterogeneity, as well as the cost of their downstream analysis. Analysts often rely on ad-hoc workflows, and even analyses on reduced datasets require repeated access to high-resolution data, limiting scalability.

We present Hiopy (Hierarchical Output in Python), a tool for generating cloud-accessible, analysis-ready datasets directly from a km-scale ESM simulation using the ICON model, by computing hierarchical temporal and spatial aggregations in situ. Building on the work of Kölling et al. (2024, EGU), Hiopy produces multi-resolution, self-describing datasets that enable seamless access from coarse to native resolution using the Zarr format.

To mitigate the computational and communication overhead of in-situ aggregations, Hiopy uses YAC (Yet Another Coupler) to couple the model to the output component and configures the aggregates so that the model’s domain decomposition and the preferred Zarr chunking are aligned, distributing the workload evenly across the output processes. As a result, communication overhead is reduced and efficient parallel computation is possible without penalising the simulation throughput. Additional optimisations reduce communication buffers, eliminate redundant duplication in metadata handling, allow streaming the data directly to its final location, and ease configuration for varying requirements.

Hiopy supports native ICON model grids, regular latitude–longitude grids, and the HEALPix grid, and has been validated by producing publicly accessible datasets from km-scale ESM simulations across multiple projects. This work demonstrates a practical tool in the software stack of high-resolution climate modelling.
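The core idea of hierarchical temporal aggregation, each level averaging a fixed number of steps of the level below so analysts can start coarse and drill down, can be sketched independently of Hiopy (a conceptual illustration, not Hiopy's code; function and variable names are ours):

```python
import numpy as np

def temporal_pyramid(data, factor=2, levels=3):
    """Hierarchical temporal means: level 0 is native resolution, each
    further level averages `factor` consecutive steps of the previous one."""
    pyramid = [data]
    for _ in range(levels - 1):
        prev = pyramid[-1]
        n = (prev.shape[0] // factor) * factor   # drop incomplete trailing window
        coarse = prev[:n].reshape(-1, factor, *prev.shape[1:]).mean(axis=1)
        pyramid.append(coarse)
    return pyramid

hourly = np.arange(24.0).reshape(24, 1)    # 24 time steps, 1 grid cell
levels = temporal_pyramid(hourly, factor=2, levels=3)
print([lvl.shape[0] for lvl in levels])    # -> [24, 12, 6]
```

In the in-situ setting, the aggregations are computed while the model runs and written per Zarr chunk, rather than post hoc over a finished array as here.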

How to cite: Tibrewal, S. and Dreier, N.-A.: Coupling km-Scale Earth System Model to Hierarchical Output for Analysis-Ready Dataset, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-9822, https://doi.org/10.5194/egusphere-egu26-9822, 2026.

17:00–17:05
Part II: Data Compression
17:05–17:10
17:10–17:20
|
EGU26-10844
|
ECS
|
solicited
|
On-site presentation
Langwen Huang, Luigi Fusco, Jan Zibell, Florian Scheidl, Michael Armand Sprenger, Sebastian Schemm, and Torsten Hoefler

As the resolution of weather and climate simulations increases, the amount of data produced is growing rapidly from hundreds of terabytes to tens of petabytes. The huge size becomes a limiting factor for broader adoption, and its fast growth rate will soon exhaust all available storage devices. To address these issues, we present EBCC (Error Bounded Climate-data Compressor). It follows a two-layer compression approach: a base compression layer using JPEG2000 to capture the bulk of the data with a high compression ratio, and a residual compression layer using wavelet transform and SPIHT (Set Partitioning In Hierarchical Trees) encoding to efficiently eliminate long-tail extreme errors. EBCC outperforms other methods in the benchmarks at relative error targets ranging from 0.1% to 10%. In the energy budget closure and Lagrangian trajectory benchmarks, it can achieve more than 100× compression while keeping errors within the natural variability derived from ERA5 uncertainty members. We implement EBCC as a standalone C library which is seamlessly integrated with NetCDF and Zarr pipelines.
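The two-layer principle, a lossy base layer plus sparse corrections that enforce the error bound on the long tail, can be illustrated with a simplified stand-in. Note that EBCC's actual layers are JPEG2000 and SPIHT-coded wavelet residuals; this sketch substitutes coarse uniform quantization for the base layer and exact patches for the residual layer, purely to show how the bound is guaranteed:

```python
import numpy as np

def compress(data, rel_err=0.01):
    """Layer 1: coarse quantization. Layer 2: sparse corrections for points
    whose residual exceeds the error bound (the 'long tail')."""
    bound = rel_err * np.abs(data).max()
    step = 8 * bound                        # deliberately coarse base layer
    base = np.round(data / step).astype(np.int32)
    residual = data - base * step
    bad = np.abs(residual) > bound          # long-tail errors to patch
    patch_idx = np.flatnonzero(bad)
    patch_val = residual.flat[patch_idx]
    return base, step, patch_idx, patch_val

def decompress(base, step, patch_idx, patch_val):
    out = base.astype(np.float64) * step
    out.flat[patch_idx] += patch_val
    return out

rng = np.random.default_rng(0)
data = rng.normal(size=10_000)
rec = decompress(*compress(data, rel_err=0.01))
# The reconstruction error never exceeds the requested bound:
assert np.abs(rec - data).max() <= 0.01 * np.abs(data).max()
```

Real codecs additionally entropy-code both layers; here the point is only that the residual layer converts a high-ratio lossy base into an error-bounded scheme.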

How to cite: Huang, L., Fusco, L., Zibell, J., Scheidl, F., Sprenger, M. A., Schemm, S., and Hoefler, T.: EBCC: an Error Bounded Climate-data Compressor, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-10844, https://doi.org/10.5194/egusphere-egu26-10844, 2026.

17:20–17:30
|
EGU26-20630
|
ECS
|
On-site presentation
Clara Hartmann, Rafael Ballester-Ripoll, Julian A. Croci, Jorge Gacitua Gutierrez, Juan Jose Ruiz, Paola Salio, Alexandra Diehl, and Renato Pajarola

High-resolution numerical weather and climate simulations increasingly produce very large data with high dimensionality. Such datasets usually span three spatial dimensions, time, multiple physical variables, and ensemble members, leading to six-dimensional (6D) hypervolume datasets. Being grid-based, these datasets can be interpreted as 6D data tensors. The storage, processing, visualization, and analysis of such large data pose significant computational and memory storage challenges. Tensor decomposition and approximation methods have proven to be an efficient tool for the compression and reconstruction of such large, high-dimensional scientific datasets. Built on rigorous mathematical principles, tensor decompositions exploit the multi-linear structure and redundancy inherent in scientific data, leading to effective compression of the datasets while providing visually accurate results.

In this work, we investigate the applicability of tensor decompositions for the compression and efficient representation of 6D weather simulation data. We focus on two of the state-of-the-art low-rank tensor formats, tensor-train (TT) and Tucker decompositions. These methods generalize the singular value decomposition (SVD) to higher-order tensors, enabling compression of spatial, temporal, and physical modes through rank reduction. Therefore, the large high-dimensional tensor is factorized into multiple smaller, rank-reduced tensors with lower dimensionality, reducing the size of the original data significantly while preserving essential features. Such a reduced representation is also called a tensor approximation (TA).
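The rank-reduction idea is easiest to see in the matrix case, which TT and Tucker generalize to higher orders. A sketch with NumPy's SVD on a synthetic smooth field (not the COSMO-1E data), including the compressed-domain mean discussed below:

```python
import numpy as np

rng = np.random.default_rng(1)
# Smooth, nearly low-rank field: a sum of outer products plus small noise.
x = np.linspace(0, 1, 200)
field = (np.outer(np.sin(2 * np.pi * x), np.cos(2 * np.pi * x))
         + 0.5 * np.outer(x, x)
         + 1e-3 * rng.normal(size=(200, 200)))

U, s, Vt = np.linalg.svd(field, full_matrices=False)
r = 2                                   # keep only the leading ranks
approx = U[:, :r] * s[:r] @ Vt[:r]      # rank-r reconstruction
rel_err = np.linalg.norm(field - approx) / np.linalg.norm(field)
ratio = field.size / (U[:, :r].size + r + Vt[:r].size)
print(f"compression ratio {ratio:.0f}:1, relative error {rel_err:.4f}")

# Linear statistics can be evaluated in the compressed domain: the column
# mean of U*s*Vt equals mean(U, axis=0) * s @ Vt, with no reconstruction.
mean_compressed = (U[:, :r].mean(axis=0) * s[:r]) @ Vt[:r]
assert np.allclose(mean_compressed, approx.mean(axis=0))
```

TT and Tucker apply the same truncation idea mode by mode, which is why the 6D factorizations reach far higher ratios than a single matrix SVD.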

We apply the tensor decompositions to a real-world weather simulation dataset from the Alpine region of Switzerland (COSMO-1E), organized along longitude, latitude, vertical level, time, physical variables (such as temperature), and an ensemble dimension with 11 members. We evaluate the performance of the compression in terms of storage reduction, relative reconstruction error, peak signal-to-noise ratio (PSNR), structural similarity index measure (SSIM), computational cost, and visual comparison to the original data. Our results demonstrate significant compression ratios while preserving high visual accuracy. For example, a TT-based compression with a compression ratio of 1:900 yields a relative error of only 0.0005, reducing the 4 GB original dataset to 4.6 MB. Lower compression ratios lead to even higher accuracy.

Beyond efficient data compression, the linear structure of the tensor decompositions allows for efficient application of filters in the tensor domain. The computation of the mean, standard deviation or similar linear operations along user-defined dimensions can directly be performed on the decomposed tensors, without ever having to reconstruct the large 6D dataset. Furthermore, the structure of the tensors allows for efficient partial reconstruction and visualization of slices or subsets of the dataset without reconstructing the complete dataset.

Overall, this work highlights tensor decompositions as a powerful tool for managing the growing size and complexity of high-dimensional weather simulation data. Their linear structure, which allows for efficient filter application in the compressed domain, makes them especially suitable for scientific analysis of complex datasets. Their integration into geoscientific data pipelines offers a promising pathway towards scalable and accurate data compression and analysis in numerical weather prediction and climate science.

How to cite: Hartmann, C., Ballester-Ripoll, R., Croci, J. A., Gacitua Gutierrez, J., Ruiz, J. J., Salio, P., Diehl, A., and Pajarola, R.: Compression and Reconstruction of High-Dimensional Weather Simulation Data Using Tensor Decompositions, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-20630, https://doi.org/10.5194/egusphere-egu26-20630, 2026.

17:30–17:40
|
EGU26-20880
|
ECS
|
On-site presentation
Julian A. Croci, Marc Rautenhaus, Clara Hartmann, Jorge Gacitua Gutierrez, Juan Jose Ruiz, Paola Salio, Alexandra Diehl, and Renato Pajarola

Major challenges with modern weather and climate simulations are the resources required to store, analyze, and visualize the generated data. This storage problem forces scientists to compromise on data dimensionality, for example by discarding physical variables or by reducing the number of stored time steps.

Tensor decomposition and approximation (TA) methods have recently seen a revival in the context of neural networks, where they reduce the number of network parameters. However, TA methods also exhibit interesting properties favorable for the lossy compression of volumetric data. For example, for turbulence volumes created by simulations, compression ratios higher than 300 can be reached while preserving high precision. This allows for more efficient storage of large multi-dimensional data grids. Furthermore, tensor decompositions allow for partial reconstruction as well as the application of linear functions in the compressed domain, making these representations especially suitable for a variety of downstream analysis tasks such as statistical analysis. However, one open question, as for all lossy compression techniques, is how the loss influences the quality of these tasks.

For the operationalization of TA methods, another challenge is their parametrization. Various decomposition techniques exist and selecting the most appropriate one is non-trivial. Further, the data likely needs to be divided into smaller pieces, e.g. chunks, to achieve the best results, i.e. high compression ratios with as little error as possible. The division of the data in this context can mean both omitting dimensions (and hence reducing the dimensionality of the tensor) and splitting the data within dimensions. Finally, different tensor decomposition methods allow for different setups, further widening the compression parameter space to explore.

In this work we present an experimental setup that evaluates compression performance both in terms of error metrics computed directly on the data and in terms of the impact of compression losses on downstream visualization tasks. We use an offline TA-based compression scheme in which the data is reconstructed, i.e. decompressed, before being saved again in a standard format, so that it can easily be fed into downstream visualization applications such as Met.3D. Using this setup, we discuss how numerical error metrics, such as the relative error or the RMSE, are not always representative of errors in the visualization of the data in downstream tasks, especially for variables derived from the data. Further, we present different strategies for partitioning the data into chunks and motivate the effectiveness of tensor decomposition methods in the domain of numerical weather forecast data.

How to cite: Croci, J. A., Rautenhaus, M., Hartmann, C., Gacitua Gutierrez, J., Ruiz, J. J., Salio, P., Diehl, A., and Pajarola, R.: Evaluating Tensor Decomposition and Approximation as Lossy Compression for Weather Data Visualization Tasks, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-20880, https://doi.org/10.5194/egusphere-egu26-20880, 2026.

17:40–17:50
|
EGU26-15706
|
On-site presentation
Fenwick Cooper, Shruti Nath, Antje Weisheimer, and Tim Palmer

1000-member ensemble forecasts of rainfall are compressed from ~230 MB to ~400 KB using lossy histogram compression. This level of compression allows fast download, analysis, and responsive display on a website, even when using obsolete laptop computers or basic smartphones. The information lost to achieve this level of compression matters in all but the most specialist of applications only negligibly, and the algorithm scales to much higher ensemble sizes with negligible additional storage. The method is currently in operation every day with national meteorological centres in East Africa.

 

Physics-based weather models are routinely used to produce ensemble forecasts with up to 100 members. These ensembles are an advance on single deterministic forecasts in that they indicate uncertainty, with larger ensembles providing more accurate distributions of forecast variables. The downside of large ensembles is their storage, transmission, and processing cost. Furthermore, machine learning models are being used operationally to generate very large forecast ensembles. For example, rainfall forecasts by ICPAC and national meteorology centres in East Africa are now routinely produced with 1000 ensemble members. Analysis and transmission of these forecasts using traditional methods is completely impractical given currently available hardware. Compression is necessary and can be achieved by storing the ensemble as a series of histograms, sacrificing spatial correlation information.
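The histogram idea can be sketched in a few lines: per grid point, the ensemble axis is replaced by fixed-bin counts, so storage no longer scales with ensemble size, at the cost of discarding spatial correlation between members (synthetic data; the bin layout and sizes here are illustrative, not the operational configuration):

```python
import numpy as np

def to_histograms(ensemble, edges):
    """ensemble: (n_members, n_points) -> per-point bin counts (n_points, n_bins)."""
    n_points = ensemble.shape[1]
    counts = np.empty((n_points, len(edges) - 1), dtype=np.uint16)
    for p in range(n_points):
        counts[p], _ = np.histogram(ensemble[:, p], bins=edges)
    return counts

rng = np.random.default_rng(2)
rain = rng.gamma(shape=2.0, scale=3.0, size=(1000, 500))  # 1000 members, 500 cells
edges = np.linspace(0.0, 50.0, 33)                        # 32 fixed bins
hist = to_histograms(rain, edges)

raw_bytes = rain.astype(np.float32).nbytes     # 1000 * 500 * 4 bytes
hist_bytes = hist.nbytes                       # 500 * 32 * 2 bytes
print(f"{raw_bytes} B -> {hist_bytes} B")
```

Doubling the ensemble size leaves `hist_bytes` unchanged, which is the scaling property the abstract highlights.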

How to cite: Cooper, F., Nath, S., Weisheimer, A., and Palmer, T.: Histogram compression of large ensemble forecasts, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-15706, https://doi.org/10.5194/egusphere-egu26-15706, 2026.

17:50–18:00

Posters on site: Mon, 4 May, 14:00–15:45 | Hall X4

The posters scheduled for on-site presentation are only visible in the poster hall in Vienna. If authors uploaded their presentation files, these files are linked from the abstracts below.
Display time: Mon, 4 May, 14:00–18:00
I/O
X4.93
|
EGU26-5035
|
ECS
Junxian Chew and Kor de Jong

Forward simulation of geographical systems typically involves time-step iterations of reading, computing, and writing temporal states until the target end time. As the spatial fidelity of geographical data continues to be refined to achieve simulations with higher accuracy, so does the number of read, compute, and write operations within each time step. Simulations at continental or global scale can only be completed within a reasonable time if the data can be distributed over multiple supercomputer nodes, in conjunction with parallel execution of the operations within each time step.

The LUE framework is designed as a general software platform that enables scientists to define custom computational models and achieve scalable performance on large-scale computing environments. Parallel implementations of its compute operations have demonstrated good scaling behaviour [1,2]. This is achieved in LUE by distributing small subsets of the global geographical dataset to available CPU threads across multiple supercomputer nodes in an asynchronous manner, with each subset having its own set of compute operations to be executed. The asynchronicity of the workload queueing allows a large number of subsets to be processed in parallel and ensures full occupancy of all available compute resources.
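The partition-and-queue pattern described above can be caricatured in plain Python (LUE/HPX operate in C++ with far finer-grained, distributed tasks; this sketch only shows the idea of splitting a raster into subsets and queueing one asynchronous task per subset):

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def local_op(tile):
    return np.sqrt(tile) + 1.0           # stand-in for a spatial operation

def partitioned_apply(raster, tile_rows=64):
    """Split into row tiles, queue one task per tile, reassemble the result."""
    tiles = [raster[i:i + tile_rows] for i in range(0, raster.shape[0], tile_rows)]
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(local_op, t) for t in tiles]   # async task queue
        return np.vstack([f.result() for f in futures])

raster = np.arange(256.0 * 4).reshape(256, 4)
assert np.allclose(partitioned_apply(raster), np.sqrt(raster) + 1.0)
```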

This advancement, however, inadvertently highlighted the inefficiency of serial handling of read/write operations. File access operations such as read and write are known as input/output (I/O) operations. Just as scalable computation requires parallel algorithms, scalable I/O requires parallel I/O libraries to distribute the I/O workload over multiple I/O-specific compute nodes. However, combining parallel I/O with asynchronously spawned computations, while ensuring that the resulting file output is correct, is challenging.

The challenge originates from the complexity of ensuring that data in memory is synced to the file storage system while the storage system is being acted on by all participating CPU threads. Careless management of I/O often results in unintended overwriting of file content due to concurrent accesses. This highlights the added difficulty of parallelizing file access compared to in-memory operations such as computations. As such, much care is needed in the design and planning of file access and synchronisation patterns to achieve meaningful gains in parallel I/O performance within an asynchronous many-task execution.

In this work, we attempt to implement a parallel read/write access pattern that works well with the asynchronous parallel compute paradigm deployed within the LUE modelling framework. Integrating parallel I/O into an asynchronous execution brings the additional benefit of interleaved compute and I/O tasks: part of the I/O latency can be hidden by concurrent compute workloads, which is harder to realize in a synchronous parallel execution. Success of this work will enable scalable computation and parallel file access for geoscience simulation workloads carried out via the LUE framework, reducing the overall computational resource consumption of large-scale simulations.
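One standard way to make concurrent writers safe is to precompute disjoint byte ranges and use positioned writes, so no shared file cursor exists to be overwritten. A sketch of that pattern (our illustration, not LUE's implementation; `os.pwrite` is POSIX-only):

```python
import os
import tempfile
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def write_partition(fd, offset, array):
    os.pwrite(fd, array.tobytes(), offset)   # positioned write: no shared cursor

# Eight partitions, each destined for its own precomputed byte range.
parts = [np.full(100, i, dtype=np.float64) for i in range(8)]
with tempfile.NamedTemporaryFile(delete=False) as f:
    path = f.name
fd = os.open(path, os.O_WRONLY | os.O_CREAT)
with ThreadPoolExecutor() as pool:          # exiting the block waits for all writes
    for i, part in enumerate(parts):
        pool.submit(write_partition, fd, i * part.nbytes, part)
os.close(fd)

out = np.fromfile(path, dtype=np.float64)
assert np.array_equal(out, np.concatenate(parts))
os.remove(path)
```

Real parallel I/O libraries (e.g. MPI-IO underneath HDF5) generalize this to multiple nodes and collective buffering, but the disjoint-range invariant is the same.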

References:
1. https://doi.org/10.1016/j.cageo.2022.105083
2. https://doi.org/10.1016/j.envsoft.2021.104998

How to cite: Chew, J. and de Jong, K.: Parallel file access: the missing piece in efficient large scale geosimulation, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-5035, https://doi.org/10.5194/egusphere-egu26-5035, 2026.

X4.94
|
EGU26-6903
|
ECS
Carlos Villalta López, Leonardo Mingari, Alexandros-Panagiotis Poulidis, and Arnau Folch

High-resolution meteorological data is essential for accurate volcanic ash dispersion modelling, particularly in regions with complex topography. However, performing fully dynamical atmospheric simulations at very fine spatial resolution is computationally expensive and may limit their applicability in contexts where urgent computing is required, such as operational forecasting. Diagnostic downscaling methods offer a potential alternative by enhancing coarse-resolution meteorological fields at a lower computational cost, but their added value relative to full dynamical nesting remains to be further explored. In this work, we assess the effectiveness of diagnostic meteorological downscaling using an integrated simulation workflow based on the MetPrep tool coupled with the FALL3D ash dispersion model. This approach is applied to the case study of the 2021 Tajogaite eruption (La Palma), comparing meteorological data from three WRF-ARW dynamically nested domains with increasing spatial and temporal resolution (domains d01, d02 and d03) against diagnostic downscaling applied to the coarser WRF domains (d01+MetPrep and d02+MetPrep). All dispersion simulations are run using identical eruptive parameters in order to isolate the impact of the meteorological downscaling method. The simulated ash deposits are compared against field observations using point-to-point validation metrics and spatial characterisation based on isopach area fits. In addition, physically motivated wind metrics, including vertical wind shear and wind-topography coherence, are analysed to interpret the effects introduced by diagnostic downscaling on the flow. Preliminary results show that diagnostic downscaling can partially bridge the gap between coarse and high-resolution dynamical simulations, improving the representation of near-surface flow and ash deposition patterns at a fraction of the computational cost.
The study highlights both the potential and the limitations of diagnostic downscaling as an alternative to full dynamical nesting for volcanic ash dispersion applications.

Funded by the European Union. This work has received funding from the European High Performance Computing Joint Undertaking (JU) and Spain, Italy, Iceland, Germany, Norway, France, Finland and Croatia under grant agreement No 101093038, ChEESE-2P, project PCI2022-134973-2 funded by MCIN/AEI/10.13039/501100011033 and by the European Union NextGenerationEU/PRTR.

How to cite: Villalta López, C., Mingari, L., Poulidis, A.-P., and Folch, A.: Evaluating a meteorological downscaling method for volcanic ash dispersion and deposition modelling, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-6903, https://doi.org/10.5194/egusphere-egu26-6903, 2026.

X4.95
|
EGU26-15196
Max Jones, Joe Hamman, Davis Bennett, Kyle Barron, and Justus Magin

As geoscientific datasets continue to grow in size and complexity, the Zarr community has developed a modern, open-source solution for storage and I/O of multi-dimensional arrays and metadata. Zarr offers a high-performance, highly scalable, cloud-native container for scientific data, which allows scientists to transcend the constraints of individual files and think in terms of coherent datasets. Zarr’s potential has led to widespread adoption across government, industry, and academia. In this presentation, we offer practical guidance for how to leverage the latest and greatest features in the Zarr ecosystem, including:

  • Sharding to reduce the number of files, benefiting HPC users in particular
  • Virtualization via VirtualiZarr and Icechunk to enable high-performance access to data spread across NetCDF4/HDF5, GRIB, or GeoTIFF files
  • Custom data types, compression schemes, and variable chunk grids
  • Client-side (i.e., in-browser) rendering of large multidimensional geospatial datasets

Through concrete examples and best practices, we demonstrate how the Zarr ecosystem enables researchers to work with multi-terabyte datasets as seamlessly as small files.
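The sharding feature listed above can be understood with a little index arithmetic. The sketch below is illustrative only (it is not the zarr-python API): it shows how a shard groups many small chunks into one storage object, which is what reduces the file count for HPC users.

```python
# Illustrative sketch (not the zarr API): with chunk shape (100, 100) and
# shard shape (1000, 1000), each shard holds 10 x 10 = 100 chunks, so a
# (10000, 10000) array needs 100 shard objects instead of 10000 chunk files.

def shard_key(chunk_index, chunks_per_shard):
    """Map an n-D chunk index to the shard that stores it, plus the
    chunk's local position inside that shard."""
    shard = tuple(c // s for c, s in zip(chunk_index, chunks_per_shard))
    within = tuple(c % s for c, s in zip(chunk_index, chunks_per_shard))
    return shard, within

# Chunk (23, 7) with 10 x 10 chunks per shard lives in shard (2, 0),
# at local position (3, 7) inside that shard:
print(shard_key((23, 7), (10, 10)))  # -> ((2, 0), (3, 7))
```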

How to cite: Jones, M., Hamman, J., Bennett, D., Barron, K., and Magin, J.: Zarr at scale: virtualization, sharding, and performance optimizations for Earth science data, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-15196, https://doi.org/10.5194/egusphere-egu26-15196, 2026.

Algorithms
X4.96
|
EGU26-9317
Daniel Caviedes-Voullième, Pablo Vallés, José Segovia-Burillo, Mario Morales-Hernández, Sergio Iserte, and Antonio Peña

Flood dynamics are transitions between low-flow stages, which result in small wet areas, and high-flow stages, which naturally result in large flooded areas. The response of the dynamics of a flood to the time-varying forcing (be it a hydrograph or precipitation) is precisely what flood models attempt to simulate. Therefore, it is a priori unknown.

The computational load of 2D shallow water simulators is strongly dependent on the number of flooded cells, and thus the flooded area. Consequently, the dynamics of the flooded area translates into time-varying computational demands: low-flow stages can be simulated with fewer resources, whereas peak-flow stages demand significantly higher computational capacity. Typically, modellers will choose a set of computational resources which suits the problem size and demands based on experience and preliminary tests. However, these static resource sets (used throughout the simulation) either slow down computations when they are too small for the high-flow stages, or make inefficient use of resources when they are too large for the low-flow stages. It follows that dynamic resource allocation, based on the computational demands, would be optimal.

In this contribution we present the integration of the SERGHEI-SWE hydrodynamic model with the Dynamic Management of Resources library (DMRlib) to enable malleability (i.e., the runtime adjustment of MPI process counts and computational resources) to improve computational efficiency in shallow-flow simulations. By coupling SERGHEI-SWE with DMRlib, we enable the solver to dynamically expand or shrink its resource set during execution, adapting to these changing computational needs based on minimal heuristics.

SERGHEI-SWE is a high-performance, exascale-ready, scalable shallow water solver supporting CPUs and GPUs. DMRlib extends it with lightweight runtime support for process-level malleability, coordinating with the MPI runtime and job scheduler to manage resource adaptations. Within SERGHEI-SWE, resource reconfiguration is fundamentally a generalization of dynamic domain decomposition, to allow both the size and number of subdomains to change during execution. As a proof-of-concept, we implement minimal heuristics to trigger malleability based on wet-cell fractions: as flooded areas increase, additional resources are requested; when they decrease, resources are released.
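The wet-cell-fraction trigger described above can be sketched in a few lines. This is a toy illustration with hypothetical thresholds and a hypothetical doubling policy, not the SERGHEI-SWE/DMRlib implementation:

```python
def target_ranks(wet_fraction, current_ranks, min_ranks=1, max_ranks=64,
                 grow_at=0.6, shrink_at=0.2):
    """Toy malleability trigger (thresholds are placeholders): request
    more MPI ranks when the flooded fraction of the domain is high,
    release ranks when it is low, otherwise keep the allocation."""
    if wet_fraction > grow_at and current_ranks < max_ranks:
        return current_ranks * 2                    # expand the resource set
    if wet_fraction < shrink_at and current_ranks > min_ranks:
        return max(min_ranks, current_ranks // 2)   # shrink it
    return current_ranks

print(target_ranks(0.75, 8))  # peak flow -> 16
print(target_ranks(0.05, 8))  # recession -> 4
```

In the real system the new rank count would be negotiated with the MPI runtime and job scheduler, and the domain re-decomposed accordingly.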

The malleable SERGHEI-SWE was evaluated using dam-break, river flood, and catchment runoff tests. Numerical accuracy was preserved, with negligible differences relative to static (non-malleable) runs. Dynamic resource management improved computational efficiency relative to minimal fixed-resource configurations. However, performance remained below the best-case static maximum-resource setup, and communication overheads limited gains in low-demand phases. Nonetheless, the proof-of-concept demonstrates both feasibility and potential at larger scales.

The approach is accurate, robust, and promising for improving resource utilization in large-scale hydrodynamic modeling. Future work will focus on refining reconfiguration heuristics, improving understanding of overheads, and combining malleability with dynamic load balancing to better exploit scalable HPC environments.

How to cite: Caviedes-Voullième, D., Vallés, P., Segovia-Burillo, J., Morales-Hernández, M., Iserte, S., and Peña, A.: Simulating flood dynamics on dynamic HPC resource sets, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-9317, https://doi.org/10.5194/egusphere-egu26-9317, 2026.

X4.97
|
EGU26-10512
Max Dormann, Mudassar Razzaq, Claudia Finger, and Erik H. Saenger

Numerical simulations of elastic or acoustic wave propagation usually assume a stationary background medium. In many practical situations, however, such as marine exploration or the inspection of engineered structures such as pipelines, elastic waves propagate in bodies of moving fluid as well. Ambient flow fields introduce changes to the wave field such as a direction-dependent wave propagation velocity or phase shifts that can be observed in real-world measurements. To obtain simulations that more faithfully represent elastic wave propagation in coupled systems of stationary solids and moving fluids, and that are better suited for comparison with experimental, laboratory, and field data in the future, a formulation is introduced in which the elastic wave equation is extended with a material derivative. The resulting partial differential equation is solved using an augmented rotated-staggered finite-difference scheme that combines the spatial operators of the rotated-staggered grid with a conventional central-difference approximation. The performance of this new formulation is examined on the propagation of elastic wave fields in ambient steady uniform and steady laminar flow fields in combined fluid-solid models, and compared to reference simulations with no moving background medium. The analysis focuses on travel-time variations and phase shifts, demonstrating that the numerical results are consistent with analytical expectations for wave propagation in moving media.

How to cite: Dormann, M., Razzaq, M., Finger, C., and Saenger, E. H.: Finite-Difference modeling of elastic wave propagation in solid-moving fluid systems, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-10512, https://doi.org/10.5194/egusphere-egu26-10512, 2026.

X4.98
|
EGU26-10878
|
ECS
Jan Clemens, Lars Hoffmann, Rolf Müller, Felix Plöger, Marvin Henke, Nicole Thomas, Sabine Grießbach, and Catrin Meyer

Models for the calculation of Lagrangian particle dispersion in the atmosphere or the ocean are indispensable tools for understanding natural and anthropogenic processes. These processes range from volcanic ash clouds through cloud microphysics to the study of the ozone layer on climate scales. With exascale machines at our disposal, such calculations can now be performed at significantly higher resolutions, both in terms of the driving wind field and particle number density.

Massive-Parallel Trajectory Calculations (MPTRAC) is a library designed to enable Lagrangian particle dispersion analysis for atmospheric transport processes in the free troposphere and stratosphere. It was developed with contemporary high-performance computing (HPC) systems in mind, ensuring high scalability across GPU and CPU clusters through an MPI-OpenMP/ACC hybrid parallelization approach. Its data structures are tailored to the multi-layered cache systems of modern compute nodes. MPTRAC is routinely executed on the JUWELS-Booster supercomputer and is planned for deployment on the JUPITER exascale machine.

This contribution outlines ongoing developments in MPTRAC. A central aspect of the presented work is the implementation of domain decomposition, which partitions wind field data and associated tracer particles across distributed subdomains. This methodology promises to enhance computational efficiency and scalability, particularly in the context of large-scale atmospheric transport simulations. Furthermore, we detail the integration of MPTRAC with the ICON modeling framework through its community interface. This extension enables the direct application of particle-based transport methods within ICON, supporting high-resolution climate and weather simulations.
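The partitioning idea behind such a domain decomposition can be illustrated with a toy example. The actual MPTRAC scheme partitions wind-field data and particles jointly across subdomains; the equal-longitude-band geometry and the function below are hypothetical:

```python
def assign_subdomain(lon, n_subdomains):
    """Toy 1-D domain decomposition: assign a particle to one of
    n_subdomains equal longitude bands spanning [-180, 180)."""
    band_width = 360.0 / n_subdomains
    return int((lon + 180.0) // band_width) % n_subdomains

# Four bands of 90 degrees each:
print(assign_subdomain(-170.0, 4))  # -> 0
print(assign_subdomain(10.0, 4))    # -> 2
```

Each MPI rank would then advect only the particles in its band, exchanging particles with neighbours when they cross a band boundary.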

The described developments are conducted within the scope of the WarmWorld Project, which aims to enable high-resolution calculations using ICON.

MPTRAC is available under an open-source licence: https://github.com/slcs-jsc/mptrac

How to cite: Clemens, J., Hoffmann, L., Müller, R., Plöger, F., Henke, M., Thomas, N., Grießbach, S., and Meyer, C.: MPTRAC: Domain-decomposed Massively-Parallel Trajectory Calculations, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-10878, https://doi.org/10.5194/egusphere-egu26-10878, 2026.

X4.99
|
EGU26-16260
|
ECS
Siddik Barbhuiya and Vivek Gupta

The development of hyper-resolution land surface modelling poses significant computational challenges. Detailed water balance assessments, ensemble-based uncertainty quantification, and climate scenario exploration all require running physics-based models like VIC, Noah-MP, and CLM at continental scales with high spatial resolution, long temporal spans, and multiple parameter configurations. The computational cost becomes prohibitive. Machine learning surrogates have recently emerged as potential solutions; however, existing LSTM and CNN approaches have fundamental architectural problems. Sequential processing prevents parallel computation, limited receptive fields miss long-range dependencies, and most approaches only predict single variables, which restricts comprehensive hydrological analysis.

We present a shifted-window transformer framework that simultaneously predicts multiple land surface fluxes (runoff, evapotranspiration, and soil moisture) while maintaining computational efficiency at continental scales. The hierarchical attention mechanism captures both local temporal patterns through windowed self-attention and global temporal context through shifted-window operations. This eliminates recurrent bottlenecks. We adapt vision transformers for hydrological regression by tokenizing meteorological sequences temporally, using relative position biases to encode lag-dependent hydrological relationships, and designing multi-task regression heads that preserve both nonlinear interactions and direct physical drivers.
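The windowed-and-shifted partitioning described above can be sketched for a 1-D temporal sequence. This is an index-level illustration only (the authors' model operates on learned token embeddings, not raw indices):

```python
def window_partition(seq_len, window, shift=0):
    """Toy shifted-window partition of a temporal sequence: roll the
    time indices by `shift`, then split into non-overlapping windows.
    Alternating shift=0 and shift=window//2 across layers lets windowed
    self-attention connect neighbouring windows."""
    idx = [(i + shift) % seq_len for i in range(seq_len)]
    return [idx[i:i + window] for i in range(0, seq_len, window)]

print(window_partition(8, 4))           # -> [[0, 1, 2, 3], [4, 5, 6, 7]]
print(window_partition(8, 4, shift=2))  # -> [[2, 3, 4, 5], [6, 7, 0, 1]]
```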

We demonstrate the approach by emulating the VIC model across India's 76,390 land grid cells at 6 km resolution, spanning diverse climate regimes. Training uses sparse spatial sampling with only a small fraction of available locations. This allows us to evaluate how well the surrogate generalizes VIC's process behaviors to unseen regions and parameter configurations. We test multiple variants, including autoregressive formulations that incorporate previous timestep outputs, and benchmark everything against LSTM baselines to isolate the contributions of the architecture.

How to cite: Barbhuiya, S. and Gupta, V.: Breaking Computational Bottlenecks in Land Surface Modelling with Shifted-Window Transformers, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-16260, https://doi.org/10.5194/egusphere-egu26-16260, 2026.

X4.100
|
EGU26-17230
|
ECS
Bomi Kim, Hyungon Ryu, Seungsoo Lee, Jun-Hak Lee, and Seong Jin Noh

High-resolution urban flood modelling is increasingly critical for disaster mitigation, but simulations remain computationally expensive, particularly when applying meter-scale grids over large spatial domains. Such computational constraints often restrict the practical use of high-resolution simulations in operational forecasting and scenario-based analyses. To address this challenge, this study investigates the use of multi-GPU acceleration to improve computational efficiency in large-scale urban flood simulations. We present a multi-GPU implementation of the H12 2D urban flood model based on an MPI–OpenACC framework. The H12 2D model is a physics-based two-dimensional urban flood model that supports CPU-based parallel execution and is extended here to GPU architectures. The proposed approach employs directive-based parallelization, allowing a single code base to be executed on both CPU and GPU systems without extensive code modification. Domain decomposition is managed using MPI, while computationally intensive kernels are offloaded to GPUs through OpenACC directives. This hybrid design ensures portability across heterogeneous high-performance computing environments and enables efficient use of multiple GPUs. We evaluate performance using spatial resolutions ranging from 1 to 20 m over two contrasting domains: an urban catchment in downtown Portland, Oregon (USA), and a downstream reach of the Han River basin (Republic of Korea). We discuss how computational performance varies with model resolution, domain size, and the distribution of computational workload across multiple GPUs, with a focus on scalability and parallel efficiency. The improved computational efficiency achieved in this study can support pseudo real-time urban flood prediction for early warning applications.
In addition, the proposed framework facilitates large-scale, high-resolution simulations that can be used to generate ground-truth datasets for the development and validation of physics-informed or data-driven flood prediction models.

How to cite: Kim, B., Ryu, H., Lee, S., Lee, J.-H., and Noh, S. J.: Multi-GPU acceleration of high-resolution and large-scale urban flood modelling using MPI–OpenACC, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-17230, https://doi.org/10.5194/egusphere-egu26-17230, 2026.

Datastructures
X4.101
|
EGU26-10395
|
ECS
Leo Kotipalo, Urs Ganse, Yann Pfau-Kempf, Jonas Suni, and Minna Palmroth

Vlasiator is a global hybrid-Vlasov space plasma simulation, modeling the velocity distribution of ions in a large region of near-Earth space. Due to the high memory and computation demands of the kinetic method as well as the large physical scale, optimisations are required to make simulation feasible. This presentation explores optimisations used in the spatial and velocity grids.

We first consider the spatial dimension. For this, Vlasiator utilises cell-based octree adaptive mesh refinement (AMR). Essentially, each spatial cell may be split in all three spatial dimensions to create eight smaller children in order to improve simulation accuracy in relevant regions. This can be repeated if necessary, with runs typically using four levels of refinement. Refinement may be done statically at the start of the simulation, or dynamically based on the plasma parameters.

Vlasiator uses a combination of several parameters for dynamic runtime refinement. These include scaled gradients of macroscopic variables to detect steep changes, the ratio of the current density to perpendicular magnetic field for current sheets and reconnection, as well as pressure anisotropy and vorticity for foreshock refinement.

For the velocity grid we use a somewhat similar method of stretching. In order to simplify translation, the velocity grid is static and identical in each spatial cell. To eliminate splitting of acceleration pencils, the size of cells in each coordinate direction is a function of that coordinate. Thus if we consider a grid with higher resolution around some point, the grid appears stretched along the coordinate axes when moving away from that point. The main purpose of the stretched grid is to enable modeling of colder distributions requiring a higher resolution without increasing resolution for the entire velocity grid.
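The stretched velocity grid described above amounts to making cell size a function of the coordinate itself. The sketch below uses a hypothetical linear stretching law (Vlasiator's actual stretching function differs); it only illustrates the idea of finest resolution near one point, coarsening away from it:

```python
def cell_size(v, dv_min=0.5, growth=0.02):
    """Toy stretched-grid spacing: finest resolution dv_min at v = 0,
    growing linearly with distance from the high-resolution point.
    Parameters are placeholders, not Vlasiator's."""
    return dv_min * (1.0 + growth * abs(v))

# Resolution is finest near the cold core of the distribution:
print(cell_size(0.0))    # -> 0.5
print(cell_size(100.0))  # -> 1.5
```

Because the spacing in each coordinate direction depends only on that coordinate, the grid remains a tensor product and acceleration pencils are never split.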

Combining these optimisations enables simulation on modern supercomputers with scale and resolution which would be unfeasible without them. This is achieved by limiting resources expended on regions where they are less critical for simulation accuracy and the scientific focus of a given run, while allowing higher fidelity in more important regions. These methods are applicable to other kinetic simulations, as well as grid-based simulations in general.

How to cite: Kotipalo, L., Ganse, U., Pfau-Kempf, Y., Suni, J., and Palmroth, M.: Improvements to 6D Grid Optimisation in Vlasiator, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-10395, https://doi.org/10.5194/egusphere-egu26-10395, 2026.

Modelling
X4.102
|
EGU26-7902
Mario Acosta, Sergi Palomas, Sophie Valcke, Pierre-Antoine Bretonnière, and Paul Smith

Global climate models are among the most computationally demanding scientific applications, with rapidly increasing resolution and complexity driving unprecedented requirements in high-performance computing. While model intercomparison efforts have traditionally focused on scientific output and physical fidelity, the computational performance, energy consumption and carbon footprint of climate simulations are becoming critical factors for the sustainability of next-generation modelling activities.

Building on previous coordinated work done for CMIP6, this work extends the scope towards a global assessment framework applicable to all major climate models. We present a list of metrics applicable to climate simulations to systematically quantify model performance, energy cost and associated carbon footprint using standardised and reproducible metrics across supercomputing platforms. The proposed framework combines workload analysis, runtime monitoring and workflow-level instrumentation to enable consistent comparisons between modelling systems.
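As a concrete illustration, cost metrics in the spirit of the established CPMIP set, such as simulated years per wall-clock day (SYPD) and core-hours per simulated year (CHSY), can be derived from basic run statistics. The function and all numerical figures below (power per core, grid carbon intensity) are placeholders, not values from the framework itself:

```python
def cost_metrics(sim_years, wallclock_hours, cores, watts_per_core, gco2_per_kwh):
    """CPMIP-style cost metrics from run statistics: simulated years per
    wall-clock day (SYPD), core-hours per simulated year (CHSY), and an
    estimated carbon footprint in kg CO2 (inputs are illustrative)."""
    sypd = sim_years / (wallclock_hours / 24.0)
    chsy = cores * wallclock_hours / sim_years
    kwh = cores * watts_per_core * wallclock_hours / 1000.0
    kgco2 = kwh * gco2_per_kwh / 1000.0
    return sypd, chsy, kgco2

# A hypothetical 10-year simulation on 2048 cores over 48 hours:
sypd, chsy, kgco2 = cost_metrics(10.0, 48.0, 2048, 5.0, 300.0)
print(f"SYPD={sypd}  CHSY={chsy}  kgCO2={kgco2:.0f}")
```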

This effort is conducted in the context of the World Climate Research Programme ESMO Infrastructure Panel (WIP), where a dedicated task team is coordinating the systematic collection of performance, energy and carbon footprint metrics from modelling centres participating in CMIP7, in collaboration with initiatives such as ESiWACE, ENES-RISe, Destination Earth and FUTURA. The objective is to establish community-endorsed metrics and monitoring practices that can be integrated into operational model development and production workflows, for CMIP7 and beyond.

By treating computational efficiency and carbon footprint as first-class metrics in climate model evaluation, this work aims to support informed decisions on model design, resource allocation and optimisation strategies, contributing to a more efficient and sustainable future for global climate modelling.

How to cite: Acosta, M., Palomas, S., Valcke, S., Bretonnière, P.-A., and Smith, P.: Towards standardised metrics of performance, energy and carbon footprint for CMIP experiments, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-7902, https://doi.org/10.5194/egusphere-egu26-7902, 2026.

X4.103
|
EGU26-17426
Stefan Verhoeven, Bart Schilperoort, Peter Kalverla, and Rolf Hut

Running numerical models you are unfamiliar with is not always straightforward. The models have different kinds of interfaces, different programming languages, and different names for the same concepts. To standardize this, the Basic Model Interface (Hutton, 2020) was developed by the Community Surface Dynamics Modeling System (CSDMS). With the Basic Model Interface (BMI), users are presented with a standard set of functions to query and control numerical models. This standard interface also allows users to couple models together, allowing for the creation of standard components that can be coupled to create a full model (Peckham, 2013).
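The standard call pattern that BMI prescribes can be illustrated with a toy model. This sketch implements only a minimal subset of the functions (the bucket model and its parameters are invented for illustration; real implementations follow the full CSDMS BMI specification, where getters operate on typed arrays):

```python
class LeakyBucket:
    """Toy model exposing a subset of BMI-style control and getter
    functions: initialize / update / finalize, get_current_time,
    get_value. Illustrative only, not a conformant BMI implementation."""

    def initialize(self, config=None):
        self.storage = 10.0   # mm of water in the bucket
        self.k = 0.1          # recession coefficient per time step
        self.time = 0.0

    def update(self):
        self.storage -= self.k * self.storage
        self.time += 1.0

    def get_current_time(self):
        return self.time

    def get_value(self, name):
        if name == "storage":
            return self.storage
        raise KeyError(name)

    def finalize(self):
        pass

m = LeakyBucket()
m.initialize()
m.update()
print(m.get_current_time(), m.get_value("storage"))  # -> 1.0 9.0
```

Because every model exposes this same set of functions, a caller (or a coupler) never needs to know what is inside `update()`.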

However, coupling these models or components, whether they are written in C, C++, Fortran or Python, requires them to all share the same interpreter or (Python) environment. This is not always possible or viable and can require compilation on the end-user's side. This also prevents containerization of models. 

For cross-language and cross-container communication we developed grpc4bmi in 2018, making it possible to use the BMI over an HTTP connection. However, while highly performant, gRPC is not supported in many languages. To this end, we developed the new RemoteBMI protocol. RemoteBMI communicates with models through the Basic Model Interface using a RESTful API, making it easier to support any language; only an HTTP server and JSON parser implementation are required.

With grpc4bmi and RemoteBMI it is possible to package a model or model component inside a software container (e.g., Docker) and communicate with these models over an HTTP connection. This makes models more interoperable and reproducible, as container images can easily be archived and used by other people. It also enables running models on different machines than your own, and then directly communicating with them or coupling them to other models. 

With these technologies, you can now, for example, host models that require specific and difficult-to-share input data and provide them to anyone interested as a web-based service. This model-as-a-service (MaaS) architecture could also make it easier for end-users to try out your model in the browser before committing to installing it locally if they are interested. 

Currently, the grpc4bmi and RemoteBMI protocols are used by the eWaterCycle platform (Hut, 2022), allowing hydrologists and students easy access to containerized hydrological models through a common interface, accelerating both research and teaching. 

 --- 

Hutton, E.W.H., Piper, M.D., and Tucker, G.E., 2020. The Basic Model Interface 2.0: A standard interface for coupling numerical models in the geosciences. Journal of Open Source Software, 5(51), 2317, https://doi.org/10.21105/joss.02317. 

Peckham, S.D., Hutton, E.W., and Norris, B., 2013. A component-based approach to integrated modeling in the geosciences: The design of CSDMS. Computers & Geosciences, 53, pp.3-12, http://dx.doi.org/10.1016/j.cageo.2012.04.002. 

Hut, R., et al. (2022). The eWaterCycle platform for open and FAIR hydrological collaboration. Geoscientific Model Development, 15(13), 5371–5390. https://doi.org/10.5194/gmd-15-5371-2022  

How to cite: Verhoeven, S., Schilperoort, B., Kalverla, P., and Hut, R.: Enabling numerical Models as a Service (MaaS), EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-17426, https://doi.org/10.5194/egusphere-egu26-17426, 2026.

X4.104
|
EGU26-12783
Carles Tena, Marc Guevara Vilardell, Johanna Gehlen, Paula Camps Pla, Oscar Collado, Luca Rizza, and Laura Herrero

Air pollution is one of the most critical environmental threats, contributing to respiratory and cardiovascular diseases and millions of premature deaths worldwide. To support air quality assessment, forecasting and planning efforts, chemical transport models (CTMs) need to be fed with robust, temporally and spatially resolved emission input data. 

Official annual national emission inventories prepared by countries to fulfill mandatory reporting obligations provide robust and consistent data. However, for their use in CTMs, emission data needs to be spatially distributed over a grid, temporally broken down into hourly resolution and chemically mapped to the species defined in the CTM's chemical mechanism. Bridging the gap between official inventory data and model-ready CTM emission inputs requires a scalable, transparent, and reproducible system that can process raw inventories into gridded, hourly and chemically speciated CTM-compatible datasets.
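The temporal disaggregation step mentioned above typically applies normalised weight profiles to an annual total. The sketch below is a generic illustration with flat placeholder profiles, not the HERMES_Δ implementation (which uses activity- and region-specific profiles):

```python
def hourly_emissions(annual_total, monthly_w, hourly_w, month, days_in_month):
    """Toy temporal disaggregation: break an annual emission total into
    hourly values for one day, using normalised monthly and hourly
    weight profiles (profiles and month lengths are illustrative)."""
    month_total = annual_total * monthly_w[month] / sum(monthly_w)
    day_total = month_total / days_in_month
    hour_sum = sum(hourly_w)
    return [day_total * w / hour_sum for w in hourly_w]

# Flat profiles reproduce a uniform split of the annual total:
flat = hourly_emissions(8760.0, [1.0] * 12, [1.0] * 24, month=0, days_in_month=30)
print(len(flat), round(sum(flat), 2))  # 24 hourly values summing to the daily total
```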

HERMES_Δ is an open-source emission model developed at the Barcelona Supercomputing Center (BSC) to address this challenge. Implemented in object-oriented Python and designed to run on High Performance Computing (HPC) infrastructures, it integrates temporal, spatial, vertical, and chemical disaggregation within a modular architecture. Configuration relies entirely on YAML or CSV files, allowing activity- and region-specific settings while maintaining traceability by preserving the connection between modeled emissions and their original reporting sources. Spatial disaggregation, which is the most computationally demanding step, is parallelized using MPI and optimized through domain decomposition. The produced output files are fully compatible with multiple state-of-the-art CTMs, including CMAQ, CHIMERE, MOCAGE, WRF-Chem and MONARCH.

To assess the performance of HERMES_Δ, multiple benchmark experiments were performed on the MareNostrum 5 and CIRRUS Spanish HPC facilities. All tests were performed considering a destination grid of 0.005° (~500 m) resolution covering Spain (peninsular Spain and the Balearic Islands), estimating hourly and speciated emissions for 24 time steps. Performance benchmarking, including time-to-solution and memory profiling, indicates good parallel scalability and resource efficiency. This enables the production of hourly gridded emissions for over 10 000 activity–region combinations, while maintaining reproducibility and strict Coordinated Universal Time (UTC) alignment.

In conclusion, HERMES_Δ provides a robust framework for processing official emission inventories to high spatial and temporal resolutions using geolocated activity proxies. By combining national emission inventories with efficient HPC methods, the system improves the representativeness of emissions in CTMs, strengthens collaboration between emission inventory compilers and air quality modellers, and enables more detailed and realistic simulations for policy development and operational forecasting.

HERMES_Δ is currently being implemented as the emission core of the official Spanish air quality forecasting system operated by the Spanish Meteorological Agency (AEMET).

How to cite: Tena, C., Guevara Vilardell, M., Gehlen, J., Camps Pla, P., Collado, O., Rizza, L., and Herrero, L.: HERMES_Delta: An open source, python-based, parallel software to process official emission inventories and support air quality modelling efforts in Spain, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-12783, https://doi.org/10.5194/egusphere-egu26-12783, 2026.

X4.105
|
EGU26-14316
|
ECS
Abdulaziz Alabduljalil, Nada Alsulaiman, Yousef Alosairi, and Tahani Hussain

High-resolution coastal hydrodynamic models are increasingly used to support environmental assessment, crisis mitigation, and forecasting. Yet these models are often constrained by the available computing resources, forcing compromises in resolution and quality. High-Performance Computing (HPC) thus becomes essential to increase simulation speeds while maintaining high resolutions. In this study, we present a benchmarking of resource configurations for a coastal hydrodynamic model built with Delft3D Flexible Mesh (D-Flow FM), utilizing HPC resources while focusing on parallel performance, scalability, and efficiency. Benchmarking experiments were run comparing two MPI libraries, MPICH and Intel MPI, across multiple CPU core counts and partition combinations on both two-dimensional and three-dimensional model configurations, including barotropic and baroclinic setups. The results show how runtime performance varies with the hydrodynamic configuration, MPI implementation, and HPC parallel partition, and how HPC hardware can affect which combination is best. The goal is to provide guidance on finding optimal HPC configurations, including resource allocation and MPI library choice, when running coastal hydrodynamic models of high resolution and quality.
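The scalability and parallel-efficiency figures such a benchmark produces follow directly from the measured wall-clock times. A minimal sketch (the function name and the example timings are invented; the baseline is simply the smallest core count measured):

```python
def scaling_metrics(timings):
    """Compute parallel speedup and efficiency from wall-clock timings
    given as {core_count: runtime_seconds}. The smallest core count is
    taken as the baseline configuration."""
    base_cores = min(timings)
    base_time = timings[base_cores]
    out = {}
    for cores, t in sorted(timings.items()):
        speedup = base_time / t
        efficiency = speedup / (cores / base_cores)
        out[cores] = (round(speedup, 2), round(efficiency, 2))
    return out

# Hypothetical runtimes of one model configuration:
print(scaling_metrics({8: 1000.0, 16: 550.0, 32: 320.0}))
```

Comparing these curves across MPI libraries and partition layouts is what reveals the best combination for a given machine.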

How to cite: Alabduljalil, A., Alsulaiman, N., Alosairi, Y., and Hussain, T.: High-Performance Computing Benchmarking for Coastal Hydrodynamic Modelling Using Delft3D Flexible Mesh, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-14316, https://doi.org/10.5194/egusphere-egu26-14316, 2026.

GIS / AI
X4.106
|
EGU26-14912
|
ECS
Gabriele Esposito, Roberta Ravanelli, and Mattia Crespi

This study is part of a broader effort to modernize the Italian gravimetric database and to support the computation of the new national geoid. Its primary aim is the integration of historical gravimetric measurements with modern observations, including data from ongoing airborne surveys, to establish a consistent framework for analyzing temporal variations of the gravity field across Italy. The current study addresses the initial phase of this effort, including the digitization of historical records, the transformation of legacy coordinates into the official Italian geodetic reference frame, a preliminary GIS-based visualization, and the design of a unified database for future spatial and temporal analyses.

Historical gravimetric records from major volumes edited by the former Italian Geodetic Commission (Ballarin, 1936; Cunietti & Inghilleri, 1955; Riccò, 1903; Solaini, 1939; Soler, 1930), covering the late 19th century to the 1960s, were digitized. Pages were scanned at high resolution, and image enhancement techniques, including noise reduction, contrast adjustment, and edge sharpening, were applied to improve legibility and data extraction.

Digitization employed AI-based optical character recognition (OCR) using DeepSeek OCR (Wei et al., 2025), supported by ChatGPT-4 and ChatGPT-5 (OpenAI, 2023, 2025) for table-structure interpretation. This workflow enabled accurate recognition of degraded or complex tables, merged cells, and inconsistent delimiters. Data were initially stored in editable Excel spreadsheets as an intermediate validation step to verify, correct, and standardize key parameters, including geographic coordinates, orthometric height, absolute gravity measurements, year of observation, and survey campaign information. Historical coordinates referring to old Italian datums (Roma1940, ED1950, or other local datums) were converted to WGS84 (EPSG:4326) to ensure compatibility with modern measurements. A key challenge stemmed from the heterogeneity of the legacy reference frames, which required accurate datum transformations for reliable integration with contemporary datasets.
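Datum conversions of this kind are commonly expressed as a seven-parameter Helmert transformation between geocentric Cartesian frames. The small-angle sketch below is generic: sign conventions vary between frameworks, and real Roma1940/ED1950-to-WGS84 parameters must come from official sources; all values here are placeholders.

```python
def helmert7(x, y, z, tx=0.0, ty=0.0, tz=0.0, rx=0.0, ry=0.0, rz=0.0, s=0.0):
    """Small-angle 7-parameter Helmert transformation: translations in
    metres, rotations in radians, scale in ppm. Illustrative only;
    operational conversions use officially published parameters/grids."""
    f = 1.0 + s * 1e-6
    xn = tx + f * (x - rz * y + ry * z)
    yn = ty + f * (rz * x + y - rx * z)
    zn = tz + f * (-ry * x + rx * y + z)
    return xn, yn, zn

# With all parameters zero the transformation is the identity:
print(helmert7(4641949.0, 1393045.0, 4133287.0))  # -> (4641949.0, 1393045.0, 4133287.0)
```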

Following digitization and coordinate conversion, historical data are being prepared for integration with modern gravimetric measurements from the national network and ongoing airborne surveys. Initial GIS-based visualization provides an early assessment of spatial coverage and potential inconsistencies. The unified database is designed to manage spatial variability and temporal evolution of gravity and is scalable to accommodate future datasets.

Once fully established, the dataset will undergo quality control and validation using statistical and geospatial methods. While temporal gravity modeling lies beyond the scope of this contribution, the proposed workflow lays a solid foundation for subsequent analyses.


References

Ballarin, S., 1936: Trentadue determinazioni di gravità relativa. Commissione geodetica italiana.

Cunietti, M., Inghilleri, G., 1955: Rete Gravimetrica Fondamentale Italiana. Commissione geodetica italiana.

OpenAI. 2023. GPT‑4 Technical Report: https://cdn.openai.com/papers/gpt-4.pdf.

OpenAI. 2025. GPT‑5 System Card (Technical Overview): https://cdn.openai.com/gpt-5-system-card.pd

Riccò, A., 1903: Determinazione della Gravità Relativa in 43 Luoghi della Sicilia Orientale delle Calabrie. Memorie della Società Degli Spettroscopisti Italiani.

Soler, E., 1930: Due Campagne Gravimetriche sul Carso. Università di Padova.

Solaini, L., 1939: Determinazione di gravità relativa eseguite a Castelnuovo Scrivia, Tortona, Alessandria, Valmadonna, S. Salvatore Monferrato e Sannazzaro De' Burgondi nell'anno 1939. Commissione geodetica italiana.

Wei, H., Sun, Y., Li, Y., 2025: DeepSeek-OCR: Contexts Optical Compression.



How to cite: Esposito, G., Ravanelli, R., and Crespi, M.: Integration of Historical and Modern Gravimetric Data to Model the Temporal Variation of the Gravity Field over Italy, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-14912, https://doi.org/10.5194/egusphere-egu26-14912, 2026.

Data Compression
X4.107
|
EGU26-1872
|
ECS
Nicoletta Farabullini and Christos Kotsalos

As Earth System Sciences (ESS) datasets from high-resolution models reach petabyte scales, the scientific community encounters severe constraints in storage, transfer efficiency, and data accessibility. Identifying the right parameters for high compression ratios with strict scientific fidelity within the vast ecosystem of lossy and lossless compression algorithms is a complex and delicate technical challenge.

We present dc_toolkit (https://github.com/C2SM/data-compression): an open-source, parallelized pipeline designed to help researchers navigate this complex landscape. It provides a set of user-friendly, customizable command-line tools that let users make informed, data-driven decisions. By systematically evaluating over 40,000 combinations of compressors, filters, and serializers, it autonomously identifies the most suitable configuration for both structured and unstructured data with single or multiple variables.

The workflow comprises three stages: (1) Evaluation & Optimization: the toolkit leverages parallel processing (via Dask and mpi4py) to rapidly evaluate combinations while filtering out those that violate scientific precision requirements and user-defined error tolerances (L-norms). (2) Analysis & Visualization: to help scientists analyze the trade-offs between data reduction and information loss, the tool performs k-means clustering on the outputs to display clear and organized results. It also provides spatial error plotting to verify that domain-specific features (such as periodicity in global grids) are preserved. (3) Application & Interoperability: once the user has decided on a specific configuration, the toolkit handles the high-throughput compression of the dataset into Zarr-based storage. It ensures seamless integration into existing workflows by including utilities for tasks such as inspecting compressed files and converting compressed data back to standard NetCDF format.
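The evaluation stage can be pictured with a toy example: enumerate candidate lossy configurations, discard any that violate an L-infinity error tolerance, and keep the admissible one with the best compression ratio. The following is a minimal numpy/zlib sketch of that idea, not dc_toolkit's actual code; `bit_round`, `evaluate`, and the parameter names are illustrative.

```python
import zlib
import numpy as np

def bit_round(a: np.ndarray, keepbits: int) -> np.ndarray:
    """Keep only `keepbits` of float32's 23 mantissa bits (round-to-nearest).
    Assumes 1 <= keepbits <= 22."""
    bits = a.astype(np.float32).view(np.uint32)
    drop = 23 - keepbits
    half = np.uint32(1 << (drop - 1))                     # for rounding up
    mask = np.uint32((0xFFFFFFFF << drop) & 0xFFFFFFFF)   # zeroes the dropped bits
    return ((bits + half) & mask).view(np.float32)

def evaluate(data, keepbits_options, linf_tol):
    """Score each candidate configuration, discard those violating the
    user-defined L-infinity error tolerance, and return the best survivor."""
    raw_size = data.astype(np.float32).nbytes
    admissible = []
    for keepbits in keepbits_options:
        approx = bit_round(data, keepbits)
        linf = float(np.max(np.abs(approx - data)))
        if linf > linf_tol:           # violates the precision requirement
            continue
        size = len(zlib.compress(approx.tobytes(), level=6))
        admissible.append({"keepbits": keepbits,
                           "ratio": raw_size / size,
                           "linf": linf})
    # most suitable admissible configuration = highest compression ratio
    return max(admissible, key=lambda r: r["ratio"])
```

In a real pipeline each candidate in the loop is independent, which is what makes the Dask/mpi4py parallelization of this stage natural.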

By providing a streamlined, automated, and verifiable method for selecting compression parameters, dc_toolkit lowers the entry barrier to lossy compression. It allows ESS researchers to apply data reduction strategies more easily, confident that the integrity of their downstream analyses remains intact. Accessibility is further enhanced through web-based tools and GUI implementations for users with diverse technical backgrounds.

How to cite: Farabullini, N. and Kotsalos, C.: dc_toolkit: A parallelized pipeline to navigate the complex ecosystem of compression algorithms, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-1872, https://doi.org/10.5194/egusphere-egu26-1872, 2026.

X4.108
|
EGU26-9673
|
ECS
Juniper Tyree, Daniel Köhler, Robert Underwood, Clément Bouvier, Tim Reichelt, Heikki Järvinen, and Milan Klöwer

The volume of data produced by Earth System Science models, e.g. high-resolution weather and climate models, is increasing faster than the methods and budgets for storing, sharing, and analysing this data. To reduce data sizes, lossy data compression methods discard some quality, details, or precision of the original data. Even though some lossy compressors promise size reductions of 100x or more, the lack of trust in lossy compression, rooted in the fear of losing important information, has so far limited their adoption.

We introduce compression safeguards to help overcome this trust gap by

(i) enabling scientist users to precisely express their (general or specific) safety requirements for lossy compression, e.g. preserving specific values, regionally varying error bounds on the data or quantities derived from it, or any logical combination thereof,

(ii) securing any (existing) (lossy) compressor with the corresponding safeguards, which then

(iii) guarantee that the safety requirements are always met by the safeguarded compressor.

Compression safeguards thus provide a unified and flexible interface for specifying and guaranteeing user safety requirements that works with any existing compressor. They shift the burden of trust in fulfilling these requirements away from specific compressor implementations: with the appropriate safeguards, even untrusted, potentially unsafe compressors can be used safely. We hope that compression safeguards will give Earth System scientists the guarantees they need to adopt lossy compression without fear, thereby helping to unlock its data-reduction benefits for the Earth System Science community.
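The mechanism in (i)-(iii) can be illustrated with a toy safeguard that enforces an absolute pointwise error bound: after the untrusted compressor runs, every point that violates the bound receives an exact, sparsely stored correction, so the decompressed output is guaranteed to satisfy the requirement. This is a conceptual numpy sketch under our own simplifying assumptions, not the compression-safeguards implementation; the function names are illustrative.

```python
import numpy as np

def safeguard_compress(data, lossy_round, abs_bound):
    """Wrap an arbitrary (untrusted) lossy approximation `lossy_round` so the
    absolute pointwise error never exceeds `abs_bound`.
    Returns the lossy payload plus sparse corrections."""
    approx = lossy_round(data)
    bad = np.abs(approx - data) > abs_bound   # points violating the requirement
    idx = np.flatnonzero(bad)
    corrections = data.ravel()[idx]           # store exact values sparsely
    return approx, idx, corrections

def safeguard_decompress(approx, idx, corrections):
    """Apply the sparse corrections; the result always meets the bound."""
    out = approx.copy()
    out.ravel()[idx] = corrections
    return out
```

If the compressor already respects the bound almost everywhere, the corrections are sparse and the overhead on the compression ratio stays small, matching the behaviour reported below.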

We will showcase how our reference implementation, compression-safeguards (https://compression-safeguards.readthedocs.io/en/latest/), can be applied to safeguard important properties in several real-world meteorological examples; evaluate its impact on compression ratio (small when corrections are sparse) and on computational cost at compression (significant) and decompression (negligible) time; and discuss the future pathway towards safe and fearless lossy compression.

How to cite: Tyree, J., Köhler, D., Underwood, R., Bouvier, C., Reichelt, T., Järvinen, H., and Klöwer, M.: Compression Safeguards - Towards Safe and Fearless Lossy Compression of Earth System Data, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-9673, https://doi.org/10.5194/egusphere-egu26-9673, 2026.

Posters virtual: Wed, 6 May, 14:00–18:00 | vPoster spot 1b

The posters scheduled for virtual presentation are given in a hybrid format for on-site presentation, followed by virtual discussions on Zoom. Attendees are asked to meet the authors during the scheduled presentation & discussion time for live video chats; onsite attendees are invited to visit the virtual poster sessions at the vPoster spots (equal to PICO spots). If authors uploaded their presentation files, these files are also linked from the abstracts below. The button to access the Zoom meeting appears just before the time block starts.
Discussion time: Wed, 6 May, 16:15–18:00
Display time: Wed, 6 May, 14:00–18:00
Chairperson: Andrea Barone

EGU26-6232 | Posters virtual | VPS22

Application of advanced lossy compression in the NetCDF ecosystem for CONUS404 data 

Shaomeng Li, Allison Baker, and Lulin Xue
Wed, 06 May, 14:09–14:12 (CEST)   vPoster spot 1b

Many geoscientific datasets, such as those produced by climate and weather models, are stored in the NetCDF file format. These datasets are typically very large and often strain institutional data storage resources. While lossy compression methods for scientific data have become more studied and adopted in recent years, most advanced lossy approaches do not work easily or transparently with NetCDF files. For example, they may require a file format conversion, or they may not work correctly with the “missing values” or “fill values” that are often present in model outputs. While lossy quantization approaches such as BitRound and Granular BitRound have built-in NetCDF support and are quite easy to use, they are generally not able to reduce the data size as much as more advanced compressors (for a fixed error metric), such as SPERR, ZFP, or SZ3.

We are particularly interested in reducing the size of the CONUS404 dataset. CONUS404 is a unique, publicly available high-resolution hydro-climate dataset produced by Weather Research and Forecasting (WRF) Model simulations covering the CONtiguous United States (CONUS) for 40 years at 4-km resolution (a collaboration between the NSF National Center for Atmospheric Research and the U.S. Geological Survey Water Mission Area).

Here, we investigate one advanced lossy compressor, SPERR [1], together with its plugin for NetCDF files, H5Z-SPERR [2], in a Python-based workflow to compress and analyze CONUS404 data. SPERR is attractive due to its support for quality control in terms of both maximum point-wise error (PWE) and peak signal-to-noise ratio (PSNR), enabling easy experimentation with storage-quality tradeoffs. Further, given a target quality metric, previous work has shown that SPERR likely produces the smallest compressed file size among advanced compressors. It leverages the HDF5 dynamic plugin mechanism to let users stay in the NetCDF ecosystem with minimal to no change to existing analysis workflows, wherever a typical NetCDF file can be read. And, importantly for our work, the SPERR plugin supports efficient masking of “missing values,” which are common in climate and weather model output. This support enables compression of many variables that are not naturally handled by other advanced compressors relying on HDF5 plugins. Further, because H5Z-SPERR directly handles missing values, they can be stored in a much more compact format (and are restored during decompression), further improving compression efficiency. (Note that the built-in NetCDF quantization approaches can also handle missing values.)
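For reference, the two quality metrics SPERR can bound, maximum point-wise error (PWE) and peak signal-to-noise ratio (PSNR), can be computed for an original/decompressed pair as follows. This is an illustrative numpy sketch (using the data range as the PSNR peak), not SPERR's internal code, and the function names are our own.

```python
import numpy as np

def max_pointwise_error(orig, recon):
    """Maximum absolute point-wise error (PWE), skipping NaN 'missing values'."""
    return float(np.nanmax(np.abs(recon - orig)))

def psnr_db(orig, recon):
    """Peak signal-to-noise ratio in dB; here 'peak' is the data range."""
    mse = float(np.nanmean((recon - orig) ** 2))
    peak = float(np.nanmax(orig) - np.nanmin(orig))
    return 10.0 * np.log10(peak ** 2 / mse)
```

Fixing a target on either metric and comparing compressed sizes is exactly the storage-quality experiment that SPERR's two quality-control modes make convenient.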

Our experiments demonstrate the benefit of enabling advanced lossy (de)compression in the NetCDF ecosystem: adoption friction is kept to a minimum with little change to workflows, while storage requirements are greatly reduced.

 

[1] https://github.com/NCAR/SPERR

[2] https://github.com/NCAR/H5Z-SPERR

How to cite: Li, S., Baker, A., and Xue, L.: Application of advanced lossy compression in the NetCDF ecosystem for CONUS404 data, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-6232, https://doi.org/10.5194/egusphere-egu26-6232, 2026.
