Logo image
Lossy Compression of Scientific Data: Applications Constrains and Requirements
Preprint   Open access

Lossy Compression of Scientific Data: Applications Constrains and Requirements

Franck Cappello, Allison Baker, Ebru Bozda, Martin Burtscher, Kyle Chard, Sheng Di, Paul Christopher O'Grady, Peng Jiang, Shaomeng Li, Erik Lindahl, …
ArXiV.org
Cornell University
03/25/2025
DOI: 10.48550/arxiv.2503.20031
url
https://doi.org/10.48550/arxiv.2503.20031View
Preprint (Author's original)This preprint has not been evaluated by subject experts through peer review. Preprints may undergo extensive changes and/or become peer-reviewed journal articles. Open Access

Abstract

Increasing data volumes from scientific simulations and instruments (supercomputers, accelerators, telescopes) often exceed network, storage, and analysis capabilities. The scientific community's response to this challenge is scientific data reduction. Reduction can take many forms, such as triggering, sampling, filtering, quantization, and dimensionality reduction. This report focuses on a specific technique: lossy compression. Lossy compression retains all data points, leveraging correlations and controlled reduced accuracy. Quality constraints, especially for quantities of interest, are crucial for preserving scientific discoveries. User requirements also include compression ratio and speed. While many papers have been published on lossy compression techniques and reference datasets are shared by the community, there is a lack of detailed specifications of application needs that can guide lossy compression researchers and developers. This report fills this gap by reporting on the requirements and constraints of nine scientific applications covering a large spectrum of domains (climate, combustion, cosmology, fusion, light sources, molecular dynamics, quantum circuit simulation, seismology, and system logs). The report also details key lossy compression technologies (SZ, ZFP, MGARD, LC, SPERR, DCTZ, TEZip, LibPressio), discussing their history, principles, error control, hardware support, features, and impact. By presenting both application needs and compression technologies, the report aims to inspire new research to fill existing gaps.
Computer Science - Computational Engineering, Finance, and Science Physics - Instrumentation and Methods for Astrophysics

Details

Metrics

11 Record Views
Logo image