cuSZp2: A GPU Lossy Compressor with Extreme Throughput and Optimized Compression Ratio
Conference proceeding

Yafan Huang, Sheng Di, Guanpeng Li and Franck Cappello
Proceedings of the International Conference for High Performance Computing, Networking, Storage, and Analysis, pp.1-18
ACM Conferences
SC '24: The International Conference for High Performance Computing, Networking, Storage, and Analysis
11/17/2024
DOI: 10.1109/SC41406.2024.00021

Abstract

Existing GPU lossy compressors suffer from expensive data movement overheads, inefficient memory access patterns, and high synchronization latency, resulting in limited throughput. This work proposes cuSZp2, a generic single-kernel error-bounded lossy compressor that runs purely on GPUs and is designed for applications that require high speed, such as large-scale GPU simulation and large language model training. In particular, cuSZp2 introduces a novel lossless encoding method, optimizes memory access patterns, and hides synchronization latency, achieving extreme end-to-end throughput and optimized compression ratio. Experiments on an NVIDIA A100 GPU with 9 real-world HPC datasets demonstrate that, even with higher compression ratios and data quality, cuSZp2 delivers on average 332.42 GB/s and 513.04 GB/s end-to-end throughput for compression and decompression, respectively, which is around 2× that of existing pure-GPU compressors and 200× that of CPU-GPU hybrid compressors.
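
The term "error-bounded" in the abstract can be illustrated with a minimal sketch. The abstract does not describe cuSZp2's actual pipeline, so the code below is not the authors' method. It assumes a generic uniform scalar quantizer with an absolute error bound, and the names quantize, dequantize, and max_abs_error are hypothetical helpers introduced only for this illustration.

```python
# Illustrative only: generic absolute-error-bounded quantization,
# NOT the cuSZp2 algorithm (the abstract does not specify it).

def quantize(values, error_bound):
    """Map each float to the index of a bin of width 2 * error_bound."""
    bin_width = 2.0 * error_bound
    return [round(v / bin_width) for v in values]


def dequantize(codes, error_bound):
    """Reconstruct each value as the center of its bin."""
    bin_width = 2.0 * error_bound
    return [c * bin_width for c in codes]


def max_abs_error(original, reconstructed):
    """Largest pointwise absolute difference between two sequences."""
    return max(abs(a - b) for a, b in zip(original, reconstructed))


if __name__ == "__main__":
    data = [0.0, 0.1234, -3.75, 2.5, 100.001]  # made-up sample values
    eb = 0.01                                   # assumed absolute error bound
    codes = quantize(data, eb)
    restored = dequantize(codes, eb)
    # The reconstruction error stays within eb, up to floating-point rounding.
    print(codes, max_abs_error(data, restored))
```

In a scheme of this kind the integer codes, not the original floats, are what a subsequent lossless encoding stage would compress. The error bound holds because every value lies within half a bin width of its bin center.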
Computer systems organization -- Architectures -- Parallel architectures -- Single instruction, multiple data
Computing methodologies -- Computer graphics -- Graphics systems and interfaces -- Graphics processors
Computing methodologies -- Computer graphics -- Image compression
Information systems -- Data management systems -- Data structures -- Data layout -- Data compression
Theory of computation -- Design and analysis of algorithms -- Data structures design and analysis -- Data compression
