CereSZ: Enabling and Scaling Error-bounded Lossy Compression on Cerebras CS-2
Abstract
Today's scientific applications running on supercomputers produce large volumes of data, leading to critical data storage and communication challenges. To tackle these challenges, error-bounded lossy compression is commonly adopted, since it can drastically reduce data size within a user-defined error threshold. Previous work has shown that compression techniques can significantly reduce storage and I/O overhead while retaining good data quality. However, existing compressors are mainly designed for CPUs and GPUs. As new AI chips are incorporated into supercomputers and increasingly used to accelerate scientific computing, there is a growing demand for efficient data compression on these new architectures. In this paper, we propose an efficient lossy compressor, CereSZ, for the Cerebras CS-2 system. The compression algorithm is mapped onto Cerebras using both data parallelism and pipeline parallelism. To achieve a balanced workload on each processing unit, we propose an algorithm that evenly distributes the pipeline stages. Our experiments with six scientific datasets demonstrate that CereSZ achieves throughputs of 227.93 GB/s to 773.8 GB/s, 2.43x to 10.98x faster than existing GPU compressors.
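The record does not include the paper's algorithmic details, but error-bounded lossy compressors in the SZ family, which CereSZ builds on, typically predict each value from previously reconstructed data and linearly quantize the residual so that every decompressed value stays within the user-defined error bound. The following is a minimal illustrative sketch of that idea, not the paper's implementation; the function names and the previous-value (1D Lorenzo-style) predictor are assumptions for illustration:

```python
import numpy as np

def compress_1d(data, error_bound):
    """Sketch: predict each value from the previous *reconstructed* value
    and quantize the residual in bins of width 2 * error_bound."""
    quant = np.empty(len(data), dtype=np.int64)   # integer codes (entropy-codable)
    recon = np.empty(len(data), dtype=np.float64) # what the decompressor will see
    prev = 0.0
    for i, x in enumerate(data):
        pred = prev                               # previous-value predictor (assumed)
        code = round((x - pred) / (2.0 * error_bound))
        quant[i] = code
        recon[i] = pred + code * 2.0 * error_bound
        prev = recon[i]                           # predict from reconstructed data
    return quant, recon

def decompress_1d(quant, error_bound):
    """Rebuild the reconstruction from the integer codes alone."""
    recon = np.empty(len(quant), dtype=np.float64)
    prev = 0.0
    for i, code in enumerate(quant):
        recon[i] = prev + code * 2.0 * error_bound
        prev = recon[i]
    return recon
```

Because rounding to the nearest bin keeps each residual within half a bin width, the pointwise error never exceeds the bound, and the decompressor reproduces the exact same reconstruction from the codes.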
Details
- Title
- CereSZ: Enabling and Scaling Error-bounded Lossy Compression on Cerebras CS-2
- Creators
- Shihui Song - University of Iowa, Iowa City, United States of America
- Yafan Huang - University of Iowa, Iowa City, United States of America
- Peng Jiang - University of Iowa, Iowa City, United States of America
- Xiaodong Yu - Stevens Institute of Technology
- Weijian Zheng - Argonne National Laboratory
- Sheng Di - Argonne National Laboratory
- Qinglei Cao - Saint Louis University
- Yunhe Feng - University of North Texas at Dallas
- Zhen Xie - Binghamton University
- Franck Cappello - Argonne National Laboratory
- Resource Type
- Conference proceeding
- Publication Details
- Proceedings of the 33rd International Symposium on High-Performance Parallel and Distributed Computing, pp.309-321
- Conference
- HPDC '24: 33rd International Symposium on High-Performance Parallel and Distributed Computing
- Series
- ACM Conferences
- DOI
- 10.1145/3625549.3658691
- Publisher
- ACM
- Grant note
U.S. Department of Energy, Office of Science, Advanced Scientific Computing Research (ASCR): DE-AC02-06CH11357; National Science Foundation: OAC-2003709, OAC-2104023, OAC-2311875; U.S. DOE Office of Science-Advanced Scientific Computing Research Program: DE-AC02-06CH11357
This research was supported by the U.S. Department of Energy, Office of Science, Advanced Scientific Computing Research (ASCR), under contract DE-AC02-06CH11357, and by the National Science Foundation under Grants OAC-2003709, OAC-2104023, and OAC-2311875. This research used resources of the Argonne Leadership Computing Facility, a U.S. Department of Energy (DOE) Office of Science user facility at Argonne National Laboratory, and is based on research supported by the U.S. DOE Office of Science-Advanced Scientific Computing Research Program, under Contract No. DE-AC02-06CH11357.
- Language
- English
- Date published
- 06/03/2024
- Academic Unit
- Computer Science
- Record Identifier
- 9984699518202771