PEPPA-X: Finding Program Test Inputs to Bound Silent Data Corruption Vulnerability in HPC Applications
Conference proceeding   Open access

Md Hasanur Rahman, Aabid Shamji, Shengjian Guo and Guanpeng Li
SC '21: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis
SC '21: The International Conference for High Performance Computing, Networking, Storage and Analysis (St. Louis, Missouri, 11/14/2021–11/19/2021)
01/01/2021
DOI: 10.1145/3458817.3476147
URL: https://doi.org/10.1145/3458817.3476147
Published (Version of record) Open Access

Abstract

Transient hardware faults have become prevalent due to the shrinking size of transistors, leading to silent data corruptions (SDCs). Therefore, HPC applications need to be evaluated (e.g., via fault injections) and protected to meet the reliability target. In the evaluation, the target programs exercise with a set of given inputs which are usually from program benchmark suite. However, these inputs rarely manifest the SDC vulnerabilities, leading to over-optimistic assessment and unexpectedly higher failure rates in production. We propose PEPPA-X, which efficiently identifies the test inputs that estimate the bound of program SDC resiliency. Our key insight is that the SDC sensitivity distribution in a program often remains stationary across input space. Thereby, we can guide the search of SDC-bound inputs by a sampled distribution. Our evaluation shows that PEPPA-X can identify the SDC-bound input of a program that existing methods cannot find even with 5x more search time.
High Performance Computing; input fuzzing; fault injection; program analysis; error resilience; software testing; error propagation; silent data corruption; UIOWA OA Agreement
