Conference proceeding
Rethinking graph data placement for graph neural network training on multiple GPUs
ICS '22: Proceedings of the 36th ACM International Conference on Supercomputing, pp.1-10
ICS '22: 2022 International Conference on Supercomputing (Virtual Event, 06/28/2022–06/30/2022)
06/28/2022
DOI: 10.1145/3524059.3532384
Appears in: UI Libraries Support Open Access
Abstract
Graph partitioning is commonly used to divide graph data for parallel processing. While existing graph partitioning methods achieve good performance for traditional graph processing algorithms, they are unsatisfactory for data-parallel GNN training on GPUs. In this work, we rethink the graph data placement problem for large-scale GNN training on multiple GPUs. We find that loading input features is a performance bottleneck for GNN training on large graphs that cannot be stored on the GPU. To reduce the data loading overhead, we first propose a performance model of data movement between the CPU and GPUs in GNN training. Then, based on the performance model, we provide an efficient algorithm to divide and distribute the graph data onto multiple GPUs so that the data loading time is minimized. For cases where data placement alone cannot achieve good performance, we propose a locality-aware neighbor sampling technique to further reduce the data movement overhead without losing accuracy. Our experiments with graphs of different sizes on different numbers of GPUs show that our techniques not only achieve shorter data loading time but also incur much less preprocessing overhead than the existing graph partitioning methods.
Details
- Title
- Rethinking graph data placement for graph neural network training on multiple GPUs
- Creators
- Shihui Song - University of Iowa
- Peng Jiang - University of Iowa, Computer Science
- Resource Type
- Conference proceeding
- Publication Details
- ICS '22: Proceedings of the 36th ACM International Conference on Supercomputing, pp.1-10
- Conference
- ICS '22: 2022 International Conference on Supercomputing (Virtual Event, 06/28/2022–06/30/2022)
- DOI
- 10.1145/3524059.3532384
- Publisher
- Association for Computing Machinery (ACM)
- Grant note
- name: NSF, award: CCF-2028825
- Language
- English
- Date published
- 06/28/2022
- Academic Unit
- Computer Science
- Record Identifier
- 9984473237902771