Paper ID: 2204.08150

Characterizing and Understanding Distributed GNN Training on GPUs

Haiyang Lin, Mingyu Yan, Xiaocheng Yang, Mo Zou, Wenming Li, Xiaochun Ye, Dongrui Fan

Graph neural networks (GNNs) have been demonstrated to be powerful models in many domains because of their effectiveness in learning over graphs. To scale GNN training to large graphs, a widely adopted approach is distributed training, which accelerates training by using multiple computing nodes. Maximizing performance is essential, but the execution of distributed GNN training is still only preliminarily understood. In this work, we provide an in-depth analysis of distributed GNN training on GPUs, revealing several significant observations and providing useful guidelines for both software and hardware optimization.

Submitted: Apr 18, 2022

Topics

Graph Neural Network
Human Understanding
Single GPU
New Characterization
Large Graph
Hardware Design Optimization
