"SEP-Graph: finding shortest execution paths for graph processing under
a hybrid framework on GPU"

Hao Wang, Liang Geng, Rubao Lee, Kaixi Hou, Yanfeng Zhang, Xiaodong Zhang

Proceedings of the 24th ACM SIGPLAN Annual Symposium on Principles and Practice
of Parallel Programming (PPoPP 2019), Washington DC, USA, February 16-20, 2019.


In general, the performance of parallel graph processing is
determined by three pairs of critical parameters, namely synchronous
or asynchronous execution mode (Sync or Async),
Push or Pull communication mechanism (Push or Pull), and
Data-driven or Topology-driven traversing scheme (DD or
TD). Choosing among these options increases the complexity of
programming and system implementation on GPUs. Existing
graph-processing frameworks mainly use a single combination
throughout the entire execution of a given application, but we
have observed that their performance is variable and suboptimal.
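
Because each of the three parameters is a binary choice, there are eight
possible combinations in total. The short Python sketch below simply
enumerates them to make that space concrete. The names MODES, COMMS,
TRAVERSALS, and all_combinations are illustrative assumptions and do not
come from the paper or from any framework's API.

```python
from itertools import product

# The three binary parameters named in the abstract.
MODES = ("Sync", "Async")      # execution mode
COMMS = ("Push", "Pull")       # communication mechanism
TRAVERSALS = ("DD", "TD")      # data-driven or topology-driven traversal


def all_combinations():
    """Return every (mode, communication, traversal) triple."""
    return list(product(MODES, COMMS, TRAVERSALS))


if __name__ == "__main__":
    for combo in all_combinations():
        print("-".join(combo))
```

A framework that fixes one combination for the whole run uses a single
entry of this list for every iteration.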

In this paper, we present SEP-Graph, a highly efficient
software framework for graph processing on GPUs. Its hybrid
execution mode switches automatically among the three
pairs of parameters, with the objective of achieving the shortest
execution time in each iteration. We also apply a set of
optimizations to SEP-Graph, considering the characteristics
of graph algorithms and underlying GPU architectures. We
show the effectiveness of SEP-Graph based on our intensive
and comparative performance evaluation on NVIDIA
1080, P100, and V100 GPUs. Compared with the existing, representative
GPU graph-processing frameworks Groute and
Gunrock, SEP-Graph can reduce execution time by up to 45.8
times and 39.4 times, respectively.
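
The idea of choosing a combination per iteration can be sketched in a few
lines of Python. This is only an illustration under stated assumptions: the
abstract does not describe how SEP-Graph estimates execution time or makes
its decision, so the cost model toy_cost, the state field active_fraction,
and the function pick_fastest are all hypothetical and invented for this
example.

```python
from itertools import product


def pick_fastest(estimate_cost, state):
    """Return the combination with the smallest estimated cost for one iteration."""
    combos = product(("Sync", "Async"), ("Push", "Pull"), ("DD", "TD"))
    return min(combos, key=lambda combo: estimate_cost(combo, state))


def toy_cost(combo, state):
    """Hypothetical cost model; NOT the one used by SEP-Graph."""
    mode, comm, trav = combo
    frac = state["active_fraction"]  # assumed share of active vertices, 0..1
    cost = 0.0
    # Assumption: data-driven work grows with the active set,
    # while topology-driven work is a fixed scan of the graph.
    cost += frac if trav == "DD" else 0.5
    # Assumption: push is cheap for small active sets, pull for large ones.
    cost += frac if comm == "Push" else 1.0 - frac
    # Assumption: synchronous execution pays a larger fixed barrier overhead.
    cost += 0.1 if mode == "Sync" else 0.05
    return cost


if __name__ == "__main__":
    # Re-evaluate the choice in every iteration as the active set changes.
    for frac in (0.1, 0.9):
        print(frac, pick_fastest(toy_cost, {"active_fraction": frac}))
```

With this toy model, a small active set selects Async, Push, DD and a large
one selects Async, Pull, TD, which shows how the best combination can change
from one iteration to the next.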