Best Student Paper at Cluster '13


Rong Shi is the lead author of the paper that won the Best Student Paper award at the IEEE International Conference on Cluster Computing (Cluster '13), held in Indianapolis, Indiana, USA. The conference is a premier venue for the latest developments in cluster computing technologies and practices.

The winning paper, "A Scalable and Portable Approach to Accelerate Hybrid HPL on Heterogeneous CPU-GPU Clusters," presents a simple yet elegant approach that lets modern clusters fully utilize all of their computing resources, across both CPU nodes and GPU nodes. High Performance Linpack (HPL) remains the yardstick for ranking supercomputers around the world. Many clusters of different scales are being deployed with only a subset of nodes equipped with NVIDIA GPU accelerators. However, the true peak performance of these clusters goes unreported because no version of HPL can take advantage of all the available CPU and GPU resources. The proposed hybrid design balances the workload between CPU and GPU nodes through a fine-grained weighted distribution of MPI processes, and it uses techniques such as process reordering to minimize communication overhead and achieve close to peak performance.
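
The idea of a weighted process distribution can be illustrated with a minimal sketch. This is not the paper's algorithm: the function name, the per-node throughput figures, and the search bound are all hypothetical, chosen only to show how process counts per node type might be picked so that each MPI process ends up with roughly the same share of compute capability.

```python
def weighted_process_counts(cpu_node_gflops, gpu_node_gflops, max_procs_per_node=8):
    """Pick MPI process counts (per CPU node, per GPU node) whose ratio
    best matches the ratio of per-node throughput.

    Hypothetical helper for illustration only; the inputs are assumed
    per-node sustained throughputs, not measured values.
    """
    target = gpu_node_gflops / cpu_node_gflops
    best = None
    for cpu_procs in range(1, max_procs_per_node + 1):
        for gpu_procs in range(1, max_procs_per_node + 1):
            error = abs(gpu_procs / cpu_procs - target)
            # Prefer the closest ratio; break ties with fewer total processes.
            candidate = (error, cpu_procs + gpu_procs, cpu_procs, gpu_procs)
            if best is None or candidate < best:
                best = candidate
    _, _, cpu_procs, gpu_procs = best
    return cpu_procs, gpu_procs


# Assumed example: a GPU node sustains three times the throughput of a CPU node.
print(weighted_process_counts(100.0, 300.0))  # prints (1, 3)
```

With equal-sized blocks of work per process, a GPU node in this assumed example would receive three times the work of a CPU node, which matches its assumed share of the compute capability.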

Test results on the Oakley Cluster at the Ohio Supercomputer Center show that the proposed design can achieve more than 80% of the combined actual peak performance of the CPU and GPU nodes. The hybrid design also delivers increases of 47% and 63% over the HPL performance that can be reported using only the CPU nodes or only the GPU nodes, respectively.

Rong Shi's co-authors on the paper include Sreeram Potluri (CSE Ph.D. candidate), Khaled Hamidouche (CSE postdoctoral researcher), Xiaoyi Lu (CSE postdoctoral researcher), Karen Tomko (Senior Researcher, Ohio Supercomputer Center), and Dhabaleswar K. Panda (CSE Professor).

Rong Shi is a Ph.D. student in the Department of Computer Science and Engineering at The Ohio State University. He is a member of the Network-Based Computing Laboratory led by Prof. D. K. Panda. His research interests include heterogeneous architectures and hybrid programming models. He received his BS and MS degrees in Computer Science and Engineering from the University of Electronic Science and Technology of China. Rong is currently involved in the design and development of MVAPICH2, a high-performance open-source implementation of MPI over InfiniBand, 10GigE/iWARP, and RoCE interconnects. This software is currently used by more than 2,085 organizations in 71 countries.