OSU Software 'Souping-Up' Supercomputers


The MVAPICH software of Dr. D.K. Panda's team is driving the #5 Supercomputer in the world!

The Top500 Project has announced its newest list of the 500 most powerful supercomputers worldwide. Sandia National Laboratories' "Thunderbird," a 4,000-node (8,000-processor) cluster computer, is ranked #5. Thunderbird, the top commodity cluster in the world, uses Dell servers, an InfiniBand interconnect, and MVAPICH software, an OSU research product, to achieve this ranking, the highest to date for Sandia Labs.

Dr. D. K. Panda's team developed the MVAPICH software in 2002 and has continuously added new features and improved its performance since then. MVAPICH, pronounced "em-vah-pich," is short for "MPI for InfiniBand on VAPI Layer." The software connects traditional supercomputing application software (written with the Message Passing Interface (MPI) standard) with InfiniBand networking technology, speeding up data flow and computation. As supercomputers have moved away from ENIAC-like mammoths and are now built around clusters of individual PCs/servers called "nodes," the need for high-performance communication between the nodes over commodity networks has become a greater priority. According to Dr. Panda, it is in this transition that MVAPICH makes its impact.

"At some point, adding nodes to a cluster doesn't make the calculations go any faster, because it introduces communication and synchronization overheads, and researchers have to rely on software to manage communication between nodes effectively. MVAPICH takes that software a step further by connecting it with the emerging InfiniBand network technology."
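The scaling limit Dr. Panda describes can be illustrated with a toy cost model. The sketch below is not taken from MVAPICH or from Sandia's measurements: the function names, the 1,000-second workload, and the linear per-node overhead are all assumptions chosen only to show the shape of the effect. Compute time shrinks as nodes are added, communication cost grows, and past some point the total run time gets worse rather than better.

```python
# Toy model (hypothetical numbers): run time on a cluster of `nodes` machines.
# Assumption: the compute work divides evenly across nodes, while
# communication/synchronization cost grows linearly with the node count.

def run_time(nodes, compute_seconds=1000.0, overhead_per_node=0.1):
    """Estimated wall-clock seconds for a job spread over `nodes` nodes."""
    return compute_seconds / nodes + overhead_per_node * nodes


def speedup(nodes, **kwargs):
    """Speedup relative to running the same job on a single node."""
    return run_time(1, **kwargs) / run_time(nodes, **kwargs)


def best_node_count(max_nodes, **kwargs):
    """Node count (1..max_nodes) that minimizes the modeled run time."""
    return min(range(1, max_nodes + 1), key=lambda n: run_time(n, **kwargs))


if __name__ == "__main__":
    for n in (1, 10, 100, 1000, 4000):
        print(f"{n:5d} nodes: speedup {speedup(n):6.1f}x")
    print("best node count:", best_node_count(4000))
    # Cheaper communication moves the sweet spot to a larger cluster.
    print("with lower overhead:",
          best_node_count(4000, overhead_per_node=0.001))
```

In this model, lowering `overhead_per_node` stands in for faster inter-node communication: the same job keeps benefiting from more nodes before the overhead term takes over.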

MVAPICH (and MVAPICH2 - the counterpart version for the MPI-2 standard) is currently being used by more than 280 organizations worldwide. The other MVAPICH users on the Top500 list are: 20th - Virginia Tech (1100-node dual Apple Xserve 2.3 GHz cluster); 51st - Univ. of Sherbrooke in Canada (576-node dual Intel Xeon EM64T 3.6 GHz cluster); 277th - SARA in The Netherlands (272-node dual Intel Xeon EM64T 3.4 GHz cluster); and 305th - NERSC/LBNL (315-node dual Opteron 2.2 GHz cluster).

A version of MVAPICH is also integrated into the emerging OpenIB/Gen2 stack (http://www.openib.org). This enables a large number of users and Linux distributions to download the entire InfiniBand stack (including MVAPICH) and exploit the benefits of InfiniBand and MVAPICH on clusters. The OpenIB/Gen2 stack is gaining momentum in the InfiniBand and high-performance computing communities. Together with MVAPICH, it was used by more than 30 vendors (connected through SCINet-InfiniBand) on the exhibition floor of the Supercomputing '05 conference in Seattle (where the TOP500 list was announced).

According to its web site, the Top500 project "was started in 1993 to provide a reliable basis for tracking and detecting trends in high-performance computing. Twice a year, a list of the sites operating the 500 most powerful computer systems is assembled and released. The best performance on the Linpack benchmark (http://www.top500.org/lists/linpack.php) is used as performance measure for ranking the computer systems. The list contains a variety of information including the system specifications and its major application areas."

More details on the MVAPICH project can be obtained from http://nowlab.cse.ohio-state.edu/projects/mpi-iba/index.html