Computing Community Consortium Highlights YSmart


The Computing Community Consortium (CCC) chose the YSmart software tool as its notable research highlight of the week. Established in 2006 by the National Science Foundation, the CCC is "a voice for the national computing research community. The CCC facilitates the development of a bold, multi-themed vision for computing research and communicates that vision to a wide range of major stakeholders."

YSmart is a software tool designed to improve the productivity of big data processing. It automatically translates SQL queries into MapReduce programs that run on the Hadoop platform. The elegance of YSmart lies in its ability to automatically detect and exploit intra-query correlations when translating a complex SQL query into a series of MapReduce programs. YSmart has been adopted by Apache Hive, which is used by organizations such as Facebook, LinkedIn, Microsoft, Netflix and Taobao.
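To give a rough sense of what exploiting an intra-query correlation can mean, here is a minimal sketch in Python. It is not YSmart's code or its actual algorithm. It assumes a hypothetical table `clicks(user, amount)` and a query that joins two aggregations of that table (a per-user SUM and a per-user COUNT) on the same key. The helper `run_mapreduce` is an in-memory stand-in for a single MapReduce job, not a Hadoop API. A one-operation-per-job translation uses three jobs, while a translation that notices both aggregations read the same table with the same grouping key can use one.

```python
from collections import defaultdict

def run_mapreduce(records, mapper, reducer):
    """In-memory stand-in for ONE MapReduce job (hypothetical helper, not Hadoop)."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):      # map phase emits (key, value) pairs
            groups[key].append(value)          # shuffle: group values by key
    output = []
    for key in sorted(groups):                 # reduce phase, one call per key
        output.extend(reducer(key, groups[key]))
    return output

# Hypothetical input table clicks(user, amount).
clicks = [("alice", 3), ("bob", 5), ("alice", 4), ("bob", 1), ("carol", 2)]

def naive_plan(rows):
    """One job per SQL operation: SUM, COUNT, then JOIN (three jobs)."""
    sums = run_mapreduce(rows, lambda r: [(r[0], r[1])],
                         lambda k, vs: [(k, sum(vs))])
    counts = run_mapreduce(rows, lambda r: [(r[0], 1)],
                           lambda k, vs: [(k, sum(vs))])
    # Tag each intermediate row with its source so the join reducer can tell them apart.
    tagged = [(k, ("s", s)) for k, s in sums] + [(k, ("c", c)) for k, c in counts]

    def join_reducer(key, values):
        fields = dict(values)
        return [(key, fields["s"], fields["c"])]

    joined = run_mapreduce(tagged, lambda r: [r], join_reducer)
    return joined, 3                           # result rows, number of jobs used

def merged_plan(rows):
    """Both aggregations share the input table and the grouping key,
    so a single job can compute SUM and COUNT together (one job)."""
    result = run_mapreduce(rows, lambda r: [(r[0], r[1])],
                           lambda k, vs: [(k, sum(vs), len(vs))])
    return result, 1

if __name__ == "__main__":
    print(naive_plan(clicks))
    print(merged_plan(clicks))
```

Both plans return the same rows; the merged plan scans the table once and avoids writing and re-reading the intermediate results, which is the kind of saving a correlation-aware translation aims for.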

"I am glad to see that the industry has quickly adopted another research from us for big data processing software ecosystems after adopting RCFile, a data placement structure for storing big data in distributed systems we developed in collaboration with software engineers of Facebook," said Xiaodong Zhang, the principal investigator of the big data research project, and the Robert M. Chritchfield Professor in Engineering and Chair of the CSE Department. "This is an exciting time to do research on big data, which makes a direct impact on technology advancement to benefit the society."

The YSmart team includes Buckeyes Yin Huai, Rubao Lee, Tian Luo, Meisam Fathi Salmi, Yuan Yuan, and Xiaodong Zhang, as well as outside collaborators Yongqiang He of Facebook and Fusheng Wang of Emory University.