Efficient Metadata Services

Metadata services such as ZooKeeper and the HDFS NameNode provide critical functionality in today's datacenters. However, many of them were originally designed as single-server (possibly replicated) instances for simplicity, and they are becoming bottlenecks as the scale of the whole system grows. While much recent work investigates multi-server designs, sacrificing simplicity for scalability, we hope to investigate how far the single-server design can go. The observation that motivates our design is that, as the scale of the whole system grows, the capacity of the hardware on a single machine also grows significantly. Such hardware, however, usually performs well when there are many parallel tasks or large batches of data, whereas metadata services usually process small requests with strict ordering guarantees. This project will investigate how to process such metadata requests efficiently on today's hardware.

This project will include the study of the following topics: