My research interests are in distributed systems, in particular fault
tolerance, scalability, and performance optimization.
My CV can be found here.
wPerf [paper, source code]:
this work builds a tool to identify the waiting events that limit the maximum throughput
of a multi-threaded application. To achieve this goal, wPerf first computes how a waiting event
affects the threads that wait for it directly; then wPerf builds a wait-for graph to compute whether
that impact can indirectly reach other threads. By combining these two techniques, wPerf
identifies the events that have a large impact on all threads.
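The indirect-impact step can be illustrated with a small sketch. This is not wPerf's actual algorithm or code. It is a minimal example that assumes a wait-for graph given as (waiter, waited_on) pairs and uses a hypothetical helper, `affected_threads`, to find every node that directly or indirectly waits on a given one.

```python
from collections import defaultdict, deque


def affected_threads(wait_edges, source):
    """Return every node that directly or indirectly waits on `source`.

    `wait_edges` is an iterable of (waiter, waited_on) pairs. The edge
    format and this function are illustrative assumptions, not wPerf's API.
    """
    # Reverse index: for each node, which nodes wait on it directly.
    waiters = defaultdict(set)
    for waiter, waited_on in wait_edges:
        waiters[waited_on].add(waiter)

    # Breadth-first search over the reversed wait-for edges.
    seen = set()
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for w in waiters[node]:
            if w not in seen and w != source:
                seen.add(w)
                queue.append(w)
    return seen


# Hypothetical example: worker2 waits on worker1, which waits on the disk.
edges = [("worker1", "disk"), ("worker2", "worker1"), ("logger", "network")]
impacted_by_disk = affected_threads(edges, "disk")
```

In this toy graph a slow disk reaches both worker1 (directly) and worker2 (indirectly), while the logger is unaffected.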
SafeTimer [paper, source code]:
this work enhances existing timeout detection protocols to tolerate long delays
in the OS and the application. At the heartbeat receiver, SafeTimer checks whether there
are any pending heartbeats before reporting a failure; at the heartbeat sender, SafeTimer
blocks the sender if it cannot send out heartbeats in time. We have proved that SafeTimer
prevents false failure reports despite arbitrary delays in the OS and the application.
This property allows existing protocols to relax their timing assumptions and use a shorter
timeout interval for faster failure detection.
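The receiver-side check can be sketched as follows. This is a simplified model under stated assumptions, not SafeTimer's implementation: the class `HeartbeatReceiver`, its queue of unprocessed heartbeats, and the use of plain numeric timestamps are all hypothetical and chosen only to show the idea of looking for pending heartbeats before declaring a failure.

```python
from collections import deque


class HeartbeatReceiver:
    """Toy receiver that drains pending heartbeats before reporting failure."""

    def __init__(self, timeout):
        self.timeout = timeout
        self.last_heartbeat = 0.0
        # Heartbeats that have arrived but have not been processed yet,
        # for example because the receiver thread was delayed (assumption).
        self.pending = deque()

    def enqueue(self, arrived_at):
        """Record a heartbeat that arrived at time `arrived_at`."""
        self.pending.append(arrived_at)

    def sender_failed(self, now):
        """Report failure only if no pending heartbeat falls within the timeout."""
        # Process everything that is still waiting before deciding.
        while self.pending:
            self.last_heartbeat = max(self.last_heartbeat, self.pending.popleft())
        return now - self.last_heartbeat > self.timeout


# A heartbeat arrived at t=8 but was still unprocessed when the timer fired at t=10.
receiver = HeartbeatReceiver(timeout=5.0)
receiver.enqueue(8.0)
delayed_case = receiver.sender_failed(now=10.0)

# No heartbeat at all: the timeout is a genuine one.
silent_case = HeartbeatReceiver(timeout=5.0).sender_failed(now=10.0)
```

A receiver that compared only its last processed heartbeat against the timeout would report a failure in the first case; draining the pending queue first avoids that false report.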
Hadoop metadata benchmark [paper, source code]:
this work builds benchmarks to test the HDFS NameNode and the YARN ResourceManager by running real experiments in a small testbed,
collecting the traces, and extrapolating these traces to a larger scale.
ThriftyPaxos [paper, source code]:
standard Paxos needs 2f+1 replicas to tolerate f failures. To reduce cost, ThriftyPaxos activates f+1
replicas first and activates backup ones when active ones fail. To ensure system availability
when copying data to the newly activated replica, ThriftyPaxos logically separates agreement
and execution and exploits a unique property from each: agreement only needs to decide what
the next request is, which allows
a blank agreement node to join the protocol instantly; execution requires only one node to reply
when processing a client's request, but requires f+1 nodes to reply when taking a snapshot, which
means that with fewer than f+1 replicas, we can still process clients' requests and only need
to delay the time-insensitive snapshot task.
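The replica counts above can be made concrete with a short sketch. The function names `replica_plan` and `execution_capabilities` are hypothetical and only encode the arithmetic stated in the description (2f+1 replicas in total, f+1 active, one reply to serve a request, f+1 replies to take a snapshot); they are not part of ThriftyPaxos itself.

```python
def replica_plan(f):
    """Split 2f+1 replicas into f+1 active ones and f backups."""
    if f < 0:
        raise ValueError("f must be non-negative")
    total = 2 * f + 1
    active = f + 1
    return {"total": total, "active": active, "backup": total - active}


def execution_capabilities(live_replicas, f):
    """Say what the execution layer can do with `live_replicas` up-to-date nodes."""
    return {
        # One reply is enough to answer a client's request.
        "process_requests": live_replicas >= 1,
        # A snapshot needs replies from f+1 nodes, so it may have to wait.
        "take_snapshot": live_replicas >= f + 1,
    }


plan = replica_plan(2)
# One active replica failed and its replacement is still copying data.
degraded = execution_capabilities(live_replicas=2, f=2)
```

With f = 2, three of the five replicas are active. If one of them fails, the two remaining up-to-date replicas can keep serving requests, and only the snapshot is postponed until the backup catches up.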
"Redio: Accelerating Disk-based Graph Processing by Reducing Disk I/Os", Chengwen Wu, Guangyan Zhang, Yang Wang, Xinyang Jiang, and Weimin Zheng, accepted by IEEE Transactions on Computers.