``Catfish: adaptive RDMA-enabled R-tree for low latency and high throughput"

Mengbai Xiao, Hao Wang, Liang Geng, Rubao Lee, and Xiaodong Zhang

Proceedings of 39th IEEE International Conference on Distributed Computing Systems (ICDCS 2019)
Dallas, Texas, July 7-9, 2019.

Abstract

R-tree is a foundational data structure used in spatial
databases and scientific databases. With the advancement of
Internet and computer architectures, in-memory data processing
for R-tree in distributed systems has become a common platform.
We have observed new performance challenges to process R-tree
as the amount of multidimensional datasets become increasingly
huge. Specifically, an R-tree server can be heavily overloaded
while the network and client CPU are lightly loaded, and vice
versa.

In this paper, we present the design and implementation
of Catfish, an RDMA enabled R-tree for low latency and
high throughput by adaptively utilizing the available network
bandwidth and computing resources to balance the workloads
between clients and servers. We design and implement two basic
mechanisms of using RDMA for the client-server R-tree. First,
in the fast messaging design, we use RDMA writes to send
R-tree requests to the server and let server threads process
R-tree requests to achieve low query latency. Second, in the
RDMA offloading design, we use RDMA reads to offload tree
traversal from the server to the client, which rescues the server
as it is overloaded. We further develop an adaptive scheme
to effectively switch an R-tree search between fast messaging
and RDMA offloading, maximizing the overall performance.
Our experiments show that the adaptive solution of Catfish
on InfiniBand significantly outperforms R-tree that uses only
fast messaging or only RDMA offloading in both latency and
throughput. Catfish can also deliver up to one order of magnitude
performance over the traditional schemes using TCP/IP on 1
Gbps and 40 Gbps Ethernet. We make a strong case to use
RDMA to effectively balance workloads in distributed systems
for low latency and high throughput.