TR-18-3.pdf

``A low-cost disk solution enabling LSM-tree to achieve high performance for mixed read/writes workloads"   

Dejun Teng, Lei Guo, Rubao Lee, Feng Chen, Yanfeng Zhang, Siyuan Ma, and Xiaodong Zhang 

ACM Transactions on Storage, Vol. 14, No. 2, April 2018.  


Abstract
LSM-tree has been widely used in data management production systems for write-intensive workloads. However,
as read and write workloads co-exist under LSM-tree, data accesses can experience long latency and low
throughput due to the interferences to buffer caching from the compaction, a major and frequent operation
in LSM-tree. After a compaction, the existing data blocks are reorganized and written to other locations on
disks. As a result, the related data blocks that have been loaded in the buffer cache are invalidated since their
referencing addresses are changed, causing serious performance degradations.

To re-enable high-speed buffer caching during intensive writes, we propose Log-Structured buffered-Merge
tree (simplified as LSbM-tree) by adding a compaction buffer on disks to minimize the cache invalidations on
buffer cache caused by compactions. The compaction buffer efficiently and adaptively maintains the frequently
visited datasets. In LSbM, strong locality objects can be effectively kept in the buffer cache with
minimum or no harmful invalidations. With the help of a small on-disk compaction buffer, LSbM achieves
a high query performance by enabling effective buffer caching, while retaining all the merits of LSM-tree
for write-intensive data processing and providing high bandwidth of disks for range queries. We have implemented
LSbM based on LevelDB.We showthatwith a standard buffer cache and a hard disk, LSbM can achieve
2x performance improvement over LevelDB. We have also compared LSbM with other existing solutions to
show its strong cache effectiveness.
Back to the Publication Page.

Back to the HPCS Main Page at the Ohio State University.