TR-02-3.pdf

TPF: A System Thrashing Protection Facility

Song Jiang and Xiaodong Zhang 
Software: Practice and Experience, Vol. 32, Issue 3, 2002, pp. 295-318.  

Abstract

Operating system designers attempt to keep high CPU utilization by
maintaining an optimal multiprogramming level (MPL). Although running
more processes makes it less likely to leave the CPU idle, too many
processes adversely incur serious memory competition, and even introduce
thrashing, which eventually lowers CPU utilization. A common practice
to address the problem is to lower the MPL with the aid of process
swapping out/in operations. This approach is expensive and is only used
when the system begins serious thrashing.  The objective of our study
is to provide highly responsive and cost-effective thrashing protection
by adaptively conducting priority page replacement in a timely manner.
We have designed a dynamic system Thrashing Protection Facility (TPF) in
the system kernel.  Once TPF detects system thrashing, one of the active
processes will be identified for protection. The identified process will
have a short period of privilege in which it does not contribute its LRU
pages for removal so that the process can quickly establish its working
set, improving the CPU utilization.  With the support of TPF, thrashing
can be eliminated in its early stage by adaptive page replacement, so that
process swapping will be avoided or delayed until it is truly necessary.

We have implemented TPF in a current and representative Linux kernel
running on an Intel Pentium machine.  Compared with the original Linux
page replacement, we show that TPF consistently and significantly reduces
page faults and the execution time of each individual job in several
groups of interacting SPEC CPU2000 programs.  We also show that TPF
introduces little additional overhead to program executions, and its
implementation in Linux (or Unix) systems is straightforward.