TR-01-3.ps.Z

Dynamic Load Sharing With Unknown Memory Demands in Clusters 

S. Chen, L. Xiao, and X. Zhang 

Proceedings of the 21st International Conference 
on Distributed Computing Systems (ICDCS'2001), 
Phoenix, Arizona, April 16-19, 2001, pp. 109-118. 

Abstract 

A compute farm is a pool of clustered workstations that provides
high-performance computing services for CPU-intensive, memory-intensive,
and I/O-active jobs in batch mode. Existing load sharing schemes that
take memory into account assume that jobs' memory demands are known
in advance or can be predicted from users' hints. This assumption
greatly simplifies the design and implementation of load sharing schemes,
but it is not desirable in practice. To address this concern,
we present three new results and contributions in this study.
(1) By instrumenting the Linux kernel, we have collected execution
traces of different types of workloads to quantitatively characterize
job interactions, and have modeled page fault behavior as a function of
the overloaded memory size and the amount of the jobs' I/O activity.
(2) Based on experimental results and the collected dynamic system
information, we have built a simulation model that accurately emulates
memory system operations and job migrations with virtual memory
considerations.
(3) We have proposed a memory-centric load sharing scheme and its
variations to effectively handle dynamic memory allocation demands.
The scheme aims to minimize the execution time of each individual job
by dynamically migrating and remotely submitting jobs, both to eliminate
or reduce page faults and to reduce the queuing time for CPU services.
Using trace-driven simulations, we have examined these load sharing
policies to show their effectiveness.
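
The idea of memory-centric migration in contribution (3) can be illustrated
with a minimal sketch. This is not the scheme from the paper: the Node
class, the pick_migration function, and the selection rules (move the
largest job off the most overloaded node, prefer the destination with the
fewest jobs) are assumptions made here for illustration only. The one
point taken from the abstract is that memory demands are observed while
jobs run rather than declared in advance.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical workstation; memory use is observed at run time."""
    name: str
    ram_mb: int
    jobs: dict = field(default_factory=dict)  # job id -> observed memory use (MB)

    def used_mb(self) -> int:
        return sum(self.jobs.values())

    def overload_mb(self) -> int:
        # Memory demanded beyond physical RAM; a proxy for paging pressure.
        return max(0, self.used_mb() - self.ram_mb)

    def idle_mb(self) -> int:
        return max(0, self.ram_mb - self.used_mb())


def pick_migration(nodes):
    """Return (job, source, destination), or None if no move is possible.

    Assumed policy: take the largest job on the most overloaded node and
    move it to a node with enough idle memory, preferring the node with
    the fewest jobs (shorter CPU queue), then the most idle memory.
    """
    overloaded = [n for n in nodes if n.overload_mb() > 0]
    if not overloaded:
        return None
    src = max(overloaded, key=Node.overload_mb)
    job, demand = max(src.jobs.items(), key=lambda kv: kv[1])
    candidates = [n for n in nodes if n is not src and n.idle_mb() >= demand]
    if not candidates:
        return None
    dst = min(candidates, key=lambda n: (len(n.jobs), -n.idle_mb()))
    return job, src.name, dst.name


cluster = [
    Node("a", 512, {"j1": 300, "j2": 400}),
    Node("b", 512, {"j3": 100}),
    Node("c", 512),
]
decision = pick_migration(cluster)
```

In this example node "a" is overloaded by 188 MB, so the sketch selects
job "j2" and sends it to the empty node "c". A real scheme would also
weigh migration cost and I/O activity, which this sketch ignores.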