Fine-grain Priority Scheduling on Multi-channel Memory Systems

Zhichun Zhu, Zhao Zhang, and Xiaodong Zhang 
Proceedings of the 8th International Symposium on High Performance
Computer Architecture (HPCA-8), Cambridge, MA, February 2-6, 2002,
pp. 107-116. 


Configurations of contemporary DRAM memory systems are becoming increasingly
complex.  A recent study shows that application performance is highly
sensitive to the choice of configuration, and suggests that tuning burst
sizes and channel configurations is an effective way to optimize
DRAM performance for a given memory-intensive workload.  However,
this approach is workload-dependent.  In this study we show that, by
using fine-grain priority access scheduling, we are able to find a
workload-independent configuration that achieves optimal performance
on a multi-channel memory system.  Our approach makes good use of the
available high concurrency and high bandwidth of such memory systems,
and effectively reduces the memory stall time of memory-intensive
applications.  Using execution-driven simulation of a 4-way issue,
2 GHz processor, we show that the average performance improvement
for fifteen memory-intensive SPEC2000 programs using optimized
fine-grain priority scheduling is about 13% and 8% for 2-channel and
4-channel Direct Rambus DRAM memory systems, respectively, compared
with gang scheduling.  Compared with burst scheduling, the average
performance improvement is 16% and 14% for the 2-channel and 4-channel
memory systems, respectively.