Data prefetch mechanisms

Cited by: 153
Authors
Vanderwiel, SP
Lilja, DJ
Affiliations
[1] IBM Corp, Server Grp, Syst Architecture Performance & Design, N Rochester, MN 55901 USA
[2] Univ Minnesota, Dept Elect & Comp Engn, Minneapolis, MN 55455 USA
Keywords
design; performance; memory latency; prefetching
DOI
10.1145/358923.358939
Chinese Library Classification (CLC) number
TP301 [Theory, methods]
Discipline classification code
081202
Abstract
The expanding gap between microprocessor and DRAM performance has necessitated the use of increasingly aggressive techniques designed to reduce or hide the latency of main memory access. Although large cache hierarchies have proven to be effective in reducing this latency for the most frequently used data, it is still not uncommon for many programs to spend more than half their run times stalled on memory requests. Data prefetching has been proposed as a technique for hiding the access latency of data referencing patterns that defeat caching strategies. Rather than waiting for a cache miss to initiate a memory fetch, data prefetching anticipates such misses and issues a fetch to the memory system in advance of the actual memory reference. To be effective, prefetching must be implemented in such a way that prefetches are timely, useful, and introduce little overhead. Secondary effects such as cache pollution and increased memory bandwidth requirements must also be taken into consideration. Despite these obstacles, prefetching has the potential to significantly improve overall program execution time by overlapping computation with memory accesses. Prefetching strategies are diverse, and no single strategy has yet been proposed that provides optimal performance. The following survey examines several alternative approaches, and discusses the design tradeoffs involved when implementing a data prefetch strategy.
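The idea of issuing a fetch ahead of the actual reference can be illustrated with a software prefetch inside a simple loop. The following is only a minimal sketch and is not taken from the survey: the function name `sum_with_prefetch`, the macro `PREFETCH_READ`, and the distance `PREFETCH_DISTANCE` are hypothetical, and the sketch assumes a GCC- or Clang-compatible compiler that provides `__builtin_prefetch` (on other compilers the prefetch becomes a no-op).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tuning knob: how many elements ahead of the current
 * reference the prefetch is issued. Too small and the data arrives late;
 * too large and it may be evicted before use (cache pollution). */
#define PREFETCH_DISTANCE 16

/* Assumption: GCC/Clang builtin. Arguments are address, rw (0 = read),
 * and temporal locality hint (3 = keep in all cache levels). */
#if defined(__GNUC__) || defined(__clang__)
#define PREFETCH_READ(addr) __builtin_prefetch((addr), 0, 3)
#else
#define PREFETCH_READ(addr) ((void)0)
#endif

/* Sums n elements while requesting data[i + PREFETCH_DISTANCE] early,
 * so the memory access overlaps with the additions on earlier elements. */
long sum_with_prefetch(const long *data, size_t n)
{
    assert(data != NULL || n == 0);

    long total = 0;
    for (size_t i = 0; i < n; i++) {
        /* Only prefetch addresses that lie inside the array. */
        if (i + PREFETCH_DISTANCE < n) {
            PREFETCH_READ(&data[i + PREFETCH_DISTANCE]);
        }
        total += data[i];
    }
    return total;
}
```

The prefetch is a hint and never changes the result, so the function returns the same sum with or without it. Issuing one prefetch per element, as done here for brevity, adds instruction overhead; a more careful version would likely issue one per cache line, which reflects the abstract's point that prefetches should be timely, useful, and cheap.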
Pages: 174-199
Page count: 26