CACHE PROFILING AND THE SPEC BENCHMARKS - A CASE-STUDY

Cited by: 58
Authors
LEBECK, AR [1]
WOOD, DA [1]
Affiliations
[1] UNIV WISCONSIN,DEPT ELECT & COMP ENGN,MADISON,WI 53706
Funding
National Science Foundation (USA)
DOI
10.1109/2.318580
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
As VLSI technology improvements continue to widen the gap between processor and main memory cycle times, cache performance becomes increasingly important to overall system performance. Cache memories help alleviate the cycle-time disparity, although only for programs that exhibit sufficient spatial and temporal locality. Programs with unruly access patterns consume a lot of time transferring data to and from the cache. To fully exploit the performance potential of fast processors, programmers must explicitly consider cache behavior, restructuring their codes to increase locality. As these fast processors proliferate, techniques for improving cache performance must move beyond the supercomputer and multiprocessor communities and into the mainstream of computing. In this article, the authors examine some of the techniques programmers can use to improve cache performance. They show how to use CProf, a cache profiler, to identify cache performance bottlenecks and gain insight into their origin. This insight helps programmers understand which of the well-known program transformations are likely to improve cache performance. Using CProf and simple transformations, they show how to tune the cache performance of six of the SPEC92 benchmarks. By restructuring the source code, the benchmarks greatly improve cache behavior and achieve execution time speedups ranging from 1.02 to 3.46. The speedup depends on the machine's memory system, with greater speedups obtained in the Fortran programs.
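To make the abstract's idea concrete, the following is a minimal sketch of how counting cache misses can expose a poor access pattern and how a simple transformation (loop interchange) fixes it. It is not CProf and does not reproduce the authors' method or results: the helper names `count_misses` and `traversal`, the direct-mapped organization, the 1 KB cache size, the 32-byte block size, and the 64 x 64 array of 8-byte elements are all assumptions chosen for illustration.

```python
def count_misses(addresses, cache_bytes=1024, block_bytes=32):
    """Count misses in a toy direct-mapped cache (assumed sizes, illustration only)."""
    num_lines = cache_bytes // block_bytes
    tags = [None] * num_lines          # block currently held by each cache line
    misses = 0
    for addr in addresses:
        block = addr // block_bytes    # memory block containing this byte
        index = block % num_lines      # direct-mapped: exactly one candidate line
        if tags[index] != block:       # miss: fetch the block, evicting the old one
            tags[index] = block
            misses += 1
    return misses


def traversal(rows, cols, elem_bytes=8, row_major=True):
    """Yield byte addresses touched while visiting every element of a
    rows x cols array that is laid out in row-major order."""
    if row_major:
        # Inner loop walks consecutive addresses (good spatial locality).
        for i in range(rows):
            for j in range(cols):
                yield (i * cols + j) * elem_bytes
    else:
        # Inner loop strides by a whole row (poor spatial locality).
        for j in range(cols):
            for i in range(rows):
                yield (i * cols + j) * elem_bytes


ROWS, COLS = 64, 64
column_order_misses = count_misses(traversal(ROWS, COLS, row_major=False))
row_order_misses = count_misses(traversal(ROWS, COLS, row_major=True))

print("column-order misses:", column_order_misses)
print("row-order misses:   ", row_order_misses)
```

Under these assumed parameters the column-order loop misses on every one of its 4096 accesses, because consecutive accesses are 512 bytes apart and keep evicting each other from the same two cache lines. After interchanging the loops, the traversal misses only once per 32-byte block, for 1024 misses. A real profiler attributes such misses to source lines and data structures; this toy model reports only the totals.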
Pages: 15 / &