Compiler-based I/O prefetching for out-of-core applications

Cited by: 43
Authors
Brown, AD
Mowry, TC
Krieger, O
Affiliations
[1] Carnegie Mellon Univ, Dept Comp Sci, Pittsburgh, PA 15213 USA
[2] IBM Corp, Thomas J Watson Res Ctr, Yorktown Hts, NY 10598 USA
Source
ACM TRANSACTIONS ON COMPUTER SYSTEMS | 2001 / Vol. 19 / No. 2
Keywords
performance; design; experimentation; virtual memory; compiler optimization; prefetching
DOI
10.1145/377769.377774
Chinese Library Classification (CLC) number
TP301 [Theory, Methods];
Discipline classification code
081202 [Computer Software and Theory];
Abstract
Current operating systems offer poor performance when a numeric application's working set does not fit in main memory. As a result, programmers who wish to solve "out-of-core" problems efficiently are typically faced with the onerous task of rewriting an application to use explicit I/O operations (e.g., read/write). In this paper, we propose and evaluate a fully automatic technique which liberates the programmer from this task, provides high performance, and requires only minimal changes to current operating systems. In our scheme the compiler provides the crucial information on future access patterns without burdening the programmer; the operating system supports nonbinding prefetch and release hints for managing I/O; and the operating system cooperates with a run-time layer to accelerate performance by adapting to dynamic behavior and minimizing prefetch overhead. This approach maintains the abstraction of unlimited virtual memory for the programmer, gives the compiler the flexibility to aggressively insert prefetches ahead of references, and gives the operating system the flexibility to arbitrate between the competing resource demands of multiple applications. We implemented our compiler analysis within the SUIF compiler, and used it to target implementations of our run-time and OS support on both research and commercial systems (Hurricane and IRIX 6.5, respectively). Our experimental results show large performance gains for out-of-core scientific applications on both systems: more than 50% of the I/O stall time has been eliminated in most cases, thus translating into overall speedups of roughly twofold in many cases.
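The division of labor described in the abstract can be illustrated with a toy simulation. This is only a sketch under stated assumptions, not the paper's implementation: the names `ToyPagedMemory`, `RuntimeLayer`, and `sweep` are hypothetical, the "OS" is modeled as a small LRU page pool, and the run-time layer is assumed to filter redundant hints with a simple residency check. The loop stands in for compiler-inserted code that issues a prefetch hint a fixed distance ahead of each reference and a release hint behind it.

```python
from collections import OrderedDict


class ToyPagedMemory:
    """Hypothetical model of OS-managed memory with nonbinding hints."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.resident = OrderedDict()  # page -> True, kept in LRU order
        self.stalls = 0                # demand faults (would stall on I/O)
        self.prefetch_calls = 0        # hints that actually reached the "OS"

    def _load(self, page):
        if len(self.resident) >= self.capacity:
            self.resident.popitem(last=False)  # evict least recently used
        self.resident[page] = True

    def prefetch(self, page):
        # Nonbinding hint: a real OS could ignore it; this toy always honors it.
        self.prefetch_calls += 1
        if page not in self.resident:
            self._load(page)

    def release(self, page):
        # Nonbinding hint that the page is no longer needed.
        self.resident.pop(page, None)

    def access(self, page):
        if page not in self.resident:
            self.stalls += 1
            self._load(page)
        else:
            self.resident.move_to_end(page)


class RuntimeLayer:
    """Hypothetical user-level filter that drops redundant prefetch hints."""

    def __init__(self, mem):
        self.mem = mem
        self.filtered = 0

    def prefetch(self, page):
        if page in self.mem.resident:  # assumed cheap residency check
            self.filtered += 1
            return
        self.mem.prefetch(page)

    def release(self, page):
        self.mem.release(page)


def sweep(num_elems, elems_per_page, capacity, distance, use_hints):
    """Sequential array sweep; returns (stalls, prefetch_calls, filtered)."""
    mem = ToyPagedMemory(capacity)
    rt = RuntimeLayer(mem)
    num_pages = (num_elems + elems_per_page - 1) // elems_per_page
    for i in range(num_elems):
        page = i // elems_per_page
        if use_hints:
            ahead = page + distance
            if ahead < num_pages:
                rt.prefetch(ahead)    # hint placed ahead of the reference
            if page > 0:
                rt.release(page - 1)  # hint for the page just finished
        mem.access(page)
    return mem.stalls, mem.prefetch_calls, rt.filtered


if __name__ == "__main__":
    print(sweep(400, 4, 8, 4, use_hints=False))
    print(sweep(400, 4, 8, 4, use_hints=True))
```

In this toy model the hinted sweep stalls only on the first few pages, which are referenced before any prefetch could cover them, and the filter absorbs the repeated hints issued for a page that is already resident. The counts say nothing about real I/O latency or about the speedups reported in the abstract.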
Pages: 111-170
Page count: 60