Optimal prefetching via data compression

被引:66
作者
Vitter, JS [1 ]
Krishnan, P [1 ]
机构
[1] AT&T BELL LABS, HOLMDEL, NJ 07733 USA
关键词
caching; competitive analysis; databases; data compression; fault rate; hypertext; Markov source; prediction; prefetching; secondary stage; universal prefetcher;
D O I
10.1145/234752.234753
中图分类号
TP3 [计算技术、计算机技术];
学科分类号
0812 ;
摘要
Caching and prefetching are important mechanisms for speeding up access time to data on secondary storage. Recent work in competitive online algorithms has uncovered several promising new algorithms for caching. In this paper, we apply a form of the competitive philosophy for the first time to the problem of prefetching to develop an optimal universal prefetcher in terms of fault rate, with particular applications to large-scale databases and hypertext systems. Our prediction algorithms for prefetching are novel in that they are based on data compression techniques that are both theoretically optimal and good in practice. Intuitively, in order to compress data effectively, you have to be able to predict future data well, and thus good data compressors should be able to predict well for purposes of prefetching. We show for powerful models such as Markov sources and mth order Markov sources that the page fault rates incurred by our prefetching algorithms are optimal in the limit for almost all sequences of page requests.
引用
收藏
页码:771 / 793
页数:23
相关论文
共 38 条
[1]  
ABE N, 1990, UCSCCRL9063
[2]  
AMIT Y, 1990, LARGE DEVIATIONS COD
[3]  
[Anonymous], 1982, ESTIMATION DEPENDENC
[4]  
Bell T. C., 1990, TEXT COMPRESSION
[5]  
Blackwell D., 1956, PAC J MATH, V6, P1, DOI [DOI 10.2140/PJM.1956.6.1, 10.2140/pjm.1956.6.1]
[6]   OCCAM RAZOR [J].
BLUMER, A ;
EHRENFEUCHT, A ;
HAUSSLER, D ;
WARMUTH, MK .
INFORMATION PROCESSING LETTERS, 1987, 24 (06) :377-380
[7]   LEARNABILITY AND THE VAPNIK-CHERVONENKIS DIMENSION [J].
BLUMER, A ;
EHRENFEUCHT, A ;
HAUSSLER, D ;
WARMUTH, MK .
JOURNAL OF THE ACM, 1989, 36 (04) :929-965
[8]  
BOARD R, 1990, PROCEEDINGS OF THE TWENTY SECOND ANNUAL ACM SYMPOSIUM ON THEORY OF COMPUTING, P54, DOI 10.1145/100216.100223
[9]  
BORODIN A, 1991, 23RD P ANN ACM S THE, P249
[10]  
BRADY JT, 1986, IEEE CG A MAY, P25