Fast incremental maintenance of approximate histograms

被引:77
作者
Gibbons, PB
Matias, Y
Poosala, V
机构
[1] Intel Res Pittsburgh, Pittsburgh, PA 15213 USA
[2] Tel Aviv Univ, Dept Comp Sci, IL-69978 Tel Aviv, Israel
[3] Bell Labs, Informat Sci Res Ctr, Murray Hill, NJ 07974 USA
来源
ACM TRANSACTIONS ON DATABASE SYSTEMS | 2002年 / 27卷 / 03期
关键词
algorithms; experimentation; performance; approximation; histograms; incremental maintenance; sampling; query optimization;
D O I
10.1145/581751.581753
中图分类号
TP [自动化技术、计算机技术];
学科分类号
0812 [计算机科学与技术];
摘要
Many commercial database systems maintain histograms to summarize the contents of large relations and permit efficient estimation of query result sizes for use in query optimizers. Delaying the propagation of database updates to the histogram often introduces errors into the estimation. This article presents new sampling- based approaches for incremental maintenance of approximate histograms. By scheduling updates to the histogram based on the updates to the database, our techniques are the first to maintain histograms effectively up to date at all times and avoid computing overheads when unnecessary. Our techniques provide highly accurate approximate histograms belonging to the equidepth and Compressed classes. Experimental results show that our new approaches provide orders of magnitude more accurate estimation than previous approaches. An important aspect employed by these new approaches is a backing sample, an up- to- date random sample of the tuples currently in a relation. We provide efficient solutions for maintaining a uniformly random sample of a relation in the presence of updates to the relation. The backing sample techniques can be used for any other application that relies on random samples of data.
引用
收藏
页码:261 / 298
页数:38
相关论文
共 33 条
[1]
Aboulnaga A, 1999, SIGMOD RECORD, VOL 28, NO 2 - JUNE 1999, P181, DOI 10.1145/304181.304198
[2]
Acharya S, 1999, SIGMOD RECORD, VOL 28, NO 2 - JUNE 1999, P275, DOI 10.1145/304181.304207
[3]
ACHARYA S, 2000, P ACM SIGMOD INT C M, P487
[4]
[Anonymous], P ACM SIGMOD INT C M
[5]
[Anonymous], P ACM SIGMOD INT C M
[6]
[Anonymous], 1980, THESIS CASE W RESERV
[7]
[Anonymous], P ACM SIGMOD C MAN D
[8]
[Anonymous], 1949, Human behaviour and the principle of least-effort
[9]
BLOHSFELD B, 1999, P ACM SIGMOD INT C M, P238
[10]
Bruno N., 2001, SIGMOD RECORD, P211