STATISTICAL ESTIMATORS FOR AGGREGATE RELATIONAL ALGEBRA QUERIES

被引:23
作者
HOU, WC
OZSOYOGLU, GK
机构
[1] Case Western Reserve Univ., Cleveland, OH
[2] Case Western Reserve Univ., Cleveland, OH
来源
ACM TRANSACTIONS ON DATABASE SYSTEMS | 1991年 / 16卷 / 04期
关键词
ALGORITHMS; MANAGEMENT; PERFORMANCE; THEORY; RELATIONAL ALGEBRA; SAMPLING; SELECTIVITY; SIMPLE RANDOM SAMPLING; STATISTICAL ESTIMATORS;
D O I
10.1145/115302.115300
中图分类号
TP [自动化技术、计算机技术];
学科分类号
0812 ;
摘要
This paper discusses the estimation of COUNT(E) queries by sampling, where E is an arbitrary relational algebra expression. Consistent and unbiased statistical estimators for COUNT(E) are proposed without any assumptions on the distributions of attribute values or the orderings of tuples in the operand relations of E. We present a set of COUNT(E) estimator evaluation algorithms, all based on simple random sampling, and each suitable for a different type of relational algebra expression E. To improve the efficiency, we propose three enhancements, and revise the estimator evaluation algorithms to incorporate the enhancements. One of the enhancements is the use of cluster sampling. Estimator evaluation algorithms with the enhancements have been incorporated into a prototype DBMS. We present the performance evaluation of the estimators using the implemented prototype DBMS.
引用
收藏
页码:600 / 654
页数:55
相关论文
共 32 条
[1]  
CHAO A, 1984, SCAND J STAT, V11, P265
[2]   ESTIMATING RECORD SELECTIVITIES [J].
CHRISTODOULAKIS, S .
INFORMATION SYSTEMS, 1983, 8 (02) :105-115
[3]  
Cochran W. G., 2007, SAMPLING TECHNIQUES
[4]  
DATA A, 1986, 3RD P INT WORKSH STA, P245
[5]  
DEVORE J, 1984, PROBABILITY STATISTI
[6]  
FISHER RA, 1921, PHILOS T ROYAL SOC, V222
[7]  
FISHER RA, 1925, P CAMBRIDGE PHILOS S, V22
[8]  
FORTUNATO E, 1986, 3RD P INT WORKSH STA
[9]  
GHOSH S, 1987, IBM5605 RES REP
[10]  
GHOSH S, 1986, 3RD P INT WORKSH STA, P286