Relaxing the uniformity and independence assumptions using the concept of fractal dimension

Cited by: 9
Authors
Faloutsos, C
Kamel, I
Affiliations
[1] UNIV MARYLAND, SYST RES INST, COLLEGE PK, MD 20742 USA
[2] MATSUSHITA INFORMAT TECHNOL LAB, PRINCETON, NJ USA
Funding
National Science Foundation (NSF), USA: EEC-94-02384, IRI-8958546, IRI-9205273
NSF Directorate for Computer and Information Science and Engineering (CISE): 8958546, 9205273
DOI
10.1006/jcss.1997.1522
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
We propose the concept of fractal dimension of a set of points, in order to quantify the deviation from the uniform distribution. Using measurements on real data sets (road intersections of U.S. counties, population versus area of different nations, etc.) we provide evidence that real data indeed are skewed, and, moreover, we show that for several scales of interest they behave as mathematical fractals with a measurable noninteger fractal dimension. Armed with this tool, we then show its practical use in predicting the performance of spatial access methods and, specifically, of R-trees. We provide the first analysis of R-trees for skewed distributions of points; we develop a formula that estimates the number of disk accesses for range queries, given only the fractal dimension of the point set and its count. Experiments on real data sets show that the formula is very accurate; the relative error is usually below 5%, and it rarely exceeds 10%. We believe that the fractal dimension will help replace the uniformity and independence assumptions, allowing more accurate analysis for any spatial access method, as well as better estimates for query optimization on multi-attribute queries. © 1997 Academic Press.
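To make "measurable noninteger fractal dimension" concrete, the following is a minimal sketch of a box-counting estimate for a 2-D point set. It is an illustration only: the abstract does not say which estimator the paper uses, and the function names (`box_count`, `box_counting_dimension`, `sierpinski_points`) and the test point sets are assumptions introduced here, not taken from the paper. The idea is to count the grid cells occupied at several cell sizes and fit the slope of log(count) against log(1/side).

```python
import itertools
import math


def box_count(points, side):
    """Number of grid cells of the given side length holding at least one point."""
    return len({(math.floor(x / side), math.floor(y / side)) for x, y in points})


def box_counting_dimension(points, sides):
    """Least-squares slope of log(occupied cells) versus log(1/side)."""
    xs = [math.log(1.0 / s) for s in sides]
    ys = [math.log(box_count(points, s)) for s in sides]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def sierpinski_points(depth):
    """Deterministic points of a right-angled Sierpinski triangle (hypothetical test data)."""
    moves = [(0, 0), (1, 0), (0, 1)]  # corner chosen at each refinement level
    pts = []
    for digits in itertools.product(moves, repeat=depth):
        x = sum(dx / 2 ** (i + 1) for i, (dx, _) in enumerate(digits))
        y = sum(dy / 2 ** (i + 1) for i, (_, dy) in enumerate(digits))
        pts.append((x, y))
    return pts


sides = [1 / 2, 1 / 4, 1 / 8, 1 / 16]
line = [(i / 4096, i / 4096) for i in range(4096)]            # points on a diagonal
grid = [(i / 64, j / 64) for i in range(64) for j in range(64)]  # uniform 2-D grid
tri = sierpinski_points(6)

print(box_counting_dimension(line, sides))  # close to 1
print(box_counting_dimension(grid, sides))  # close to 2
print(box_counting_dimension(tri, sides))   # close to log(3)/log(2), about 1.585
```

A uniform 2-D point set gives a slope near 2, while a skewed set such as the Sierpinski triangle gives a noninteger value in between; this gap is the kind of deviation from uniformity the abstract refers to. On real data the fit is only meaningful over the range of scales where the log-log plot is roughly linear.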
Pages: 229-240
Page count: 12
Related papers
29 in total
[1] AREF WG, 1991, PROC INT CONF VERY L, P81
[2] ARYA M, 1994, 10TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING, PROCEEDINGS, P314
[3] ARYA M, 1993, IEEE DATA ENG B, V16, P38
[4] BARNSLEY MF, 1988, BYTE, V13, P215
[5] BECKMANN N, 1990, SIGMOD REC, V19, P322, DOI 10.1145/93605.98741
[6] BELUSSI A, 1995, PROC 21 INT C VERY, P299, DOI 10.5555/215437
[7] CHRISTODOULAKIS S, 1984, ACM TODS
[8] FALOUTSOS C, 1992, 18 VLDB C VANC BRIT, P363
[9] FALOUTSOS C, 11253095110815TM ATT
[10] FALOUTSOS C, 1987, MAY P SIGMOD C SAN F, P426