Declustering of key-based partitioned signature files

被引:19
作者
Ciaccia, P [1 ]
Tiberio, P [1 ]
Zezula, P [1 ]
机构
[1] TECH UNIV BRNO,BRNO 60200,CZECH REPUBLIC
来源
ACM TRANSACTIONS ON DATABASE SYSTEMS | 1996年 / 21卷 / 03期
关键词
design; performance; error correcting codes; information retrieval; parallel independent disks; partial match queries; performance evaluation; superimposed coding;
D O I
10.1145/232753.232755
中图分类号
TP [自动化技术、计算机技术];
学科分类号
0812 ;
摘要
Access methods based on signature files can largely benefit from possibilities offered by parallel environments. To this end, an effective. declustering strategy that would distribute signatures over a set of parallel independent disks has to be combined with a synergic clustering which is employed to avoid searching the whole signature file while executing a query. This-article proposes two parallel signature file organizations, Hamming Filter (HF) and Hamming(+) Filter (H+F), whose common declustering strategy is based on error correcting codes, and where clustering is achieved by organizing signatures into fixed-size buckets, each containing signatures sharing the same hey value. HF allocates signatures on disks in a static way and works well if a correct relationship holds between the parameters of the code and the size of the file. H+F is a generalization of HF suitable to manage highly dynamic files. It uses a dynamic declustering, obtained through a sequence of codes, and organizes a smooth migration of signatures between disks so that high performance levels are retained regardless of current file size. Theoretical analysis characterizes the best-case, expected, and worst-case behaviors of these organizations. Analytical results are verified by experiments on prototype systems.
引用
收藏
页码:295 / 338
页数:44
相关论文
共 27 条
[1]   OPTIMAL DISK ALLOCATION FOR PARTIAL MATCH QUERIES [J].
ABDELGHAFFAR, KAS ;
ELABBADI, A .
ACM TRANSACTIONS ON DATABASE SYSTEMS, 1993, 18 (01) :132-156
[2]  
AHUJA SR, 1980, 7TH P ANN S COMP ARC, P218
[3]   ESTIMATING ACCESSES IN PARTITIONED SIGNATURE FILE ORGANIZATIONS [J].
CIACCIA, P ;
ZEZULA, P .
ACM TRANSACTIONS ON INFORMATION SYSTEMS, 1993, 11 (02) :133-142
[4]  
DEPPISCH U, 1986, 1986 P ACM C RES DEV, P77
[5]   PARALLEL DATABASE-SYSTEMS - THE FUTURE OF HIGH-PERFORMANCE DATABASE-SYSTEMS [J].
DEWITT, D ;
GRAY, J .
COMMUNICATIONS OF THE ACM, 1992, 35 (06) :85-98
[6]   DISK ALLOCATION FOR CARTESIAN PRODUCT FILES ON MULTIPLE-DISK SYSTEMS [J].
DU, HC ;
SOBOLEWSKI, JS .
ACM TRANSACTIONS ON DATABASE SYSTEMS, 1982, 7 (01) :82-101
[7]   SIGNATURE FILES - AN ACCESS METHOD FOR DOCUMENTS AND ITS ANALYTICAL PERFORMANCE EVALUATION [J].
FALOUTSOS, C ;
CHRISTODOULAKIS, S .
ACM TRANSACTIONS ON OFFICE INFORMATION SYSTEMS, 1984, 2 (04) :267-288
[8]   DISK ALLOCATION METHODS USING ERROR CORRECTING CODES [J].
FALOUTSOS, C ;
METAXAS, D .
IEEE TRANSACTIONS ON COMPUTERS, 1991, 40 (08) :907-914
[9]   OPTIMAL SIGNATURE EXTRACTION AND INFORMATION LOSS [J].
FALOUTSOS, C ;
CHRISTODOULAKIS, S .
ACM TRANSACTIONS ON DATABASE SYSTEMS, 1987, 12 (03) :395-428
[10]  
FALOUTSOS C, 1992, INFORMATION RETRIEVA, P44