Declustering of key-based partitioned signature files

被引:19
作者
Ciaccia, P [1 ]
Tiberio, P [1 ]
Zezula, P [1 ]
机构
[1] TECH UNIV BRNO,BRNO 60200,CZECH REPUBLIC
来源
ACM TRANSACTIONS ON DATABASE SYSTEMS | 1996年 / 21卷 / 03期
关键词
design; performance; error correcting codes; information retrieval; parallel independent disks; partial match queries; performance evaluation; superimposed coding;
D O I
10.1145/232753.232755
中图分类号
TP [自动化技术、计算机技术];
学科分类号
0812 ;
摘要
Access methods based on signature files can largely benefit from possibilities offered by parallel environments. To this end, an effective. declustering strategy that would distribute signatures over a set of parallel independent disks has to be combined with a synergic clustering which is employed to avoid searching the whole signature file while executing a query. This-article proposes two parallel signature file organizations, Hamming Filter (HF) and Hamming(+) Filter (H+F), whose common declustering strategy is based on error correcting codes, and where clustering is achieved by organizing signatures into fixed-size buckets, each containing signatures sharing the same hey value. HF allocates signatures on disks in a static way and works well if a correct relationship holds between the parameters of the code and the size of the file. H+F is a generalization of HF suitable to manage highly dynamic files. It uses a dynamic declustering, obtained through a sequence of codes, and organizes a smooth migration of signatures between disks so that high performance levels are retained regardless of current file size. Theoretical analysis characterizes the best-case, expected, and worst-case behaviors of these organizations. Analytical results are verified by experiments on prototype systems.
引用
收藏
页码:295 / 338
页数:44
相关论文
共 27 条
[11]   PERFORMANCE ANALYSIS OF DISK ALLOCATION METHOD USING ERROR-CORRECTING CODES [J].
FUJIWARA, T ;
ITO, M ;
KASAMI, T ;
KATAOKA, M ;
OKUI, J .
IEEE TRANSACTIONS ON INFORMATION THEORY, 1991, 37 (02) :379-384
[12]  
GRANDI F, 1992, P 15 ACM SIGIR INT C, P286
[13]  
Lee D. L., 1990, Sixth International Conference on Data Engineering (Cat. No.90CH2840-7), P389, DOI 10.1109/ICDE.1990.113492
[14]   PARTITIONED SIGNATURE FILES - DESIGN ISSUES AND PERFORMANCE EVALUATION [J].
LEE, DL ;
LENG, CW .
ACM TRANSACTIONS ON INFORMATION SYSTEMS, 1989, 7 (02) :158-180
[15]  
LEE DL, 1986, 2ND P INT C DAT ENG, P352
[16]   FRAME-SLICED SIGNATURE FILES [J].
LIN, Z ;
FALOUTSOS, C .
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 1992, 4 (03) :281-289
[17]   CONCURRENT FRAME SIGNATURE FILES [J].
LIN, Z .
DISTRIBUTED AND PARALLEL DATABASES, 1993, 1 (03) :231-249
[18]  
MacWilliams F. J., 1988, THEORY ERROR CORRECT
[19]  
Peterson W. W., 1972, ERROR CORRECTING COD
[20]   USE OF TEXT SIGNATURES FOR DOCUMENT-RETRIEVAL IN A HIGHLY PARALLEL ENVIRONMENT [J].
POGUE, CA ;
WILLETT, P .
PARALLEL COMPUTING, 1987, 4 (03) :259-268