Pivot selection techniques for proximity searching in metric spaces

被引:115
作者
Bustos, B
Navarro, G
Chávez, E
机构
[1] Univ Konstanz, Dept Comp & Informat Sci, D-78457 Constance, Germany
[2] Univ Chile, Ctr Web Res, Dept Comp Sci, Santiago, Chile
[3] Univ Michoacana, Escuela Ciencias Fis Matemat, Morelia, Michoacan, Mexico
关键词
metric databases; range queries; pivot-based indexing; nearest neighbor search;
D O I
10.1016/S0167-8655(03)00065-5
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
With few exceptions, proximity search algorithms in metric spaces based on the use of pivots select them at random among the objects of the metric space. However, it is well known that the way in which the pivots are selected can drastically affect the performance of the algorithm. Between two sets of pivots of the same size, better chosen pivots can largely reduce the search time. Alternatively, a better chosen small set of pivots (requiring much less space) can yield the same efficiency as a larger, randomly chosen, set. We propose an efficiency measure to compare two pivot sets, combined with an optimization technique that allows us to select good sets of pivots. We obtain abundant empirical evidence showing that our technique is effective, and it is the first that we are aware of in producing consistently good results in a wide variety of cases and in being based on a formal theory. We show that good pivots are outliers, but that selecting outliers does not ensure that good pivots are selected. (C) 2003 Elsevier B.V. All rights reserved.
引用
收藏
页码:2357 / 2366
页数:10
相关论文
共 13 条
[1]  
[Anonymous], 1995, WILLIAM MARY J RACE
[2]  
[Anonymous], P SPIRE CRIWG
[3]  
Baeza-Yates R., 1994, LNCS, V807/1994, P198
[4]  
BOZKAYA T, 1997, P ACM SIGMOD INT C M, P357
[5]  
BURKHARD WA, 1973, COMMUN ACM, V16, P230, DOI 10.1145/362003.362025
[6]   Searching in metric spaces [J].
Chávez, E ;
Navarro, G ;
BaezaYates, R ;
Marroquín, JL .
ACM COMPUTING SURVEYS, 2001, 33 (03) :273-321
[7]   Fixed queries array:: A fast and economical data structure for proximity searching [J].
Chávez, E ;
Marroquín, JL ;
Navarro, G .
MULTIMEDIA TOOLS AND APPLICATIONS, 2001, 14 (02) :113-135
[8]   FAST NEAREST-NEIGHBOR SEARCH IN DISSIMILARITY SPACES [J].
FARAGO, A ;
LINDER, T ;
LUGOSI, G .
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 1993, 15 (09) :957-962
[9]   A NEW VERSION OF THE NEAREST-NEIGHBOR APPROXIMATING AND ELIMINATING SEARCH ALGORITHM (AESA) WITH LINEAR PREPROCESSING TIME AND MEMORY REQUIREMENTS [J].
MICO, ML ;
ONCINA, J ;
VIDAL, E .
PATTERN RECOGNITION LETTERS, 1994, 15 (01) :9-17
[10]  
Santos RF, 2001, PROC INT CONF DATA, P623, DOI 10.1109/ICDE.2001.914877