REFINEMENTS TO NEAREST-NEIGHBOR SEARCHING IN K-DIMENSIONAL TREES

Cited by: 218
Authors
SPROULL, RF
Affiliation
[1] Sutherland, Sproull and Associates, Palo Alto, 94302, CA
Keywords
K-DIMENSIONAL TREE; SEARCHING; NEAREST-NEIGHBOR SEARCH
DOI
10.1007/BF01759061
Chinese Library Classification number
TP31 [Computer Software]
Discipline classification code
081202; 0835
Abstract
This note presents a simplification and generalization of an algorithm for searching k-dimensional trees for nearest neighbors reported by Friedman et al. [3]. If the distance between records is measured using L2, the Euclidean norm, the data structure used by the algorithm to determine the bounds of the search space can be simplified to a single number. Moreover, because distance measurements in L2 are rotationally invariant, the algorithm can be generalized to allow a partition plane to have an arbitrary orientation, rather than insisting that it be perpendicular to a coordinate axis, as in the original algorithm. When a k-dimensional tree is built, this plane can be found from the principal eigenvector of the covariance matrix of the records to be partitioned. These techniques and others yield variants of k-dimensional trees customized for specific applications. It is wrong to assume that k-dimensional trees guarantee that a nearest-neighbor query completes in logarithmic expected time. For small k, logarithmic behavior is observed on all but tiny trees. However, for larger k, logarithmic behavior is achievable only with extremely large numbers of records. For k = 16, a search of a k-dimensional tree of 76,000 records examines almost every record.
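
The two ideas in the abstract (a partition plane taken from the principal eigenvector of the covariance matrix, and a search bound reduced to a single number under L2) can be illustrated with a minimal sketch. This is not the paper's code. The names `principal_direction`, `build`, and `nearest`, the dictionary node layout, the median split, the leaf size, and the use of power iteration to approximate the eigenvector are all assumptions made for illustration. The single number carried through the search is taken here to be the best distance found so far, compared against the query's distance to each partition plane.

```python
import math
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def principal_direction(points, iterations=50):
    """Approximate the principal eigenvector of the covariance matrix.

    Assumption: power iteration is used here for simplicity. Any unit
    vector keeps the search correct; a poor one only makes it slower.
    """
    n, k = len(points), len(points[0])
    mean = [sum(p[i] for p in points) / n for i in range(k)]
    centered = [[p[i] - mean[i] for i in range(k)] for p in points]
    v = [1.0] * k
    for _ in range(iterations):
        # w is proportional to C v, computed without forming C explicitly.
        w = [0.0] * k
        for x in centered:
            d = dot(x, v)
            for i in range(k):
                w[i] += d * x[i]
        norm = math.sqrt(dot(w, w))
        if norm == 0.0:
            break  # degenerate data; keep the current direction
        v = [a / norm for a in w]
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v]


def build(points, leaf_size=4):
    """Build a tree whose partition planes have arbitrary orientation."""
    if len(points) <= leaf_size:
        return {"leaf": points}
    normal = principal_direction(points)
    ordered = sorted(points, key=lambda p: dot(p, normal))
    mid = len(ordered) // 2
    return {
        "normal": normal,                     # unit normal of the plane
        "offset": dot(ordered[mid], normal),  # plane: dot(x, normal) == offset
        "left": build(ordered[:mid], leaf_size),
        "right": build(ordered[mid:], leaf_size),
    }


def nearest(node, q, best=(math.inf, None)):
    """Return (distance, record) for the nearest neighbor of q under L2."""
    if "leaf" in node:
        for p in node["leaf"]:
            d = math.dist(p, q)
            if d < best[0]:
                best = (d, p)
        return best
    # Signed distance from q to the partition plane (normal is unit length).
    s = dot(q, node["normal"]) - node["offset"]
    near, far = (node["left"], node["right"]) if s < 0 else (node["right"], node["left"])
    best = nearest(near, q, best)
    # The far side can hold a closer record only if the plane is nearer
    # than the best distance found so far.
    if abs(s) < best[0]:
        best = nearest(far, q, best)
    return best
```

Because the plane's normal has unit length, `abs(s)` is the Euclidean distance from the query to the plane, so the pruning test needs no per-coordinate bounds. As the abstract cautions, this test prunes less and less as k grows, since the best distance found so far tends to exceed the distance to most partition planes.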
Pages: 579-589
Page count: 11