THE FAST JOHNSON-LINDENSTRAUSS TRANSFORM AND APPROXIMATE NEAREST NEIGHBORS

被引:245
作者
Ailon, Nir [1 ]
Chazelle, Bernard [2 ]
机构
[1] Google Res, New York, NY 10011 USA
[2] Princeton Univ, Dept Comp Sci, Princeton, NJ 08544 USA
关键词
dimension reduction; random matrices; approximate nearest neighbors; QUERIES;
D O I
10.1137/060673096
中图分类号
TP301 [理论、方法];
学科分类号
081202 ;
摘要
We introduce a new low-distortion embedding of l(2)(d) into l(p)(O(log n)) (p = 1, 2) called the fast Johnson-Lindenstrauss transform (FJLT). The FJLT is faster than standard random projections and just as easy to implement. It is based upon the preconditioning of a sparse projection matrix with a randomized Fourier transform. Sparse random projections are unsuitable for low-distortion embeddings. We overcome this handicap by exploiting the "Heisenberg principle" of the Fourier transform, i.e., its local-global duality. The FJLT can be used to speed up search algorithms based on low-distortion embeddings in l(1) and l(2). We consider the case of approximate nearest neighbors in l(2)(d). We provide a faster algorithm using classical projections, which we then speed up further by plugging in the FJLT. We also give a faster algorithm for searching over the hypercube.
引用
收藏
页码:302 / 322
页数:21
相关论文
共 42 条
[1]   Database-friendly random projections: Johnson-Lindenstrauss with binary coins [J].
Achlioptas, D .
JOURNAL OF COMPUTER AND SYSTEM SCIENCES, 2003, 66 (04) :671-687
[2]  
AILON N, DISCRETE CO IN PRESS
[3]   Problems and results in extremal combinatorics - I [J].
Alon, N .
DISCRETE MATHEMATICS, 2003, 273 (1-3) :31-53
[4]  
Alon N., 2004, PROBABILISTIC METHOD
[5]  
[Anonymous], 2004, HDB DISCRETE COMPUTA
[6]  
ARYA S, 1993, PROCEEDINGS OF THE FOURTH ANNUAL ACM-SIAM SYMPOSIUM ON DISCRETE ALGORITHMS, P271
[7]   An optimal algorithm for approximate nearest neighbor searching in fixed dimensions [J].
Arya, S ;
Mount, DM ;
Netanyahu, NS ;
Silverman, R ;
Wu, AY .
JOURNAL OF THE ACM, 1998, 45 (06) :891-923
[8]   APPROXIMATE CLOSEST-POINT QUERIES IN HIGH DIMENSIONS [J].
BERN, M .
INFORMATION PROCESSING LETTERS, 1993, 45 (02) :95-99
[9]  
Bingham E., 2001, KDD-2001. Proceedings of the Seventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, P245, DOI 10.1145/502512.502546
[10]  
Borodin A., 1999, Proceedings of the Thirty-First Annual ACM Symposium on Theory of Computing, P312, DOI 10.1145/301250.301330