Stable distributions, pseudorandom generators, embeddings, and data stream computation

被引：203

作者：

Indyk, Piotr

机构：

[1] MIT, Cambridge, MA 02139 USA

[2] Stanford Univ, Stanford, CA 94305 USA

[3] AT&T Shannon Labs, Florham Pk, NJ USA

来源：

JOURNAL OF THE ACM | 2006年 / 53卷 / 03期

关键词：

algorithms; theory; sketching; dimensionality reduction; embeddings; data streams; norms;

D O I：

10.1145/1147954.1147955

中图分类号：

TP3 [计算技术、计算机技术];

学科分类号：

0812 [计算机科学与技术];

摘要：

In this article, we show several results obtained by combining the use of stable distributions with pseudorandom generators for bounded space. In particular: We show that, for any p is an element of (0, 2], one can maintain (using only O (log n/epsilon(2)) words of storage) a sketch C (q) of a point q is an element of l(p)(n) under dynamic updates of its coordinates. The sketch has the property that, given C (q) and C(s), one can estimate parallel to q-s parallel to(p) up to a factor of (1+epsilon) with large probability. This solves the main open problem of Feigenbaum et al. [1999]. We show that the aforementioned sketching approach directly translates into an approximate algorithm that, for a fixed linear mapping A, and given x is an element of R-n and y is an element of R-m, estimates parallel to Ax-y parallel to(p) in O(n+m) time, for any p is an element of (0, 2]. This generalizes an earlier algorithm of Wasserman and Blum (19971 which worked for the case p=2. We obtain another sketch function C' which probabilistically embeds l(1)(n) into a normed space l(1)(m). The embedding guarantees that, if we set m=log(1/delta)(O(1/epsilon)), then for any pair of points q, s is an element of l(1)(n), the distance between q and s does not increase by more than (1+6 epsilon) with constant probability, and it does not decrease by more than (1-epsilon) with probability 1-delta. This is the only known dimensionality reduction theorem for the l(1) norm. In fact, stronger theorems of this type (i.e., that guarantee very low probability of expansion as well as of contraction) cannot exist [Brinkman and Charikar 2003]. We give an explicit embedding of l(2)(n) into with distortion (1+1/n(Theta(1))).

引用

页码：307 / 323

页数：17

共 40 条

[1]

Alon N., 1996, Proceedings of the Twenty-Eighth Annual ACM Symposium on the Theory of Computing, P20, DOI 10.1145/237814.237823

[2]

[Anonymous], P S THEOR COMP

[3]

[Anonymous], 2003, DATA STREAMS ALGORIT

[4]

[Anonymous], P 16 INT C DAT ENG I

[5]

AR S, 1993, P ANN ACM S THEOR CO

[6]

BERGER B, 1997, SIAM J COMPUT, P26

[7]

BRINKMAN B, 2003, P 44 ANN IEEE S FDN

[8]

BRODER A, 1998, P FUN

[9]

Broder A. Z., 1997, P 6 INT WORLD WID WE, V29, P1157, DOI [DOI 10.1016/S0169-7552(97)00031-7, 10.1016/S0169-7552(97)00031-7]

[10]

METHOD FOR SIMULATING STABLE RANDOM-VARIABLES [J].

CHAMBERS, JM ;

MALLOWS, CL ;

STUCK, BW .

JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 1976, 71 (354) :340-344

← 1 2 3 4 →