The Price of Privacy and the Limits of LP Decoding

被引:87
作者
Dwork, Cynthia [1 ]
McSherry, Frank [1 ]
Talwar, Kunal [1 ]
机构
[1] Microsoft Res Silicon Valley, Mountain View, CA 94043 USA
来源
STOC 07: PROCEEDINGS OF THE 39TH ANNUAL ACM SYMPOSIUM ON THEORY OF COMPUTING | 2007年
关键词
Privacy; LP Decoding; Compressed Sensing; Basis Pursuit;
D O I
10.1145/1250790.1250804
中图分类号
TP301 [理论、方法];
学科分类号
081202 ;
摘要
This work is at the intersection of two lines of research. One line, initiated by Dinur and Nissim, investigates the price, in accuracy, of protecting privacy in a statistical database. The second, growing from an extensive literature on compressed sensing (see in particular the work of Donoho and collaborators [14, 7, 13, 11]) and explicitly connected to error-correcting codes by Candes and Tao, ([4]; see also [5, 3]), is in the use of linear programming for error correction. Our principal result is the discovery of a sharp threshold rho* approximate to 0.239, so that if rho < rho* and A is a random m x n encoding matrix of independently chosen standard Gaussians, where m = O(n), then with overwhelming probability over choice of A, for all x is an element of R-n, LP decoding corrects [rho m] arbitrary errors in the encoding Ax, while decoding can be made to fail if the error rate exceeds rho*. Our bound resolves an open question of Candes, Rudelson, Tao, and Vershyin [3] and (oddly, but explicably) refutes empirical conclusions of Donoho [11] and Candes et al [3]. By scaling and rounding we can easily transform these results to obtain polynomial-time decodable random linear codes with polynomial-sized alphabets tolerating any rho < rho* approximate to 0.239 fraction of arbitrary errors. In the context of privacy-preserving datamining our results say that any privacy mechanism, interactive or non-interactive, providing reasonably accurate answers to a 0.761 fraction of randomly generated weighted subset sum queries, and arbitrary answers on the remaining 0.239 fraction, is blatantly non-private.
引用
收藏
页码:85 / 94
页数:10
相关论文
共 25 条
[1]   Database-friendly random projections: Johnson-Lindenstrauss with binary coins [J].
Achlioptas, D .
JOURNAL OF COMPUTER AND SYSTEM SCIENCES, 2003, 66 (04) :671-687
[2]  
Blum A., 2005, P 24 ACM SIGMOD SIGA, P128, DOI [DOI 10.1145/1065167.1065184, 10.1145/1065167.1065184]
[3]  
Candes E, 2005, ANN IEEE SYMP FOUND, P295
[4]   Decoding by linear programming [J].
Candes, EJ ;
Tao, T .
IEEE TRANSACTIONS ON INFORMATION THEORY, 2005, 51 (12) :4203-4215
[5]  
CANDES EJ, 2005, ERROR CORRECTION VIA
[6]  
CANDS EJ, 2006, HIGHLY ROBUST UNPUB
[7]  
CHEN S, 1999, SIAM J SCI COMPUT, V48, P33
[8]  
DINUR I, 2005, PRIVACY PUBLIC UNPUB
[9]  
Dinur Irit, 2003, PODS, P202, DOI DOI 10.1145/773153.773173
[10]  
DONOHO D, 2006, P C INF SCI SYST