Analyze Gauss: Optimal Bounds for Privacy-Preserving Principal Component Analysis

被引：150

作者：

Dwork, Cynthia ^{[1
]}

Talwar, Kunal ^{[1
]}

Thakurta, Abhradeep ^{[1
,2
]}

Zhang, Li ^{[1
]}

机构：

[1] Microsoft Res, Redmond, WA 98052 USA

[2] Stanford Univ, Stanford, CA 94305 USA

来源：

STOC'14: PROCEEDINGS OF THE 46TH ANNUAL 2014 ACM SYMPOSIUM ON THEORY OF COMPUTING | 2014年

关键词：

D O I：

10.1145/2591796.2591883

中图分类号：

TP301 [理论、方法];

学科分类号：

081202 ;

摘要：

We consider the problem of privately releasing a low dimensional approximation to a set of data records, represented as a matrix A in which each row corresponds to an individual and each column to an attribute. Our goal is to compute a subspace that captures the covariance of A as much as possible, classically known as principal component analysis (PCA). We assume that each row of A has l(2) norm bounded by one, and the privacy guarantee is defined with respect to addition or removal of any single row. We show that the well-known, but misnamed, randomized response algorithm, with properly tuned parameters, provides nearly optimal additive quality gap compared to the best possible singular subspace of A. We further show that when A(T)A has a large eigenvalue gap - a reason often cited for PCA - the quality improves significantly. Optimality (up to logarithmic factors) is proved using techniques inspired by the recent work of Bun, Ullman, and Vadhan on applying Tardos's fingerprinting codes to the construction of hard instances for private mechanisms for 1-way marginal queries. Along the way we define a list culling game which may be of independent interest. By combining the randomized response mechanism with the well-known following the perturbed leader algorithm of Kalai and Vempala we obtain a private online algorithm with nearly optimal regret. The regret of our algorithm even outperforms all the previously known online non-private algorithms of this type. We achieve this better bound by, satisfyingly, borrowing insights and tools from differential privacy!

引用

页码：11 / 20

页数：10

共 33 条

[1]

Achlioptas Dimitris, 2007, J ACM JACM, V54

[2]

[Anonymous], TOPICS RANDOM MATRIX

[3]

[Anonymous], FDN COMPUTATIONAL MA

[4]

[Anonymous], 2005, PODS

[5]

[Anonymous], ARXIV10042008

[6]

[Anonymous], 2013, COMMUNICATION

[7] Eigenfaces vs. Fisherfaces: Recognition using class specific linear projection [J].

Belhumeur, PN ;

Hespanha, JP ;

Kriegman, DJ .

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 1997, 19 (07) :711-720

[8]

Bentley Jon Louis, 1980, J ALGORITHMS, V1

[9] Collusion-secure fingerprinting for digital data [J].

Boneh, D ;

Shaw, J .

IEEE TRANSACTIONS ON INFORMATION THEORY, 1998, 44 (05) :1897-1905

[10]

Bun Mark, THESE P

← 1 2 3 4 →