Secure regression on distributed databases

Cited by: 68
Authors
Karr, AF [1]
Lin, XD
Sanil, AP
Reiter, JP
Affiliations
[1] Natl Inst Stat Sci, Res Triangle Pk, NC 27709 USA
[2] Duke Univ, Inst Stat & Decis Sci, Durham, NC 27708 USA
Funding
National Science Foundation (USA);
Keywords
data confidentiality; data integration; diagnostics; secure multiparty computation;
DOI
10.1198/106186005X47714
Chinese Library Classification
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics];
Discipline classification codes
020208; 070103; 0714;
Abstract
This article presents several methods for performing linear regression on the union of distributed databases that preserve, to varying degrees, the confidentiality of those databases. Such methods can be used by federal or state statistical agencies to share information from their individual databases, or to make such information available to others. Secure data integration, which provides the lowest level of protection, actually integrates the databases, but in such a way that no database owner can determine the origin of any records other than its own. Regression, associated diagnostics, or any other analysis can then be performed on the integrated data. Secure multiparty computation, based on shared local statistics, carries out the computations needed to obtain least squares estimators of regression coefficients and error variances by means of analogous local computations that are combined additively using the secure summation protocol. We also provide two approaches to model diagnostics in this setting, one using shared residual statistics and the other using secure integration of synthetic residuals.
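The idea of combining local statistics additively can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes each database holds complete rows with the same columns, simulates all parties in a single process, and covers only the regression coefficients. The function names, the fixed-point scale and the modulus are illustrative assumptions. Each party computes its own X'X and X'y, the entries are summed with a masked ring-style summation, and the pooled normal equations are then solved.

```python
import random

MODULUS = 2**80   # assumed public modulus, far larger than any encoded sum
SCALE = 10**6     # assumed fixed-point scale for real-valued statistics


def secure_sum(local_values, rng=random):
    """Simulate masked summation of one integer per party around a ring."""
    mask = rng.randrange(MODULUS)                 # known only to the first party
    running = (mask + local_values[0]) % MODULUS  # first party hides its value
    for value in local_values[1:]:                # each party adds its own value
        running = (running + value) % MODULUS
    total = (running - mask) % MODULUS            # first party removes the mask
    # Map the residue back to a signed integer.
    return total - MODULUS if total > MODULUS // 2 else total


def local_stats(X, y):
    """Flattened X'X followed by X'y, computed from one party's rows only."""
    p = len(X[0])
    xtx = [sum(r[i] * r[j] for r in X) for i in range(p) for j in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(p)]
    return xtx + xty


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [A[i][:] + [b[i]] for i in range(n)]
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[pivot] = M[pivot], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def secure_regression(parties):
    """Least squares coefficients from securely summed local statistics."""
    p = len(parties[0][0][0])
    encoded = [[round(s * SCALE) for s in local_stats(X, y)]
               for X, y in parties]
    pooled = [secure_sum([e[k] for e in encoded]) / SCALE
              for k in range(p * p + p)]
    xtx = [pooled[i * p:(i + 1) * p] for i in range(p)]
    xty = pooled[p * p:]
    return solve(xtx, xty)


# Three hypothetical agencies whose rows all satisfy y = 1 + 2x.
parties = [
    ([[1, 0], [1, 1]], [1, 3]),
    ([[1, 2], [1, 3]], [5, 7]),
    ([[1, 4]], [9]),
]
beta = secure_regression(parties)
```

In an actual deployment each party would see only the masked running total passed to it, and the mask would preferably come from a cryptographic source such as `secrets.randbelow` rather than `random`. The error variance could be handled in the same way by also summing the local values of y'y and the row counts.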
Pages: 263-279
Number of pages: 17