Optimal disclosure limitation strategy in statistical databases: Deterring tracker attacks through additive noise

Cited by: 41
Authors
Duncan, GT [1]
Mukherjee, S
Affiliations
[1] Carnegie Mellon Univ, Heinz Sch Publ Policy & Management, Pittsburgh, PA 15213 USA
[2] Carnegie Mellon Univ, Dept Stat, Pittsburgh, PA 15213 USA
[3] Nova SE Univ, Sch Comp & Informat Sci, Ft Lauderdale, FL 33314 USA
Keywords
autocorrelation; computer database; confidentiality; data access; data perturbation; privacy
DOI
10.2307/2669452
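Chinese Library Classification (CLC) number
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Subject classification codes
020208; 070103; 0714
Abstract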
Disclosure limitation methods transform statistical databases to protect confidentiality, a practical concern of statistical agencies. A statistical database responds to queries with aggregate statistics. The database administrator should maximize legitimate data access while keeping the risk of disclosure below an acceptable level. Legitimate users seek statistical information, generally in aggregate form; malicious users, the data snoopers, attempt to infer confidential information about an individual data subject. Tracker attacks are of special concern for databases accessed online. This article derives optimal disclosure limitation strategies under tracker attacks for the important case of data masking through additive noise. Operational measures of the utility of data access and of disclosure risk are developed. The utility of data access is expressed so that trade-offs can be made between the quantity and the quality of data to be released. Application is made to Ohio data from the 1990 census. The article derives conditions under which an attack by a data snooper is better thwarted by a combination of query restriction and data masking than by either disclosure limitation method separately. Data masking by independent noise addition and data perturbation are considered as extreme cases in the continuum of data masking using positively correlated additive noise. Optimal strategies are established for the data snooper. Circumstances are determined under which adding autocorrelated noise is preferable to using existing methods of either independent noise addition or data perturbation. Both moving average and autoregressive noise addition are considered.
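
As a rough illustration of the continuum the abstract describes, the Python sketch below considers a snooper who repeats one query n times and averages the masked answers. It is not the article's derivation. The stationary AR(1) noise model, the correlation parameter `rho`, and all function names are assumptions introduced here. Under that assumption, `rho = 0` corresponds to independent noise addition and `rho = 1` to data perturbation (the same noise value returned every time).

```python
import random


def variance_of_averaged_noise(n, rho, sigma=1.0):
    """Variance of the mean of n noise terms with corr(e_i, e_j) = rho**|i-j|.

    Assumes stationary noise with marginal variance sigma**2 (illustrative model).
    """
    total = sum(rho ** abs(i - j) for i in range(n) for j in range(n))
    return sigma ** 2 * total / n ** 2


def ar1_noise(n, rho, sigma=1.0, rng=None):
    """Draw n stationary AR(1) noise values with marginal variance sigma**2."""
    rng = rng or random.Random()
    e = rng.gauss(0.0, sigma)
    out = [e]
    # Innovation scale chosen so the marginal variance stays at sigma**2.
    innov_sd = sigma * (1.0 - rho ** 2) ** 0.5
    for _ in range(n - 1):
        e = rho * e + rng.gauss(0.0, innov_sd)
        out.append(e)
    return out


def snooper_estimate(true_value, n, rho, sigma=1.0, rng=None):
    """Average of n masked responses to the same repeated query."""
    noise = ar1_noise(n, rho, sigma, rng)
    return sum(true_value + e for e in noise) / n
```

With `n = 10` and `sigma = 1`, `variance_of_averaged_noise` gives 0.1 for `rho = 0` and 1.0 for `rho = 1`. In this toy model, averaging repeated answers shrinks independent noise but leaves fully correlated noise untouched; intermediate values of `rho` fall between the two.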
Pages: 720-729
Page count: 10