Closeness: A New Privacy Measure for Data Publishing

Cited by: 116

Authors:

Li, Ninghui [1]

Li, Tiancheng [1]

Venkatasubramanian, Suresh [2]

Affiliations:

[1] Purdue Univ, Dept Comp Sci, W Lafayette, IN 47906 USA

[2] Univ Utah, Sch Comp, Salt Lake City, UT 84112 USA

Source:

IEEE Transactions on Knowledge and Data Engineering | 2010 / Vol. 22 / No. 7

Keywords:

Privacy preservation; data anonymization; data publishing; data security; k-anonymity

DOI:

10.1109/TKDE.2009.139

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Subject classification codes:

081104; 0812; 0835; 1405

Abstract:

The k-anonymity privacy requirement for publishing microdata requires that each equivalence class (i.e., a set of records that are indistinguishable from each other with respect to certain "identifying" attributes) contains at least k records. Recently, several authors have recognized that k-anonymity cannot prevent attribute disclosure. The notion of l-diversity has been proposed to address this; l-diversity requires that each equivalence class has at least l well-represented (in Section 2) values for each sensitive attribute. In this paper, we show that l-diversity has a number of limitations. In particular, it is neither necessary nor sufficient to prevent attribute disclosure. Motivated by these limitations, we propose a new notion of privacy called "closeness." We first present the base model t-closeness, which requires that the distribution of a sensitive attribute in any equivalence class is close to the distribution of the attribute in the overall table (i.e., the distance between the two distributions should be no more than a threshold t). We then propose a more flexible privacy model called (n, t)-closeness that offers higher utility. We describe our desiderata for designing a distance measure between two probability distributions and present two distance measures. We discuss the rationale for using closeness as a privacy measure and illustrate its advantages through examples and experiments.
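The three requirements described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. Several things here are assumptions made for illustration only: the abstract does not name its two distance measures, so total variation distance is used as a stand-in; "well-represented" is read in its simplest form, as "distinct"; and the toy table, the attribute names (`zip`, `disease`), and all function names are hypothetical.

```python
from collections import Counter, defaultdict


def equivalence_classes(records, quasi_identifiers):
    """Group records that share the same quasi-identifier values."""
    groups = defaultdict(list)
    for rec in records:
        key = tuple(rec[q] for q in quasi_identifiers)
        groups[key].append(rec)
    return list(groups.values())


def distribution(records, sensitive):
    """Empirical distribution of the sensitive attribute over the records."""
    counts = Counter(rec[sensitive] for rec in records)
    total = sum(counts.values())
    return {value: n / total for value, n in counts.items()}


def total_variation(p, q):
    """Stand-in distance (an assumption): half the L1 distance between p and q."""
    support = set(p) | set(q)
    return 0.5 * sum(abs(p.get(v, 0.0) - q.get(v, 0.0)) for v in support)


def is_k_anonymous(classes, k):
    """Every equivalence class contains at least k records."""
    return all(len(c) >= k for c in classes)


def is_distinct_l_diverse(classes, sensitive, l):
    """Every class has at least l distinct sensitive values
    (the simplest reading of "well-represented", assumed here)."""
    return all(len({rec[sensitive] for rec in c}) >= l for c in classes)


def max_class_distance(classes, records, sensitive):
    """Largest distance between any class distribution and the overall one."""
    overall = distribution(records, sensitive)
    return max(
        total_variation(distribution(c, sensitive), overall) for c in classes
    )


def is_t_close(classes, records, sensitive, t):
    """Every class distribution is within distance t of the overall one."""
    return max_class_distance(classes, records, sensitive) <= t


# Hypothetical generalized table: "zip" is the quasi-identifier,
# "disease" is the sensitive attribute.
records = [
    {"zip": "479**", "disease": "flu"},
    {"zip": "479**", "disease": "flu"},
    {"zip": "479**", "disease": "flu"},
    {"zip": "476**", "disease": "flu"},
    {"zip": "476**", "disease": "cancer"},
    {"zip": "476**", "disease": "hepatitis"},
]
classes = equivalence_classes(records, ["zip"])
```

On this toy table, `is_k_anonymous(classes, 3)` holds while `is_distinct_l_diverse(classes, "disease", 2)` does not, because every record in the first class has the same disease. The closeness check instead compares each class's distribution with the table-wide one, so the threshold `t` controls how far a class may deviate. A different distance function can be swapped in for `total_variation` without changing the rest of the sketch.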

Pages: 943-956

Page count: 14
