An efficient clustering algorithm for k-anonymisation

Cited by: 22
Authors
Loukides, Grigorios [1]
Shao, Jian-Hua [1]
Affiliation
[1] Cardiff Univ, Sch Comp Sci, Cardiff, Wales
Keywords
k-anonymisation; data privacy; greedy clustering
DOI
10.1007/s11390-008-9121-3
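Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812 [Computer Science and Technology]
Abstract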
K-anonymisation is an approach to protecting individuals from being identified from data. Good k-anonymisations should retain data utility and preserve privacy, but few methods have considered these two conflicting requirements together. In this paper, we extend our previous work on a clustering-based method for balancing data utility and privacy protection, and propose a set of heuristics to improve its effectiveness. We introduce new clustering criteria that treat utility and privacy on equal terms, and propose sampling-based techniques for setting the method's parameters optimally. Extensive experiments show that the extended method achieves good accuracy in query answering and is able to prevent linking attacks effectively.
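As an illustration of the property the abstract refers to, the following minimal sketch checks whether a table is k-anonymous, meaning that every combination of quasi-identifier values is shared by at least k records. It is not the paper's clustering method; the function name `is_k_anonymous`, the attribute names, and the sample records are all hypothetical and chosen only for this example.

```python
from collections import Counter

def is_k_anonymous(records, quasi_identifiers, k):
    """Return True if every combination of quasi-identifier values
    occurs in at least k records (hypothetical helper, not from the paper)."""
    # Count how many records share each quasi-identifier combination.
    groups = Counter(
        tuple(record[attr] for attr in quasi_identifiers) for record in records
    )
    return all(size >= k for size in groups.values())

# Hypothetical, already-generalised records (illustrative data only).
records = [
    {"age": "20-29", "postcode": "CF10", "diagnosis": "flu"},
    {"age": "20-29", "postcode": "CF10", "diagnosis": "asthma"},
    {"age": "30-39", "postcode": "CF24", "diagnosis": "flu"},
    {"age": "30-39", "postcode": "CF24", "diagnosis": "diabetes"},
]

two_anonymous = is_k_anonymous(records, ["age", "postcode"], 2)
three_anonymous = is_k_anonymous(records, ["age", "postcode"], 3)
```

Here each (age, postcode) group contains two records, so the table would satisfy the check for k = 2 but not for k = 3. A clustering-based approach such as the one the abstract describes would aim to form such groups while limiting the loss of data utility.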
Pages: 188-202
Number of pages: 15