Hiding sensitive knowledge without side effects

Cited by: 38
Authors
Gkoulalas-Divanis, Aris [1]
Verykios, Vassilios S. [1]
Affiliations
[1] Univ Thessaly, Dept Comp & Commun Engn, Volos 38221, Greece
Keywords
Data mining; Association rule hiding; Borders of frequent itemsets; Parallelization
DOI
10.1007/s10115-008-0178-7
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Sensitive knowledge hiding in large transactional databases is one of the major goals of privacy preserving data mining. However, it is only recently that researchers were able to identify exact solutions for the hiding of knowledge, depicted in the form of sensitive frequent itemsets and their related association rules. Exact solutions allow for the hiding of vulnerable knowledge without any critical compromises, such as the hiding of nonsensitive patterns or the accidental uncovering of infrequent itemsets, amongst the frequent ones, in the sanitized outcome. In this paper, we highlight the process of border revision, which plays a significant role towards the identification of exact hiding solutions, and we provide efficient algorithms for the computation of the revised borders. Furthermore, we review two algorithms that identify exact hiding solutions, and we extend the functionality of one of them to effectively identify exact solutions for a wider range of problems (than its original counterpart). Following that, we introduce a novel framework for decomposition and parallel solving of hiding problems, which are handled by each of these approaches. This framework improves to a substantial degree the size of the problems that both algorithms can handle and significantly decreases their runtime. Through experimentation, we demonstrate the effectiveness of these approaches toward providing high quality knowledge hiding solutions.
Pages: 263-299
Page count: 37