Mining Attribute-structure Correlated Patterns in Large Attributed Graphs

Cited by: 91
Authors
Silva, Arlei [1]
Meira, Wagner, Jr. [1]
Zaki, Mohammed J. [2]
Affiliations
[1] Univ Fed Minas Gerais, Belo Horizonte, MG, Brazil
[2] Rensselaer Polytech Inst, Troy, NY USA
Source
Proceedings of the VLDB Endowment | 2012, Vol. 5, No. 5
Funding
National Science Foundation (USA)
Keywords
DOI
10.14778/2140436.2140443
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
In this work, we study the correlation between attribute sets and the occurrence of dense subgraphs in large attributed graphs, a task we call structural correlation pattern mining. A structural correlation pattern is a dense subgraph induced by a particular attribute set. Existing methods are not able to extract relevant knowledge regarding how vertex attributes interact with dense subgraphs. Structural correlation pattern mining combines aspects of frequent itemset and quasi-clique mining problems. We propose statistical significance measures that compare the structural correlation of attribute sets against their expected values using null models. Moreover, we evaluate the interestingness of structural correlation patterns in terms of size and density. An efficient algorithm that combines search and pruning strategies in the identification of the most relevant structural correlation patterns is presented. We apply our method for the analysis of three real-world attributed graphs: a collaboration, a music, and a citation network, verifying that it provides valuable knowledge in a feasible time.
Pages: 466-477
Page count: 12