Anomaly detection in data represented as graphs

Cited by: 60
Authors
Eberle, William [1]
Holder, Lawrence [2]
Affiliations
[1] Tennessee Technol Univ, Dept Comp Sci, Cookeville, TN 38505 USA
[2] Washington State Univ, Sch Elect Engn & Comp Sci, Pullman, WA 99164 USA
Keywords
graph-based anomaly detection; minimum description length principle; information theoretic compression
DOI
10.3233/IDA-2007-11606
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
An important area of data mining is anomaly detection, particularly for fraud. However, little work has been done on detecting anomalies in data that is represented as a graph. In this paper we present graph-based approaches to uncovering anomalies in domains where the anomalies consist of unexpected entity/relationship alterations that closely resemble non-anomalous behavior. We have developed three algorithms for detecting anomalies in all three types of possible graph changes: label modifications, vertex/edge insertions, and vertex/edge deletions. Each algorithm focuses on one of these anomaly types and uses the minimum description length principle to first discover the normative pattern. Once the normative pattern is known, each algorithm then uses a different approach to discover its particular anomaly type. We validate all three approaches using synthetic data, verifying that each algorithm, on graphs and anomalies of varying sizes, detects the anomalies with very high detection rates and minimal false positives. We then further validate the algorithms on real-world cargo data with actual fraud scenarios injected into the data set, achieving 100% accuracy and no false positives. Each of these algorithms demonstrates the usefulness of examining a graph-based representation of data for the purposes of detecting fraud.
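The two-stage idea in the abstract (find the normative pattern by compression, then look for near-matches) can be illustrated with a toy sketch. This is not the authors' algorithm. It assumes a graph already split into small candidate instances, uses a crude size count in place of a real description length, and covers only the label-modification case. The function names and the cargo-style labels are hypothetical.

```python
from collections import Counter

# Assumption: each instance is one small subgraph, encoded as a tuple of
# labeled edges (source_label, edge_label, target_label). This flat
# encoding is for illustration only.


def size(instance):
    """Crude description-length proxy: edges plus distinct vertex labels."""
    labels = {e[0] for e in instance} | {e[2] for e in instance}
    return len(instance) + len(labels)


def normative_pattern(instances):
    """Return the pattern minimizing size(pattern) + size(compressed graph)."""
    counts = Counter(instances)
    total = sum(size(i) for i in instances)

    def compressed_cost(pattern):
        # Each exact occurrence collapses to a single placeholder vertex.
        saved = counts[pattern] * (size(pattern) - 1)
        return size(pattern) + total - saved

    return min(counts, key=compressed_cost)


def label_changes(pattern, instance):
    """Count differing labels between same-shaped instances, else None."""
    if len(pattern) != len(instance):
        return None
    return sum(a != b
               for p_edge, i_edge in zip(pattern, instance)
               for a, b in zip(p_edge, i_edge))


def flag_label_anomalies(instances, max_changes=1):
    """Flag instances that differ from the normative pattern by a few labels."""
    pattern = normative_pattern(instances)
    flagged = []
    for idx, inst in enumerate(instances):
        changes = label_changes(pattern, inst)
        if changes is not None and 0 < changes <= max_changes:
            flagged.append((idx, changes))
    return pattern, flagged


# Hypothetical data: five ordinary shipments and one with an altered port.
normal = (("shipment", "from", "port_A"), ("shipment", "to", "port_B"))
odd = (("shipment", "from", "port_X"), ("shipment", "to", "port_B"))
instances = [normal] * 5 + [odd]

pattern, flagged = flag_label_anomalies(instances)
print(pattern == normal, flagged)  # True [(5, 1)]
```

The sketch flags only instances that are close to the normative pattern, which mirrors the abstract's point that the anomalies of interest closely resemble non-anomalous behavior. Insertions and deletions would need a different comparison than the positional label count used here.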
Pages: 663-689
Page count: 27