Distribution-based anomaly detection via generalized likelihood ratio test: A general Maximum Entropy approach

Cited by: 44
Authors
Coluccia, A. [1]
D'Alconzo, A. [2]
Ricciato, F. [1]
Affiliations
[1] Univ Salento, I-73100 Lecce, Italy
[2] FTW, Vienna, Austria
Keywords
Anomaly detection; Maximum Entropy (ME); Network traffic; Generalized Likelihood Ratio Test (GLRT); Maximum Likelihood (ML); 3G cellular networks
DOI
10.1016/j.comnet.2013.07.028
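Chinese Library Classification (CLC) number
TP3 [Computing Technology, Computer Technology]
Discipline classification code
080201 [Mechanical Manufacturing and Automation]
Abstract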
We address the problem of detecting "anomalies" in the network traffic produced by a large population of end-users, following a distribution-based change detection approach. In the considered scenario, different traffic variables are monitored at different levels of temporal aggregation (timescales), resulting in a grid of variable/timescale nodes. For every node, a set of per-user traffic counters is maintained and then summarized into histograms for every time bin, obtaining a timeseries of empirical (discrete) distributions for every variable/timescale node. Within this framework, we tackle the problem of designing a formal Distribution-based Change Detector (DCD) able to identify statistically significant deviations from the past behavior of each individual timeseries. For the detection task we propose a novel methodology based on a Maximum Entropy (ME) modeling approach. Each empirical distribution (sample observation) is mapped to a set of ME model parameters, called the "characteristic vector", via closed-form Maximum Likelihood (ML) estimation. This allows us to derive a detection rule based on a formal hypothesis test (Generalized Likelihood Ratio Test, GLRT) to measure the coherence of the current observation, i.e., its characteristic vector, with the given reference. The latter is dynamically identified, taking into account the typical non-stationarity displayed by real network traffic. Numerical results on synthetic data demonstrate the robustness of our detector, while the evaluation on a labeled dataset from an operational 3G cellular network confirms the capability of the proposed method to identify real traffic anomalies. © 2013 Elsevier B.V. All rights reserved.
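To make the idea of a GLRT on histograms concrete, the following is a minimal sketch and not the authors' detector. It swaps the paper's ME characteristic vector for a plain categorical (multinomial) model, whose ML estimates are simply the normalized bin counts, and it compares the current histogram against a single fixed reference histogram rather than a dynamically identified one. The function names `glrt_statistic` and `is_anomalous` and the `threshold` parameter are hypothetical, introduced only for illustration.

```python
import numpy as np


def _loglik(counts, probs):
    """Categorical log-likelihood of bin counts; empty bins contribute zero."""
    mask = counts > 0
    return float(np.sum(counts[mask] * np.log(probs[mask])))


def glrt_statistic(ref_counts, cur_counts):
    """2 * log generalized likelihood ratio for two histograms.

    H0: reference and current histograms share one categorical distribution.
    H1: each histogram has its own categorical distribution.
    Assumption: a simple multinomial stand-in, not the paper's ME model.
    """
    ref = np.asarray(ref_counts, dtype=float)
    cur = np.asarray(cur_counts, dtype=float)
    pooled = ref + cur

    # H1: closed-form ML estimates are the per-histogram bin frequencies.
    l1 = _loglik(ref, ref / ref.sum()) + _loglik(cur, cur / cur.sum())

    # H0: closed-form ML estimate is the pooled bin frequency.
    p0 = pooled / pooled.sum()
    l0 = _loglik(ref, p0) + _loglik(cur, p0)

    return 2.0 * (l1 - l0)


def is_anomalous(ref_counts, cur_counts, threshold):
    """Flag the current histogram when the statistic exceeds the threshold.

    `threshold` is a hypothetical tuning parameter chosen by the user.
    """
    return glrt_statistic(ref_counts, cur_counts) > threshold


if __name__ == "__main__":
    reference = [50, 30, 20]
    print(glrt_statistic(reference, [50, 30, 20]))
    print(is_anomalous(reference, [20, 30, 50], threshold=9.21))
```

Under the usual large-sample approximation, the statistic for K bins is compared against a chi-square quantile with K - 1 degrees of freedom; the value 9.21 used above is roughly the 0.99 quantile for two degrees of freedom. Real traffic histograms may violate the independence assumptions behind that approximation, so the threshold would normally be calibrated on past data.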
Pages: 3446-3462
Number of pages: 17