DETECTION OF AN ANOMALOUS CLUSTER IN A NETWORK

被引:109
作者
Arias-Castro, Ery [1 ]
Candes, Emmanuel J. [2 ,3 ]
Durand, Arnaud [4 ]
机构
[1] Univ Calif San Diego, Dept Math, La Jolla, CA 92093 USA
[2] Stanford Univ, Dept Math, Stanford, CA 94305 USA
[3] Stanford Univ, Dept Stat, Stanford, CA 94305 USA
[4] Univ Paris 11, Math Lab, UMR8628, F-91405 Orsay, France
基金
美国国家科学基金会;
关键词
Detecting a cluster of nodes in a network; minimax detection; Bayesian detection; scan statistic; generalized likelihood ratio test; disease outbreak detection; sensor networks; Richardson's model; cellular automata; HIGHER CRITICISM; RANDOM GROWTH; SCAN; TRACKING; CLASSIFICATION; SURVEILLANCE; SIGNALS; MODELS;
D O I
10.1214/10-AOS839
中图分类号
O21 [概率论与数理统计]; C8 [统计学];
学科分类号
020208 ; 070103 ; 0714 ;
摘要
We consider the problem of detecting whether or not, in a given sensor network, there is a cluster of sensors which exhibit an "unusual behavior." Formally, suppose we are given a set of nodes and attach a random variable to each node. We observe a realization of this process and want to decide between the following two hypotheses: under the null, the variables are i.i.d. standard normal; under the alternative, there is a cluster of variables that are i.i.d. normal with positive mean and unit variance, while the rest are i.i.d. standard normal. We also address surveillance settings where each sensor in the network collects information over time. The resulting model is similar, now with a time series attached to each node. We again observe the process over time and want to decide between the null, where all the variables are i.i.d. standard normal, and the alternative, where there is an emerging cluster of i.i.d. normal variables with positive mean and unit variance. The growth models used to represent the emerging cluster are quite general and, in particular, include cellular automata used in modeling epidemics. In both settings, we consider classes of clusters that are quite general, for which we obtain a lower bound on their respective minimax detection rate and show that some form of scan statistic, by far the most popular method in practice, achieves that same rate to within a logarithmic factor. Our results are not limited to the normal location model, but generalize to any one-parameter exponential family when the anomalous clusters are large enough.
引用
收藏
页码:278 / 304
页数:27
相关论文
共 76 条
[1]   ON COMBINATORIAL TESTING PROBLEMS [J].
Addario-Berry, Louigi ;
Broutin, Nicolas ;
Devroye, Luc ;
Lugosi, Gabor .
ANNALS OF STATISTICS, 2010, 38 (05) :3063-3092
[2]  
AGUR S, 1999, EPIDEMIOLOGY CELLULA
[3]   A survey on sensor networks [J].
Akyildiz, IF ;
Su, WL ;
Sankarasubramaniam, Y ;
Cayirci, E .
IEEE COMMUNICATIONS MAGAZINE, 2002, 40 (08) :102-114
[4]  
Aldosari SA, 2004, INT CONF ACOUST SPEE, P277
[5]  
[Anonymous], 2001, Scan Statistics
[6]  
[Anonymous], 2003, OXFORD STUDIES PROBA
[7]  
[Anonymous], 2004, Wireless Sensor Networks, First Edition: An Information Processing Approach
[8]  
[Anonymous], 2005, SPRINGER MG MATH
[9]  
[Anonymous], 2001, CELLULAR AUTOMATA A, DOI DOI 10.1142/4702
[10]   Near-optimal detection of geometric objects by fast multiscale methods [J].
Arias-Castro, E ;
Donoho, DL ;
Huo, XM .
IEEE TRANSACTIONS ON INFORMATION THEORY, 2005, 51 (07) :2402-2425