DETECTION OF AN ANOMALOUS CLUSTER IN A NETWORK

Cited by: 109
Authors
Arias-Castro, Ery [1]
Candes, Emmanuel J. [2, 3]
Durand, Arnaud [4]
Affiliations
[1] Univ Calif San Diego, Dept Math, La Jolla, CA 92093 USA
[2] Stanford Univ, Dept Math, Stanford, CA 94305 USA
[3] Stanford Univ, Dept Stat, Stanford, CA 94305 USA
[4] Univ Paris 11, Math Lab, UMR8628, F-91405 Orsay, France
Funding
National Science Foundation (USA);
Keywords
Detecting a cluster of nodes in a network; minimax detection; Bayesian detection; scan statistic; generalized likelihood ratio test; disease outbreak detection; sensor networks; Richardson's model; cellular automata; HIGHER CRITICISM; RANDOM GROWTH; SCAN; TRACKING; CLASSIFICATION; SURVEILLANCE; SIGNALS; MODELS;
DOI
10.1214/10-AOS839
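Chinese Library Classification number
O21 [Probability theory and mathematical statistics]; C8 [Statistics];
Discipline classification code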
020208 ; 070103 ; 0714 ;
Abstract
We consider the problem of detecting whether or not, in a given sensor network, there is a cluster of sensors which exhibit an "unusual behavior." Formally, suppose we are given a set of nodes and attach a random variable to each node. We observe a realization of this process and want to decide between the following two hypotheses: under the null, the variables are i.i.d. standard normal; under the alternative, there is a cluster of variables that are i.i.d. normal with positive mean and unit variance, while the rest are i.i.d. standard normal. We also address surveillance settings where each sensor in the network collects information over time. The resulting model is similar, now with a time series attached to each node. We again observe the process over time and want to decide between the null, where all the variables are i.i.d. standard normal, and the alternative, where there is an emerging cluster of i.i.d. normal variables with positive mean and unit variance. The growth models used to represent the emerging cluster are quite general and, in particular, include cellular automata used in modeling epidemics. In both settings, we consider classes of clusters that are quite general, for which we obtain a lower bound on their respective minimax detection rate and show that some form of scan statistic, by far the most popular method in practice, achieves that same rate to within a logarithmic factor. Our results are not limited to the normal location model, but generalize to any one-parameter exponential family when the anomalous clusters are large enough.
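The static detection problem and the scan statistic described in the abstract can be illustrated with a minimal sketch. The abstract does not fix a particular network or cluster class, so this example assumes the simplest case: nodes arranged on a line, with candidate clusters being contiguous intervals. The function names `scan_statistic` and `simulate`, and the normalization by the square root of the cluster size, are illustrative choices for the normal location model, not the authors' own code.

```python
import math
import random


def scan_statistic(x, min_len=1, max_len=None):
    """Return (max_K sum_{i in K} x_i / sqrt(|K|), best interval).

    Assumption: clusters K are contiguous intervals [start, stop) of a
    line of nodes. Other cluster classes would need a different search.
    """
    n = len(x)
    if max_len is None:
        max_len = n
    # Prefix sums let each interval sum be computed in O(1).
    prefix = [0.0]
    for value in x:
        prefix.append(prefix[-1] + value)
    best, best_interval = -math.inf, None
    for length in range(min_len, max_len + 1):
        scale = math.sqrt(length)
        for start in range(0, n - length + 1):
            z = (prefix[start + length] - prefix[start]) / scale
            if z > best:
                best, best_interval = z, (start, start + length)
    return best, best_interval


def simulate(n, cluster=None, mu=0.0, seed=0):
    """Draw i.i.d. N(0, 1) variables; shift those in `cluster` by `mu`.

    cluster=None gives the null hypothesis; cluster=(start, stop) with
    mu > 0 gives the alternative with an anomalous interval.
    """
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    if cluster is not None:
        start, stop = cluster
        for i in range(start, stop):
            x[i] += mu
    return x
```

In practice one would reject the null when the scan statistic exceeds a threshold, which could be calibrated, for example, by evaluating `scan_statistic` on many draws of `simulate(n)` under the null. The exhaustive interval search here costs on the order of n squared operations and is meant only to make the definition concrete.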
Pages: 278-304
Number of pages: 27