Estimating flow distributions from sampled flow statistics

Cited by: 105
Authors
Duffield, N [1]
Lund, C [1]
Thorup, M [1]
Affiliations
[1] AT&T Labs Res, Florham Pk, NJ 07932 USA
Keywords
IP flows; maximum likelihood estimation; measurement; measurement errors; packet sampling; sampling methods;
DOI
10.1109/TNET.2005.852874
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
Passive traffic measurement increasingly employs sampling at the packet level. Many high-end routers form flow statistics from a sampled substream of packets. Sampling controls the resources consumed by the measurement operations. However, knowledge of the statistics of flows in the unsampled stream remains useful for understanding both the characteristics of source traffic and the consumption of resources in the network. This paper provides methods that use flow statistics formed from a sampled packet stream to infer the frequencies of the number of packets per flow in the unsampled stream. A key task is to infer the properties of flows in the original traffic that evaded sampling altogether. We achieve this through statistical inference and by exploiting protocol-level detail reported in flow records. We investigate the impact of different versions of packet sampling on our results.
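As an illustration of the kind of statistical inference the abstract refers to, the following is a minimal sketch of an expectation-maximization (EM) estimate of the original packets-per-flow distribution from sampled flow lengths. It is not taken from the paper and should not be read as the authors' estimator. It assumes independent packet sampling with a known probability p, a known maximum flow length n_max, and it ignores the protocol-level detail mentioned above. The names thinning_prob and em_flow_sizes are hypothetical.

```python
from math import comb


def thinning_prob(i, j, p):
    """P(j of i packets are sampled) under independent sampling with probability p."""
    return comb(i, j) * p**j * (1 - p) ** (i - j)


def em_flow_sizes(g, p, n_max, iters=500):
    """Estimate phi[i-1] = fraction of original flows with i packets (i = 1..n_max).

    g[j-1] is the number of sampled flows with j sampled packets (j >= 1).
    Flows with zero sampled packets are never observed, so their count is
    treated as missing data. Assumes 0 < p <= 1 and len(g) <= n_max.
    """
    if len(g) > n_max:
        raise ValueError("a sampled flow cannot be longer than n_max")
    phi = [1.0 / n_max] * n_max      # start from a uniform guess
    observed = sum(g)                # number of flows that survived sampling
    for _ in range(iters):
        # Probability that a flow is missed entirely under the current guess.
        miss = sum(phi[i - 1] * thinning_prob(i, 0, p) for i in range(1, n_max + 1))
        total = observed / (1 - miss)   # expected number of original flows
        # E-step, unobserved part: expected missed flows of each original size.
        expected = [
            total * phi[i - 1] * thinning_prob(i, 0, p) for i in range(1, n_max + 1)
        ]
        # E-step, observed part: split each g[j-1] over possible original sizes i >= j.
        for j in range(1, len(g) + 1):
            if g[j - 1] == 0:
                continue
            weights = [phi[i - 1] * thinning_prob(i, j, p) for i in range(j, n_max + 1)]
            denom = sum(weights)
            for offset, w in enumerate(weights):
                expected[j - 1 + offset] += g[j - 1] * w / denom
        # M-step: renormalize the expected counts into a distribution.
        phi = [e / total for e in expected]
    return phi
```

A call such as em_flow_sizes([600, 250, 100, 50], p=0.5, n_max=4) returns an estimated distribution over original flow lengths 1 to 4. In this sketch the share of flows that evaded sampling is inferred through the miss term rather than observed directly.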
Pages: 933-946
Number of pages: 14