An Information-Theoretical Approach to High-Speed Flow Nature Identification

Citations: 34
Authors
Khakpour, Amir R. [1]
Liu, Alex X. [2]
Affiliations
[1] Michigan State Univ, Dept Comp Sci & Engn, E Lansing, MI 48824 USA
[2] Nanjing Univ, Dept Comp Sci & Technol, Nanjing 210093, Jiangsu, Peoples R China
Funding
National Natural Science Foundation of China; US National Science Foundation
Keywords
Flow content analysis; flow identification
DOI
10.1109/TNET.2012.2219591
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
This paper addresses, for the first time, the fundamental problem of identifying the content nature of a flow: text, binary, or encrypted. We propose Iustitia, a framework for identifying flow nature on the fly. The key observation behind Iustitia is that text flows have the lowest entropy and encrypted flows have the highest entropy, while the entropy of binary flows lies in between. We further extend Iustitia to finer-grained classification of binary flows, so that we can distinguish types of binary flows (such as images, videos, and executables) and even the file formats they carry (such as JPEG and GIF for images, MPEG and AVI for videos). The basic idea of Iustitia is to classify flows using machine learning techniques, where each feature is the entropy of a certain number of consecutive bytes. Our experimental results show that the classification can be done with high speed and high accuracy. On average, Iustitia classifies flows with 88.27% accuracy using a buffer size of 1 K, with a classification time of less than 10% of the packet interarrival time for 91.2% of flows.
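The entropy observation in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `shannon_entropy` and `window_entropies`, the 256-byte window, and the use of single-byte symbol frequencies are all assumptions made for illustration, and the abstract does not specify how Iustitia actually computes its features or which classifier consumes them.

```python
import math
from collections import Counter


def shannon_entropy(chunk: bytes) -> float:
    """Shannon entropy of the byte values in `chunk`, in bits per byte (0.0 to 8.0)."""
    if not chunk:
        return 0.0
    n = len(chunk)
    counts = Counter(chunk)
    # Sum of p * log2(1/p) over the observed byte values.
    return sum((c / n) * math.log2(n / c) for c in counts.values())


def window_entropies(buffer: bytes, window: int = 256) -> list[float]:
    """Entropy of each consecutive `window`-byte slice of `buffer`.

    The window size is a hypothetical choice for this sketch.
    """
    return [
        shannon_entropy(buffer[i:i + window])
        for i in range(0, len(buffer), window)
    ]


if __name__ == "__main__":
    text_like = b"GET /index.html HTTP/1.1\r\nHost: example.com\r\n\r\n" * 8
    uniform_like = bytes(range(256)) * 2  # stand-in for high-entropy content
    print(window_entropies(text_like))
    print(window_entropies(uniform_like))
```

Under these assumptions, the text-like buffer yields lower per-window entropy than the uniformly distributed buffer, which stands in for encrypted content. A per-window feature vector like this could be passed to any standard classifier.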
Pages: 1076-1089
Page count: 14