Classification of Phishing Email Using Random Forest Machine Learning Technique

Cited by: 81
Authors
Akinyelu, Andronicus A. [1]
Adewumi, Aderemi O. [1]
Affiliations
[1] Univ KwaZulu Natal, Sch Math Stat & Comp Sci, ZA-4000 Durban, South Africa
DOI
10.1155/2014/425731
Chinese Library Classification (CLC) number
O29 [Applied Mathematics]
Discipline classification code
070104
Abstract
Phishing is one of the major challenges facing e-commerce today. Phishing attacks have cost companies and individuals billions of dollars; in 2012, an online report put the loss due to phishing attacks at about $1.5 billion. The global impact of phishing attacks will continue to grow, so more efficient phishing detection techniques are needed to curb it. This paper investigates and reports the use of the random forest machine learning algorithm for classifying phishing attacks, with the main objective of developing an improved phishing email classifier that achieves better prediction accuracy with fewer features. From a dataset of 2000 phishing and ham emails, a set of prominent phishing email features (identified from the literature) was extracted and used by the machine learning algorithm, yielding a classification accuracy of 99.7% and low false negative (FN) and false positive (FP) rates.
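The approach summarized in the abstract (per-email features fed to a random forest, then scored by accuracy, FP rate, and FN rate) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation. The four feature names, the toy data, the tree depth, and the number of trees are hypothetical choices made for illustration, and the rate definitions used here (FP rate = FP / (FP + TN), FN rate = FN / (FN + TP), with phishing as the positive class) are the common ones and may differ from those in the paper.

```python
import random
from collections import Counter


def gini(labels):
    """Gini impurity of a non-empty list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(rows, labels, feature_ids):
    """Return the (feature, threshold) with the lowest weighted Gini, or None."""
    best, best_score = None, None
    for f in feature_ids:
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue  # a split must put samples on both sides
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if best_score is None or score < best_score:
                best, best_score = (f, t), score
    return best


def build_tree(rows, labels, n_sub, rng, depth=0, max_depth=5):
    """Grow one tree; every node looks at a random subset of n_sub features."""
    majority = Counter(labels).most_common(1)[0][0]
    if depth >= max_depth or len(set(labels)) == 1:
        return majority  # leaf node: a class label
    split = best_split(rows, labels, rng.sample(range(len(rows[0])), n_sub))
    if split is None:
        return majority
    f, t = split
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return (f, t,
            build_tree([r for r, _ in left], [y for _, y in left],
                       n_sub, rng, depth + 1, max_depth),
            build_tree([r for r, _ in right], [y for _, y in right],
                       n_sub, rng, depth + 1, max_depth))


def predict_tree(node, row):
    """Walk from the root to a leaf and return its class label."""
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if row[f] <= t else right
    return node


def fit_forest(rows, labels, n_trees=25, seed=0):
    """Train n_trees trees, each on a bootstrap sample of the data."""
    rng = random.Random(seed)
    n_sub = max(1, int(len(rows[0]) ** 0.5))  # sqrt(#features) per split
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]  # sample with replacement
        forest.append(build_tree([rows[i] for i in idx],
                                 [labels[i] for i in idx], n_sub, rng))
    return forest


def predict_forest(forest, row):
    """Majority vote over all trees."""
    return Counter(predict_tree(t, row) for t in forest).most_common(1)[0][0]


def rates(y_true, y_pred):
    """Accuracy, FP rate, FN rate; label 1 = phishing (positive), 0 = ham."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)
    return {"accuracy": (tp + tn) / len(pairs),
            "fp_rate": fp / (fp + tn) if fp + tn else 0.0,
            "fn_rate": fn / (fn + tp) if fn + tp else 0.0}


# Hypothetical toy features per email (NOT the paper's feature set):
# [has_ip_url, link_count, has_html_form, urgent_word_count]
X = [[1, 5, 1, 3], [1, 3, 0, 2], [0, 4, 1, 2], [1, 6, 1, 1], [0, 2, 1, 3],  # phishing
     [0, 1, 0, 0], [0, 0, 0, 0], [0, 2, 0, 1], [0, 1, 0, 0], [1, 1, 0, 0]]  # ham
y = [1] * 5 + [0] * 5

forest = fit_forest(X, y)
# Scored on the training data itself, so these numbers are optimistic.
print(rates(y, [predict_forest(forest, row) for row in X]))
```

A realistic evaluation would score the forest on emails held out from training (for example with cross-validation) rather than on the training set, as the toy run above does.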
Pages: 6