A combined SMOTE and PSO based RBF classifier for two-class imbalanced problems

Cited by: 110
Authors
Gao, Ming [1 ]
Hong, Xia [1 ]
Chen, Sheng [2 ]
Harris, Chris J. [2 ]
Affiliations
[1] Univ Reading, Sch Syst Engn, Reading RG6 6AY, Berks, England
[2] Univ Southampton, Sch Elect & Comp Sci, Southampton SO17 1BJ, Hants, England
Funding
UK Engineering and Physical Sciences Research Council (EPSRC);
Keywords
Imbalanced classification; Synthetic minority over-sampling technique; Radial basis function classifier; Orthogonal forward selection; Particle swarm optimisation; PARTICLE SWARM OPTIMIZATION; ALGORITHM; REGRESSION; SELECTION; NETWORKS;
DOI
10.1016/j.neucom.2011.06.010
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This contribution proposes a powerful technique for two-class imbalanced classification problems by combining the synthetic minority over-sampling technique (SMOTE) and the particle swarm optimisation (PSO) aided radial basis function (RBF) classifier. In order to enhance the significance of the small and specific region belonging to the positive class in the decision region, the SMOTE is applied to generate synthetic instances for the positive class to balance the training data set. Based on the over-sampled training data, the RBF classifier is constructed by applying the orthogonal forward selection procedure, in which the classifier's structure and the parameters of RBF kernels are determined using a PSO algorithm based on the criterion of minimising the leave-one-out misclassification rate. The experimental results obtained on a simulated imbalanced data set and three real imbalanced data sets are presented to demonstrate the effectiveness of our proposed algorithm. (C) 2011 Elsevier B.V. All rights reserved.
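The over-sampling step described in the abstract can be illustrated with a minimal sketch of the basic SMOTE idea: each synthetic positive-class instance is placed at a random point on the line segment between a minority sample and one of its nearest minority-class neighbours. This is an illustration under stated assumptions, not the authors' implementation. The function name `smote`, its parameters, and the choice to draw base samples at random (rather than generating a fixed number per sample) are assumptions made here for brevity.

```python
import numpy as np


def smote(minority, n_synthetic, k=5, seed=0):
    """Sketch of SMOTE-style over-sampling (hypothetical helper, not from the paper).

    minority    : array of shape (n, d) holding the positive-class samples
    n_synthetic : number of synthetic samples to generate
    k           : number of nearest minority neighbours to interpolate towards
    """
    rng = np.random.default_rng(seed)
    minority = np.asarray(minority, dtype=float)
    n = len(minority)
    if n < 2:
        raise ValueError("need at least two minority samples to interpolate")
    k = min(k, n - 1)

    # Pairwise squared Euclidean distances between minority samples.
    diff = minority[:, None, :] - minority[None, :, :]
    dist = np.einsum("ijk,ijk->ij", diff, diff)
    np.fill_diagonal(dist, np.inf)  # a sample is not its own neighbour

    # Indices of the k nearest minority neighbours of every sample.
    neighbours = np.argsort(dist, axis=1)[:, :k]

    # Assumption: base samples are drawn uniformly at random with replacement.
    base = rng.integers(0, n, size=n_synthetic)
    picked = neighbours[base, rng.integers(0, k, size=n_synthetic)]

    # Interpolate with a random gap in [0, 1) along each segment.
    gap = rng.random((n_synthetic, 1))
    return minority[base] + gap * (minority[picked] - minority[base])
```

In use, the returned array would be appended to the positive class so that the two classes are balanced before the classifier is trained. Because every synthetic point lies on a segment between two existing minority samples, the new points stay inside the region already occupied by the positive class.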
Pages: 3456-3466
Page count: 11