A multi-objective optimisation approach for class imbalance learning

Cited by: 88
Authors
Soda, Paolo [1]
Affiliations
[1] Integrated Res Ctr, Med Informat & Comp Sci Lab, I-00128 Rome, Italy
Keywords
Pattern recognition; Machine learning; Class imbalance learning; Multi-objective optimisation; Combination; Strategies
DOI
10.1016/j.patcog.2011.01.015
Chinese Library Classification
TP18 [Artificial intelligence theory]
Subject classification code
140502 [Artificial intelligence]
Abstract
Class imbalance limits the performance of most learning algorithms because they cannot cope with large differences between the number of samples in each class, which results in low predictive accuracy on the minority class. Several papers have therefore proposed algorithms that aim at more balanced performance. However, balancing the recognition accuracies of the classes very often harms the global accuracy: the accuracy on the minority class increases while the accuracy on the majority class decreases. This paper proposes an approach to overcome this limitation: for each classification act, it chooses between the output of a classifier trained on the original skewed distribution and the output of a classifier trained according to a learning method that addresses the curse of imbalanced data. This choice is driven by a parameter whose value maximizes, on a validation set, two objective functions, i.e. the global accuracy and the accuracies for each class. A series of experiments on ten public datasets with different proportions between the majority and minority classes shows that the proposed approach provides more balanced recognition accuracies than classifiers trained according to traditional learning methods for imbalanced data, as well as larger global accuracy than classifiers trained on the original skewed distribution. (C) 2011 Elsevier Ltd. All rights reserved.
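The selection scheme described in the abstract can be illustrated with a minimal Python sketch. The abstract does not say how the parameter drives the choice or how the two objectives are combined, so everything below is an assumption made for illustration: the parameter is treated as a threshold on a reliability score of the imbalance-aware classifier, per-class accuracy is summarised by its geometric mean, and the two objectives are scalarised by a plain sum. The names `choose_output`, `evaluate` and `select_threshold` are hypothetical and do not come from the paper.

```python
from math import prod


def choose_output(label_skewed, label_balanced, reliability, threshold):
    """Assumed rule: trust the imbalance-aware classifier when its
    reliability reaches the threshold, else fall back to the classifier
    trained on the original skewed distribution."""
    return label_balanced if reliability >= threshold else label_skewed


def evaluate(samples, threshold):
    """Return (global accuracy, geometric mean of per-class accuracies).

    Each sample is a tuple:
    (true_label, label_skewed, label_balanced, reliability).
    """
    correct, total = {}, {}
    for true_label, label_skewed, label_balanced, reliability in samples:
        predicted = choose_output(label_skewed, label_balanced,
                                  reliability, threshold)
        total[true_label] = total.get(true_label, 0) + 1
        correct[true_label] = (correct.get(true_label, 0)
                               + (predicted == true_label))
    global_acc = sum(correct.values()) / sum(total.values())
    per_class = [correct[c] / total[c] for c in total]
    g_mean = prod(per_class) ** (1 / len(per_class))
    return global_acc, g_mean


def select_threshold(validation_samples, candidates):
    """Pick the candidate maximising global accuracy + g-mean on the
    validation set (an assumed scalarisation of the two objectives)."""
    return max(candidates,
               key=lambda t: sum(evaluate(validation_samples, t)))
```

In this sketch a threshold of 0 always uses the imbalance-aware classifier and a threshold above the maximum reliability always uses the classifier trained on the skewed data; intermediate values trade off the two objectives.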
Pages: 1801-1810
Number of pages: 10