A two-stage evolutionary algorithm based on sensitivity and accuracy for multi-class problems

Cited by: 9
Authors
Antonio Gutierrez, Pedro [1]
Hervas-Martinez, Cesar [1]
Jose Martinez-Estudillo, Francisco [2]
Carbonero, Mariano [2]
Affiliations
[1] Univ Cordoba, Dept Comp Sci & Numer Anal, E-14071 Cordoba, Spain
[2] ETEA, Dept Management & Quantitat Methods, Cordoba 14005, Spain
Keywords
Classification; Multi-class; Sensitivity; Accuracy; Two-stage evolutionary algorithm; Imbalanced datasets; Multilogistic regression; Neural network; Optimization
DOI
10.1016/j.ins.2012.02.012
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
The machine learning community has traditionally used the correct classification rate, or accuracy (C), to measure classifier performance and has generally avoided reporting the classification level of each class in the results, especially for problems with more than two classes. C values alone are insufficient because they cannot capture the many factors that differentiate the performance of two classifiers. Receiver Operating Characteristic (ROC) analysis is an alternative that addresses these difficulties, but it can only be used for two-class problems. For this reason, this paper proposes a new approach for analysing classifiers based on two measures: C and sensitivity (S) (i.e., the minimum of the accuracies obtained for each class). These measures are optimised through a two-stage evolutionary process that applies two sequential fitness functions: entropy (E) in the first stage and a new fitness function, area (A), in the second stage. With these fitness functions, the C level was optimised in the first stage, and the S value of the classifier was generally improved in the second stage without significantly reducing C. This two-stage approach improved S values in the generalisation set (whereas an evolutionary algorithm (EA) based only on the S measure obtains worse S levels) and obtained both high C values and good classification levels for each class. The methodology was applied to 16 benchmark classification problems and two complex real-world problems in analytical chemistry and predictive microbiology. It obtained promising results when compared with other competitive multi-class classification algorithms and with a multi-objective alternative based on E and S. © 2012 Elsevier Inc. All rights reserved.
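As an illustration of the two measures named in the abstract, the following minimal sketch computes C (overall accuracy) and S (the minimum per-class accuracy) from a confusion matrix. The function name, the row-as-true-class convention, and the example matrix are assumptions made for this illustration and are not taken from the paper; the fitness functions E and A are not reproduced here because the abstract does not define them.

```python
from typing import List, Tuple


def accuracy_and_min_sensitivity(confusion: List[List[int]]) -> Tuple[float, float]:
    """Return (C, S) for a square confusion matrix.

    Assumed convention: confusion[i][j] counts patterns of true class i
    that were predicted as class j.
    """
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    c = correct / total  # C: fraction of all patterns classified correctly

    # Per-class accuracy: correctly classified patterns of class i
    # divided by the number of patterns that truly belong to class i.
    per_class = [
        row[i] / sum(row) for i, row in enumerate(confusion) if sum(row) > 0
    ]
    s = min(per_class)  # S: accuracy of the worst-classified class
    return c, s


# Hypothetical imbalanced 3-class problem: the third class has only 10 patterns.
example = [
    [90, 5, 5],
    [10, 80, 10],
    [6, 2, 2],
]
c_value, s_value = accuracy_and_min_sensitivity(example)
print(f"C = {c_value:.3f}, S = {s_value:.3f}")
```

In this made-up example C is about 0.82 while S is only 0.2, which shows how a high overall accuracy can hide a poorly classified minority class.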
Pages: 20-37
Number of pages: 18