Using evolutionary algorithms as instance selection for data reduction in KDD: An experimental study

Cited by: 236
Authors
Cano, JR [1]
Herrera, F
Lozano, M
Affiliations
[1] Univ Huelva, Dept Elect Engn Comp Syst & Automat, Escuela Super La Rabida, Huelva 21819, Spain
[2] Univ Granada, Dept Comp Sci & Artificial Intelligence, E-18071 Granada, Spain
Keywords
data mining (DM); data reduction; evolutionary algorithms (EAs); instance selection; knowledge discovery
DOI
10.1109/TEVC.2003.819265
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Evolutionary algorithms (EAs) are adaptive methods based on natural evolution that may be used for search and optimization. As data reduction in knowledge discovery in databases (KDD) can be viewed as a search problem, it could be solved using EAs. In this paper, we carry out an empirical study of the performance of four representative EA models, taking into account two instance selection perspectives: prototype selection and training set selection for data reduction in KDD. The paper includes a comparison between these algorithms and other nonevolutionary instance selection algorithms. The results show that the evolutionary instance selection algorithms consistently outperform the nonevolutionary ones, the main advantages being better instance reduction rates, higher classification accuracy, and models that are easier to interpret.
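To make the "data reduction as a search problem" framing concrete, the following is a minimal sketch of instance selection with a simple elitist genetic algorithm. It is not one of the four EA models evaluated in the paper, and nothing in it is taken from the abstract: the binary-mask encoding, the 1-NN leave-one-out accuracy, the weighting `alpha` between accuracy and reduction rate, and all function names and parameter values are illustrative assumptions.

```python
import random

def one_nn_accuracy(selected, data, labels):
    """Fraction of ALL instances classified correctly by 1-NN over the selected subset."""
    if not selected:
        return 0.0
    correct = 0
    for i, x in enumerate(data):
        # Leave-one-out: an instance may not act as its own nearest neighbour.
        candidates = [j for j in selected if j != i]
        if not candidates:
            continue  # no usable neighbour, counted as misclassified
        nearest = min(
            candidates,
            key=lambda j: sum((a - b) ** 2 for a, b in zip(x, data[j])),
        )
        correct += labels[nearest] == labels[i]
    return correct / len(data)

def fitness(mask, data, labels, alpha=0.5):
    """Assumed objective: weighted sum of accuracy and reduction rate."""
    selected = [i for i, bit in enumerate(mask) if bit]
    reduction = 1.0 - len(selected) / len(mask)
    return alpha * one_nn_accuracy(selected, data, labels) + (1 - alpha) * reduction

def evolve_subset(data, labels, pop_size=20, generations=30, p_mut=0.05, seed=0):
    """Search for a binary mask (1 = keep instance) with high fitness."""
    rng = random.Random(seed)
    n = len(data)
    score = lambda m: fitness(m, data, labels)
    population = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    best = max(population, key=score)
    for _ in range(generations):
        next_pop = [best[:]]  # elitism: carry the best mask over unchanged
        while len(next_pop) < pop_size:
            # Binary tournament selection of two parents.
            p1 = max(rng.sample(population, 2), key=score)
            p2 = max(rng.sample(population, 2), key=score)
            cut = rng.randrange(1, n)  # one-point crossover
            child = p1[:cut] + p2[cut:]
            # Bit-flip mutation with probability p_mut per gene.
            child = [bit ^ 1 if rng.random() < p_mut else bit for bit in child]
            next_pop.append(child)
        population = next_pop
        best = max(population, key=score)
    return best

if __name__ == "__main__":
    # Hypothetical toy data: two well-separated clusters.
    data = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
    labels = [0, 0, 0, 0, 1, 1, 1, 1]
    mask = evolve_subset(data, labels)
    print(mask, fitness(mask, data, labels))
```

Each chromosome is a bit mask over the training set, so the search space has 2^n candidate subsets; the assumed fitness rewards subsets that are both small and still classify the full set well, which mirrors the two criteria (reduction rate and accuracy) named in the abstract.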
Pages: 561-575
Number of pages: 15