Analysis of new variable selection methods for discriminant analysis

Cited by: 27
Authors
Pacheco, Joaquin
Casado, Silvia
Nunez, Laura
Gomez, Olga
Affiliations
[1] Univ Burgos, Dept Appl Econ, Burgos, Spain
[2] Inst Empresa, Sch Business, Dept Finance, Madrid, Spain
Keywords
variable selection; discriminant analysis; metaheuristics; GRASP; memetics; VNS; Tabu search
DOI
10.1016/j.csda.2006.04.019
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
Several methods to select variables that are subsequently used in discriminant analysis are proposed and analysed. The aim is to find, from a set of m variables, a smaller subset that enables an efficient classification of cases. Reducing dimensionality has several advantages, such as lower data acquisition costs, a better understanding of the final classification model, and an increase in the efficiency and efficacy of the model itself. The specific problem consists in finding, for a small integer value of p, the size-p subset of the original variables that yields the greatest percentage of hits in the discriminant analysis. To solve this problem, a series of techniques based on metaheuristic strategies is proposed. After performing some tests, it is found that they obtain significantly better results than the stepwise, backward or forward methods used by classic statistical packages. The way these methods work is illustrated with several examples. (c) 2006 Elsevier B.V. All rights reserved.
Pages: 1463-1478
Page count: 16
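
The selection problem stated in the abstract (choose the size-p subset of the m variables with the highest hit percentage) can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' method: a nearest-centroid rule stands in for discriminant analysis, the hit rate is measured on the same cases used to fit the rule, and the names `hit_rate` and `best_subset` are hypothetical. The search here is exhaustive, which is only workable for small m; the number of candidate subsets grows combinatorially, and that is the setting in which heuristic search becomes attractive.

```python
from itertools import combinations


def hit_rate(X, y, subset):
    """Fraction of cases classified correctly using only the columns in `subset`.

    Assumption: a nearest-centroid rule is used as a simple stand-in for
    discriminant analysis, and the rate is computed on the fitting data.
    """
    sums, counts = {}, {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(subset))
        for k, j in enumerate(subset):
            acc[k] += row[j]
        counts[label] = counts.get(label, 0) + 1
    # One centroid per class, restricted to the selected variables.
    centroids = {c: [s / counts[c] for s in sums[c]] for c in sums}

    def sq_dist(row, centre):
        return sum((row[j] - centre[k]) ** 2 for k, j in enumerate(subset))

    hits = 0
    for row, label in zip(X, y):
        predicted = min(centroids, key=lambda c: sq_dist(row, centroids[c]))
        hits += predicted == label
    return hits / len(y)


def best_subset(X, y, p):
    """Exhaustively return the size-p tuple of column indices with the best hit rate."""
    m = len(X[0])
    return max(combinations(range(m), p), key=lambda s: hit_rate(X, y, s))
```

Here `X` is a list of cases (each a list of m numeric values) and `y` the list of class labels; `best_subset(X, y, p)` returns the indices of the selected variables. Ties are resolved in favour of the first subset in lexicographic order.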