A variable selection method based on Tabu search for logistic regression models

Cited by: 59
Authors
Pacheco, Joaquin [1]
Casado, Silvia [1]
Nunez, Laura [2]
Affiliations
[1] Univ Burgos, Dept Appl Econ, Burgos, Spain
[2] Sch Business, Inst Empresa, Dept Finance, Madrid, Spain
Keywords
Variable selection; Logistic regression; Metaheuristics; Tabu search; Feature subset selection; Distribution algorithms; Survival; Network; Branch
DOI
10.1016/j.ejor.2008.10.007
Chinese Library Classification number
C93 [Management Science]
Discipline classification code
12; 1201; 1202; 120202
Abstract
A Tabu search method is proposed and analysed for selecting the variables that are subsequently used in logistic regression models. The aim is to find, among a set of m variables, a smaller subset that enables the efficient classification of cases. Reducing dimensionality has well-known advantages that are summarized in the literature. Specifically, for a small integer value of p, the problem is to find the subset of p of the original variables that yields the greatest percentage of hits (correctly classified cases) in logistic regression. The proposed Tabu search method performs a deep search of the solution space, alternating between a basic phase that uses simple moves and a diversification phase that explores regions not previously visited. Testing shows that it obtains significantly better results than the Stepwise, Backward or Forward methods used by classic statistical packages. Some results of applying these methods are presented. (C) 2008 Elsevier B.V. All rights reserved.
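The search described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not specify the move type, tabu rule, or diversification strategy, so the sketch assumes swap moves (one variable out, one in), a fixed tabu tenure, an aspiration criterion, and a frequency-based restart. The names `fit_logistic`, `hit_rate`, `tabu_select` and the parameters `tenure` and `stall_limit` are hypothetical, and the hit rate is computed on the training cases, which is also an assumption.

```python
import math
import random


def fit_logistic(X, y, epochs=200, lr=0.5):
    """Fit a logistic regression by batch gradient ascent (illustrative only)."""
    n, d = len(X), len(X[0])
    w = [0.0] * (d + 1)  # w[-1] is the intercept
    for _ in range(epochs):
        grad = [0.0] * (d + 1)
        for xi, yi in zip(X, y):
            z = w[-1] + sum(wj * xj for wj, xj in zip(w, xi))
            z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
            err = yi - 1.0 / (1.0 + math.exp(-z))
            for j in range(d):
                grad[j] += err * xi[j]
            grad[-1] += err
        w = [wj + lr * gj / n for wj, gj in zip(w, grad)]
    return w


def hit_rate(X, y, subset):
    """Fraction of cases classified correctly using only the columns in `subset`."""
    cols = sorted(subset)
    Xs = [[row[j] for j in cols] for row in X]
    w = fit_logistic(Xs, y)
    hits = 0
    for xi, yi in zip(Xs, y):
        z = w[-1] + sum(wj * xj for wj, xj in zip(w, xi))
        hits += int((z > 0) == (yi == 1))
    return hits / len(y)


def tabu_select(X, y, p, iters=20, tenure=3, stall_limit=5, seed=0):
    """Search for a size-p variable subset with a high hit rate (requires p < m)."""
    rng = random.Random(seed)
    m = len(X[0])
    cache = {}

    def score(s):
        key = frozenset(s)
        if key not in cache:
            cache[key] = hit_rate(X, y, key)
        return cache[key]

    current = set(rng.sample(range(m), p))
    best, best_score = set(current), score(current)
    tabu_until = {}   # variable -> first iteration at which it may move again
    freq = [0] * m    # how often each variable appeared in the current solution
    stall = 0
    for it in range(iters):
        if stall >= stall_limit:
            # Diversification phase (assumed rule): restart from least-used variables.
            order = sorted(range(m), key=lambda j: (freq[j], rng.random()))
            current, tabu_until, stall = set(order[:p]), {}, 0
        else:
            # Basic phase: take the best non-tabu swap (one out, one in).
            moves = []
            for out in current:
                for inn in set(range(m)) - current:
                    s = score((current - {out}) | {inn})
                    tabu = max(tabu_until.get(out, 0), tabu_until.get(inn, 0)) > it
                    if not tabu or s > best_score:  # aspiration criterion
                        moves.append((s, out, inn))
            if moves:
                s, out, inn = max(moves)
                current = (current - {out}) | {inn}
                tabu_until[out] = tabu_until[inn] = it + 1 + tenure
        for j in current:
            freq[j] += 1
        if score(current) > best_score:
            best, best_score, stall = set(current), score(current), 0
        else:
            stall += 1
    return best, best_score
```

Calling `tabu_select(X, y, p=2)` on a list of feature rows `X` and 0/1 labels `y` returns the best subset found and its hit rate. Caching scores by subset keeps repeated evaluations cheap, since each evaluation refits the model; a production version would likely use an established logistic regression routine instead of the hand-written fit.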
Pages: 506-511
Page count: 6