A variable selection method based on Tabu search for logistic regression models

Cited by: 59
Authors
Pacheco, Joaquin [1]
Casado, Silvia [1]
Nunez, Laura [2]
Affiliations
[1] Univ Burgos, Dept Appl Econ, Burgos, Spain
[2] Sch Business, Inst Empresa, Dept Finance, Madrid, Spain
Keywords
Variable selection; Logistic regression; Metaheuristics; Tabu search; FEATURE SUBSET-SELECTION; DISTRIBUTION ALGORITHMS; SURVIVAL; NETWORK; BRANCH;
DOI
10.1016/j.ejor.2008.10.007
Chinese Library Classification (CLC) number
C93 [Management Science]
Discipline classification code
12; 1201; 1202; 120202
Abstract
A Tabu search method is proposed and analysed for selecting the variables subsequently used in logistic regression models. The aim is to find, from among a set of m variables, a smaller subset that enables the efficient classification of cases. Reducing dimensionality has well-known advantages that are summarized in the literature. The specific problem consists of finding, for a small integer value of p, a subset of size p of the original set of variables that yields the greatest percentage of hits in logistic regression. The proposed Tabu search method performs a deep search of the solution space, alternating between a basic phase that uses simple moves and a diversification phase that explores regions not previously visited. Testing shows that it obtains significantly better results than the Stepwise, Backward or Forward methods used by classic statistical packages. Some results of applying these methods are presented. © 2008 Elsevier B.V. All rights reserved.
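The search described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the swap neighbourhood, the tabu tenure, the aspiration rule, the frequency-based restart and every name below (`hit_rate`, `tabu_search`, `tenure`, `patience`) are assumptions chosen for illustration, and the hit rate is computed in-sample with a plain gradient-descent logistic fit rather than a statistical package.

```python
import math
import random


def hit_rate(X, y, cols, epochs=200, lr=0.5):
    """In-sample share of correctly classified cases for a logistic model
    fitted on the columns in `cols` by plain batch gradient descent."""
    cols = sorted(cols)
    w = [0.0] * (len(cols) + 1)  # w[0] is the intercept

    def prob(row):
        z = w[0] + sum(wj * row[c] for wj, c in zip(w[1:], cols))
        z = max(-30.0, min(30.0, z))  # clamp so exp() cannot overflow
        return 1.0 / (1.0 + math.exp(-z))

    for _ in range(epochs):
        grad = [0.0] * len(w)
        for row, target in zip(X, y):
            err = prob(row) - target
            grad[0] += err
            for j, c in enumerate(cols):
                grad[j + 1] += err * row[c]
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad)]
    hits = sum((prob(row) >= 0.5) == (target == 1) for row, target in zip(X, y))
    return hits / len(X)


def tabu_search(m, p, score, iters=50, tenure=3, patience=5, seed=0):
    """Look for a size-p subset of range(m) that maximises score(subset)."""
    rng = random.Random(seed)
    cache = {}

    def f(subset):
        key = frozenset(subset)
        if key not in cache:
            cache[key] = score(key)
        return cache[key]

    current = set(rng.sample(range(m), p))
    best, best_val = set(current), f(current)
    tabu_until = [0] * m  # a removed variable may not re-enter before this iteration
    freq = [0] * m        # how often each variable has been part of the solution
    stall = 0
    for it in range(1, iters + 1):
        # Basic phase: try every swap of one selected for one unselected variable.
        move, move_val = None, float("-inf")
        for out in current:
            for inn in set(range(m)) - current:
                val = f((current - {out}) | {inn})
                # Aspiration: a tabu move is allowed if it beats the best so far.
                allowed = tabu_until[inn] < it or val > best_val
                if allowed and val > move_val:
                    move, move_val = (out, inn), val
        if move is not None:
            out, inn = move
            current = (current - {out}) | {inn}
            tabu_until[out] = it + tenure
        for v in current:
            freq[v] += 1
        if move is not None and move_val > best_val:
            best, best_val, stall = set(current), move_val, 0
        else:
            stall += 1
        if stall >= patience:
            # Diversification: restart from the least frequently used variables.
            order = sorted(range(m), key=lambda v: (freq[v], rng.random()))
            current = set(order[:p])
            if f(current) > best_val:
                best, best_val = set(current), f(current)
            stall = 0
    return best, best_val
```

With a data matrix `X` (a list of rows) and 0/1 labels `y`, a call such as `tabu_search(m, p, lambda s: hit_rate(X, y, s))` returns the best subset found and its hit rate. Passing the objective as a callable keeps the search independent of how the logistic model is fitted, so a cross-validated hit rate could be swapped in without changing the search.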
Pages: 506-511
Number of pages: 6