Logic classification and feature selection for biomedical data

Cited by: 24
Authors
Bertolazzi, P. [1]
Felici, G. [1]
Festa, P. [2]
Lancia, G. [3]
Affiliations
[1] CNR, Ist Anal Sistemi & Informat Antonio Ruberti, I-00185 Rome, Italy
[2] Univ Naples Federico II, Dipartimento Matemat & Applicaz RM Caccioppoli, Naples, Italy
[3] Univ Udine, Dipartimento Matemat & Informat, I-33100 Udine, Italy
Keywords
logic data mining; combinatorial feature selection; set covering
DOI
10.1016/j.camwa.2006.12.093
Chinese Library Classification number
O29 [Applied Mathematics]
Subject classification code
070104
Abstract
In this paper we investigate logic classification and related feature selection algorithms for large biomedical data sets. When the data is in binary/logic form, the feature selection problem can be formulated as a Set Covering problem of very large dimensions, whose solution is computationally challenging. We propose an alternative approximate formulation for feature selection that results in an extension of Set Covering of compact size, and use the logic classifier Lsquare to test its performance on two well-known data sets. An ad hoc metaheuristic of the GRASP type is used to solve the feature selection problem efficiently. A simple and effective method to convert rational data into logic data by interval mapping is also described. The computational results obtained are promising, and the use of logic models, which can be easily understood and integrated with other domain knowledge, is one of the major strengths of this approach. (c) 2007 Elsevier Ltd. All rights reserved.
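As an illustration of the Set Covering view of feature selection mentioned in the abstract, the sketch below treats every pair of samples from opposite classes as an element to be covered, and says a binary feature covers a pair when the two samples differ on it. It then runs a generic GRASP-type loop (greedy randomized construction followed by a redundancy-removing local search). This is a minimal sketch under stated assumptions, not the paper's compact formulation or its ad hoc metaheuristic: the function names `build_cover_sets` and `grasp_feature_selection`, the parameter `alpha`, and the toy data are all hypothetical.

```python
import random
from itertools import product


def build_cover_sets(X, y):
    """Return all opposite-class sample pairs and, per feature, the pairs it separates."""
    pos = [i for i, label in enumerate(y) if label == 1]
    neg = [i for i, label in enumerate(y) if label == 0]
    pairs = list(product(pos, neg))
    n_features = len(X[0])
    covers = [
        {(i, j) for (i, j) in pairs if X[i][f] != X[j][f]}
        for f in range(n_features)
    ]
    return pairs, covers


def grasp_feature_selection(X, y, iterations=50, alpha=0.3, seed=0):
    """Generic GRASP for the plain set-covering model (illustrative assumption)."""
    rng = random.Random(seed)
    pairs, covers = build_cover_sets(X, y)
    all_pairs = set(pairs)
    if set().union(*covers) != all_pairs:
        raise ValueError("two samples of opposite class agree on every feature")

    best = None
    for _ in range(iterations):
        # Construction phase: pick features at random from a restricted
        # candidate list (RCL) of those with near-maximal coverage gain.
        selected, uncovered = [], set(all_pairs)
        while uncovered:
            gains = {
                f: len(covers[f] & uncovered)
                for f in range(len(covers))
                if f not in selected
            }
            g_max, g_min = max(gains.values()), min(gains.values())
            threshold = g_max - alpha * (g_max - g_min)
            rcl = [f for f, g in gains.items() if g > 0 and g >= threshold]
            chosen = rng.choice(rcl)
            selected.append(chosen)
            uncovered -= covers[chosen]

        # Local search: drop any feature whose pairs are covered by the rest.
        for f in list(selected):
            rest = [g for g in selected if g != f]
            if set().union(*(covers[g] for g in rest)) >= all_pairs:
                selected = rest

        if best is None or len(selected) < len(best):
            best = sorted(selected)
    return best


# Toy binary data: features 0 and 3 each separate the two classes on their own.
X_demo = [
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 1, 0, 1],
]
y_demo = [1, 1, 0, 0]
selected = grasp_feature_selection(X_demo, y_demo)
print(selected)
```

The number of pair constraints grows with the product of the two class sizes, which is why the abstract describes the plain formulation as being of very large dimensions; the sketch enumerates all pairs explicitly and is only practical for small examples.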
Pages: 889-899
Number of pages: 11