Logic classification and feature selection for biomedical data

Cited by: 24
Authors
Bertolazzi, P. [1]
Felici, G. [1]
Festa, P. [2]
Lancia, G. [3]
Affiliations
[1] CNR, Ist Anal Sistemi & Informat Antonio Ruberti, I-00185 Rome, Italy
[2] Univ Naples Federico II, Dipartimento Matemat & Applicaz RM Caccioppoli, Naples, Italy
[3] Univ Udine, Dipartimento Matemat & Informat, I-33100 Udine, Italy
Keywords
logic data mining; combinatorial feature selection; set covering
DOI
10.1016/j.camwa.2006.12.093
Chinese Library Classification number
O29 [Applied Mathematics]
Discipline classification code
070104
Abstract
In this paper we investigate logic classification and related feature selection algorithms for large biomedical data sets. When the data are in binary/logic form, the feature selection problem can be formulated as a Set Covering problem of very large dimensions, whose solution is computationally challenging. We propose an alternative approximated formulation for feature selection that results in an extension of Set Covering of compact size, and use the logic classifier Lsquare to test its performance on two well-known data sets. An ad hoc metaheuristic of the GRASP type is used to solve the feature selection problem efficiently. A simple and effective method to convert rational data into logic data by interval mapping is also described. The computational results obtained are promising, and the use of logic models, which can be easily understood and integrated with other domain knowledge, is one of the major strengths of this approach. (c) 2007 Elsevier Ltd. All rights reserved.
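The Set Covering view of feature selection mentioned in the abstract can be illustrated with a small sketch. It is not the authors' compact formulation or their ad hoc GRASP: it assumes the plain formulation (one covering row per pair of samples from different classes, holding the features on which the two samples differ) and a generic GRASP loop (greedy randomized construction followed by removal of redundant features). The function names `build_cover_rows` and `grasp_feature_cover`, and the parameters `iterations`, `alpha`, and `seed`, are hypothetical.

```python
import random
from itertools import combinations


def build_cover_rows(samples, labels):
    """Return one row per pair of samples with different labels.

    Each row is the set of feature indices on which the two binary
    samples differ; a feature subset is valid if it hits every row.
    """
    rows = []
    for i, j in combinations(range(len(samples)), 2):
        if labels[i] != labels[j]:
            diff = frozenset(
                k for k, (a, b) in enumerate(zip(samples[i], samples[j])) if a != b
            )
            rows.append(diff)
    return rows


def grasp_feature_cover(rows, n_features, iterations=50, alpha=0.3, seed=0):
    """Generic GRASP sketch for minimum-cardinality set covering.

    Assumption: unit feature costs and a simple redundancy-removal
    local search; the paper's own metaheuristic may differ.
    """
    if any(not row for row in rows):
        raise ValueError("two samples from different classes are identical")
    rng = random.Random(seed)
    best = None
    for _ in range(iterations):
        chosen, uncovered = set(), list(rows)
        # Construction: pick at random among the near-best features.
        while uncovered:
            gain = {
                f: sum(f in row for row in uncovered)
                for f in range(n_features)
                if f not in chosen
            }
            top = max(gain.values())
            candidates = [f for f, g in gain.items() if g > 0 and g >= (1 - alpha) * top]
            pick = rng.choice(candidates)
            chosen.add(pick)
            uncovered = [row for row in uncovered if pick not in row]
        # Local search: drop any feature whose removal keeps all rows covered.
        for f in sorted(chosen):
            if all(row & (chosen - {f}) for row in rows):
                chosen.discard(f)
        if best is None or len(chosen) < len(best):
            best = set(chosen)
    return best
```

The number of rows grows with the product of the class sizes, which is the "very large dimensions" issue the abstract raises and the reason a compact reformulation is attractive.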
Pages: 889-899
Page count: 11