Identifying the most significant genes from gene expression profiles for sample classification

Cited by: 3
Authors
Al-Mubaid, Hisham [1]
Ghaffari, Noushin [1]
Affiliation
[1] Univ Houston Clear Lake, Dept Comp Sci, Houston, TX 77058 USA
Source
2006 IEEE INTERNATIONAL CONFERENCE ON GRANULAR COMPUTING | 2006
Keywords
bioinformatics; gene selection; gene classification;
DOI
10.1109/GRC.2006.1635887
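Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 [Pattern recognition and intelligent systems]; 0812 [Computer science and technology]; 0835 [Software engineering]; 1405 [Intelligent science and technology];
Abstract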
Microarray technology measures the expression of thousands of genes simultaneously, producing large volumes of biomedical data in the form of gene expression profiles. These data capture complex variations in the expression levels of thousands of genes across the classes of samples. The gene-level variations allow samples to be classified and clustered using only a small subset of genes. In this work, we aim to identify the most significant genes, that is, those with the highest capability to discriminate between the classes of samples. We present a new gene selection technique for extracting the most significant genes from the large gene/feature space of a given gene expression dataset. Our method computes the discriminating capability of each gene and classifies the data using only the genes with the highest discriminating capabilities. For comparison, we also adapted five feature selection techniques from text categorization and information retrieval to the gene selection task. We evaluated the method on four well-known gene expression datasets. The experimental results showed that our method achieves competitive classification performance with few selected genes compared with the existing techniques.
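The pipeline outlined in the abstract (score each gene, keep the top-scoring few, classify with only those) can be sketched roughly as follows. The abstract does not define the discriminating-capability measure, so this sketch substitutes a simple two-class signal-to-noise style score and a nearest-centroid classifier; both are stand-in assumptions, not the authors' method. The function names `discriminating_scores`, `select_top_genes`, and `nearest_centroid_predict` are hypothetical.

```python
import numpy as np


def discriminating_scores(X, y):
    """Score every gene for a two-class problem.

    ASSUMPTION: uses |mean_0 - mean_1| / (std_0 + std_1) as a stand-in score;
    the record does not specify the measure used in the paper.
    X has shape (n_samples, n_genes); y holds labels 0 and 1.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    a, b = X[y == 0], X[y == 1]
    eps = 1e-12  # avoids division by zero for constant genes
    return np.abs(a.mean(axis=0) - b.mean(axis=0)) / (a.std(axis=0) + b.std(axis=0) + eps)


def select_top_genes(X, y, k):
    """Return the column indices of the k highest-scoring genes."""
    scores = discriminating_scores(X, y)
    return np.argsort(scores)[::-1][:k]


def nearest_centroid_predict(X_train, y_train, X_test, genes):
    """Classify test samples using only the selected genes.

    ASSUMPTION: nearest class centroid (squared Euclidean distance) is used
    here purely for illustration.
    """
    X_train = np.asarray(X_train, dtype=float)[:, genes]
    X_test = np.asarray(X_test, dtype=float)[:, genes]
    y_train = np.asarray(y_train)
    classes = np.unique(y_train)
    centroids = np.stack([X_train[y_train == c].mean(axis=0) for c in classes])
    # Pairwise squared distances, shape (n_test, n_classes)
    dists = ((X_test[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return classes[dists.argmin(axis=1)]
```

On a real expression matrix, one would call `select_top_genes` on the training split only and then pass the returned indices to `nearest_centroid_predict`, so that gene selection does not see the test samples.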
Pages: 655 / +
Page count: 2