New gene selection method for classification of cancer subtypes considering within-class variation

Cited by: 55
Authors
Cho, JH
Lee, D
Park, JY
Lee, IB
Affiliations
[1] Pohang Univ Sci & Technol, Dept Chem Engn, Pohang 790784, South Korea
[2] P&I Consulting Co Ltd, Pohang 790784, South Korea
Keywords
gene expression data; gene selection; classification; centroid; within-class variation; Kernel Fisher's discriminant analysis
DOI
10.1016/S0014-5793(03)00819-6
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
In this work we propose a new method for finding gene subsets in microarray data that effectively discriminate between disease subtypes. To address large within-class variation, a well-known problem in gene selection, we developed a new criterion that measures the relevance of individual genes using the mean and standard deviation of the distances from each sample to the class centroid. The approach is applicable not only to binary classification but also to multiclass problems. We demonstrated the performance of the method by applying it to two publicly available microarray datasets: leukemia (two classes) and small round blue cell tumors (four classes). The proposed method selects a very small number of genes compared with previous methods, without loss of discriminating power, and can thus facilitate further biological and clinical research. © 2003 Federation of European Biochemical Societies. Published by Elsevier B.V. All rights reserved.
Pages: 3-7
Page count: 5
Related papers
20 in total
[1]   Involvement of Wiskott-Aldrich syndrome protein in B-Cell cytoplasmic tyrosine kinase pathway [J].
Baba, Y ;
Nonoyama, S ;
Matsushita, M ;
Yamadori, T ;
Hashimoto, S ;
Imai, K ;
Arai, S ;
Kunikata, T ;
Kurimoto, M ;
Kurosaki, T ;
Ochs, HD ;
Yata, J ;
Kishimoto, T ;
Tsukada, S .
BLOOD, 1999, 93 (06) :2003-2012
[2]   Tissue classification with gene expression profiles [J].
Ben-Dor, A ;
Bruhn, L ;
Friedman, N ;
Nachman, I ;
Schummer, M ;
Yakhini, Z .
JOURNAL OF COMPUTATIONAL BIOLOGY, 2000, 7 (3-4) :559-583
[3]   Pattern identification and classification in gene expression data using an autoassociative neural network model [J].
Bicciato, S ;
Pandin, M ;
Didonè, G ;
Di Bello, C .
BIOTECHNOLOGY AND BIOENGINEERING, 2003, 81 (05) :594-606
[4]   Optimal approach for classification of acute leukemia subtypes based on gene expression data [J].
Cho, JH ;
Lee, D ;
Park, JH ;
Kim, K ;
Lee, IB .
BIOTECHNOLOGY PROGRESS, 2002, 18 (04) :847-854
[5]  
Cory GOC, 1996, J IMMUNOL, V157, P3791
[6]   Non-Hodgkin lymphoma in common variable immunodeficiency [J].
Cunningham-Rundles, C ;
Lieberman, P ;
Hellman, G ;
Chaganti, RSK .
AMERICAN JOURNAL OF HEMATOLOGY, 1991, 37 (02) :69-74
[7]  
Dudoit S, 2002, STAT SINICA, V12, P111
[8]   Molecular classification of cancer: Class discovery and class prediction by gene expression monitoring [J].
Golub, TR ;
Slonim, DK ;
Tamayo, P ;
Huard, C ;
Gaasenbeek, M ;
Mesirov, JP ;
Coller, H ;
Loh, ML ;
Downing, JR ;
Caligiuri, MA ;
Bloomfield, CD ;
Lander, ES .
SCIENCE, 1999, 286 (5439) :531-537
[9]   Gene selection for cancer classification using support vector machines [J].
Guyon, I ;
Weston, J ;
Barnhill, S ;
Vapnik, V .
MACHINE LEARNING, 2002, 46 (1-3) :389-422
[10]  
Hart, 2006, PATTERN CLASSIFICATI