Gene selection and classification from microarray data using kernel machine

Cited by: 41
Authors
Cho, JH
Lee, D
Park, JH
Lee, IB
Affiliations
[1] Pohang Univ Sci & Technol, Dept Chem Engn, Pohang 790784, South Korea
[2] LG Chem Ltd, Chem & Polymer R&D, Yeosu 555280, South Korea
[3] P&I Consulting Co Ltd, Bioinformat Lab, Pohang 790784, South Korea
Keywords
gene expression data; gene selection; classification; kernel Fisher discriminant analysis
DOI
10.1016/j.febslet.2004.05.087
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
The discrimination of cancer patients (including subtypes) based on gene expression data is a critical problem with clinical ramifications. Central to solving this problem is the issue of how to extract the most relevant genes from the several thousand genes on a typical microarray. Here, we propose a methodology that can effectively select an informative subset of genes and classify the subtypes (or patients) of disease using the selected genes. We employ a kernel machine, kernel Fisher discriminant analysis (KFDA), for discrimination and use the derivatives of the kernel function to perform gene selection. Using a modified form of KFDA in the minimum squared error (MSE) sense and the gradients of the kernel functions, we construct an effective gene selection criterion. We assess the performance of the proposed methodology by applying it to three gene expression datasets: leukemia dataset, breast cancer dataset and colon cancer dataset. Using a few informative genes, the proposed method accurately and reliably classified cancer subtypes (or patients). Also, through a comparison study, we verify the reliability of the gene selection and discrimination results. © 2004 Published by Elsevier B.V. on behalf of the Federation of European Biochemical Societies.
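The idea summarized in the abstract, a kernel discriminant fitted in the minimum squared error sense with genes ranked through gradients of the kernel function, can be illustrated with a minimal sketch. This is not the authors' implementation: the Gaussian (RBF) kernel, the ridge-regularized least-squares system, the scoring rule (mean squared partial derivative of the discriminant over the training samples), the parameter values, and all function names are assumptions made for illustration, and the data are synthetic.

```python
import numpy as np


def rbf_kernel(A, B, gamma):
    """Gaussian (RBF) kernel matrix: K[i, j] = exp(-gamma * ||A[i] - B[j]||^2)."""
    sq = (A ** 2).sum(axis=1)[:, None] + (B ** 2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))


def fit_mse_discriminant(X, y, gamma, ridge):
    """Least-squares fit of f(x) = sum_i alpha_i k(x_i, x) + b to labels in {-1, +1}.

    Assumption: a ridge-regularized linear system with a bias term stands in
    for the "KFDA in the MSE sense" named in the abstract.
    """
    n = X.shape[0]
    K = rbf_kernel(X, X, gamma)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0                     # constraint row: sum_i alpha_i = 0
    A[1:, 0] = 1.0                     # bias column
    A[1:, 1:] = K + ridge * np.eye(n)  # regularized kernel block
    sol = np.linalg.solve(A, np.concatenate(([0.0], y)))
    return sol[1:], sol[0]             # alpha, b


def decision_values(X_train, alpha, b, gamma, X_new):
    """Evaluate the fitted discriminant f at the rows of X_new."""
    return rbf_kernel(X_new, X_train, gamma) @ alpha + b


def gene_scores(X, alpha, gamma):
    """Hypothetical gene relevance score: mean squared df/dx_g over training samples.

    For the RBF kernel, df/dx_g at x_j equals
    sum_i alpha_i * (-2 * gamma) * (x_j[g] - x_i[g]) * k(x_i, x_j).
    """
    K = rbf_kernel(X, X, gamma)
    W = alpha[:, None] * K             # W[i, j] = alpha_i * k(x_i, x_j)
    grad = -2.0 * gamma * (X * W.sum(axis=0)[:, None] - W.T @ X)
    return (grad ** 2).mean(axis=0)


def demo(seed=0):
    """Synthetic check: 60 samples, 30 'genes', only genes 0 and 1 carry class signal."""
    rng = np.random.default_rng(seed)
    n, d = 60, 30
    y = np.repeat([-1.0, 1.0], n // 2)
    X = rng.normal(size=(n, d))
    X[:, :2] += 2.0 * y[:, None]       # shift the two informative genes by class
    gamma, ridge = 0.01, 1.0           # illustrative values, not tuned
    alpha, b = fit_mse_discriminant(X, y, gamma, ridge)
    scores = gene_scores(X, alpha, gamma)
    top = np.argsort(scores)[::-1][:2]
    acc = np.mean(np.sign(decision_values(X, alpha, b, gamma, X)) == y)
    return scores, top, acc


if __name__ == "__main__":
    scores, top, acc = demo()
    print("top-ranked genes:", top, "training accuracy:", acc)
```

In practice the kernel width, the regularization strength, and the number of retained genes would need to be chosen by cross-validation, with gene selection repeated inside each fold to avoid selection bias.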
Pages: 93-98
Number of pages: 6
Related papers
36 in total
[1]   Broad patterns of gene expression revealed by clustering analysis of tumor and normal colon tissues probed by oligonucleotide arrays [J].
Alon, U ;
Barkai, N ;
Notterman, DA ;
Gish, K ;
Ybarra, S ;
Mack, D ;
Levine, AJ .
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 1999, 96 (12) :6745-6750
[2]   Selection bias in gene extraction on the basis of microarray gene-expression data [J].
Ambroise, C ;
McLachlan, GJ .
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2002, 99 (10) :6562-6566
[3]  
ARTHUR DC, 1983, BLOOD, V61, P994
[4]   Tissue classification with gene expression profiles [J].
Ben-Dor, A ;
Bruhn, L ;
Friedman, N ;
Nachman, I ;
Schummer, M ;
Yakhini, Z .
JOURNAL OF COMPUTATIONAL BIOLOGY, 2000, 7 (3-4) :559-583
[5]   Gene expression data analysis [J].
Brazma, A ;
Vilo, J .
FEBS LETTERS, 2000, 480 (01) :17-24
[6]   Gene expression profiling: monitoring transcription and translation products using DNA microarrays and proteomics [J].
Celis, JE ;
Kruhoffer, M ;
Gromova, I ;
Frederiksen, C ;
Ostergaard, M ;
Thykjaer, T ;
Gromov, P ;
Yu, JS ;
Pálsdóttir, H ;
Magnusson, N ;
Orntoft, TF .
FEBS LETTERS, 2000, 480 (01) :2-16
[7]   Choosing multiple parameters for support vector machines [J].
Chapelle, O ;
Vapnik, V ;
Bousquet, O ;
Mukherjee, S .
MACHINE LEARNING, 2002, 46 (1-3) :131-159
[8]   Optimal approach for classification of acute leukemia subtypes based on gene expression data [J].
Cho, JH ;
Lee, D ;
Park, JH ;
Kim, K ;
Lee, IB .
BIOTECHNOLOGY PROGRESS, 2002, 18 (04) :847-854
[9]   New gene selection method for classification of cancer subtypes considering within-class variation [J].
Cho, JH ;
Lee, D ;
Park, JY ;
Lee, IB .
FEBS LETTERS, 2003, 551 (1-3) :3-7
[10]   Identifying marker genes in transcription profiling data using a mixture of feature relevance experts [J].
Chow, ML ;
Moler, EJ ;
Mian, IS .
PHYSIOLOGICAL GENOMICS, 2001, 5 (02) :99-111