Optimal approach for classification of acute leukemia subtypes based on gene expression data

Cited by: 20
Authors
Cho, JH
Lee, D
Park, JH
Kim, K
Lee, IB
Affiliations
[1] Pohang Univ Sci & Technol, Dept Chem Engn, Pohang 790784, South Korea
[2] P&I Consulting Co Ltd, Pohang 790784, South Korea
[3] Jae Il Hosp, Youngduk 766845, Kyungbook, South Korea
DOI
10.1021/bp025517o
Chinese Library Classification (CLC) numbers
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology]
Discipline classification codes
071005; 0836; 090102; 100705
Abstract
The classification of cancer subtypes, which is critical for successful treatment, has been studied extensively using gene expression profiles from oligonucleotide chips or cDNA microarrays. Various pattern recognition methods have been successfully applied to gene expression data. However, these methods are not optimal; rather, they are high-performance classifiers that emphasize only classification accuracy. In this paper, we propose an approach for constructing the optimal linear classifier from gene expression data. Two linear classification methods, linear discriminant analysis (LDA) and discriminant partial least-squares (DPLS), are applied to distinguish acute leukemia subtypes, and both are shown to give satisfactory accuracy. Moreover, we optimally determine the number of genes participating in the classification (a remarkably small number compared to previous results) on the basis of a statistical significance test. Thus, the proposed method constructs an optimal classifier that combines a small predictor set with high accuracy.
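The general idea of pairing a small, statistically selected gene set with a linear discriminant can be illustrated with a minimal sketch. This is not the authors' procedure: the abstract does not specify the significance test or the LDA formulation, so the Welch t-statistic ranking, the two-gene Fisher discriminant, the synthetic data, and all function names below are assumptions made for illustration. DPLS is not shown.

```python
import math
import random

def mean(xs):
    return sum(xs) / len(xs)

def var(xs):
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

def welch_t(a, b):
    """Welch t-statistic for one gene measured in two classes."""
    return (mean(a) - mean(b)) / math.sqrt(var(a) / len(a) + var(b) / len(b))

def select_genes(X, y, k):
    """Indices of the k genes with the largest |t| (assumed selection rule)."""
    scores = []
    for j in range(len(X[0])):
        a = [row[j] for row, lab in zip(X, y) if lab == 0]
        b = [row[j] for row, lab in zip(X, y) if lab == 1]
        scores.append((abs(welch_t(a, b)), j))
    return sorted(j for _, j in sorted(scores, reverse=True)[:k])

def fit_lda_2d(X, y):
    """Two-feature Fisher LDA: w = Sw^-1 (m1 - m0), threshold at the class midpoint."""
    groups = {c: [row for row, lab in zip(X, y) if lab == c] for c in (0, 1)}
    m = {c: [mean([r[d] for r in rows]) for d in (0, 1)] for c, rows in groups.items()}
    # Pooled within-class scatter matrix (2x2).
    s = [[0.0, 0.0], [0.0, 0.0]]
    for c, rows in groups.items():
        for r in rows:
            d = [r[0] - m[c][0], r[1] - m[c][1]]
            for i in (0, 1):
                for j in (0, 1):
                    s[i][j] += d[i] * d[j]
    det = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    diff = [m[1][0] - m[0][0], m[1][1] - m[0][1]]
    # Closed-form 2x2 inverse applied to the mean difference.
    w = [(s[1][1] * diff[0] - s[0][1] * diff[1]) / det,
         (-s[1][0] * diff[0] + s[0][0] * diff[1]) / det]
    mid = [(m[0][i] + m[1][i]) / 2 for i in (0, 1)]
    b = -(w[0] * mid[0] + w[1] * mid[1])
    return w, b

# Synthetic stand-in for expression data: 50 "genes", two of them informative.
rng = random.Random(0)
n_per_class, n_genes, informative = 20, 50, (3, 17)
X, y = [], []
for label in (0, 1):
    for _ in range(n_per_class):
        row = [rng.gauss(0.0, 1.0) for _ in range(n_genes)]
        for j in informative:
            row[j] += 4.0 * label  # shift informative genes in class 1
        X.append(row)
        y.append(label)

selected = select_genes(X, y, k=2)
X_sel = [[row[j] for j in selected] for row in X]
w, b = fit_lda_2d(X_sel, y)
pred = [1 if w[0] * r[0] + w[1] * r[1] + b > 0 else 0 for r in X_sel]
accuracy = sum(p == t for p, t in zip(pred, y)) / len(y)  # training accuracy only
print(selected, accuracy)
```

The accuracy here is measured on the training samples, so it overstates real performance; an honest evaluation would select genes and fit the discriminant inside a cross-validation loop or on a held-out set.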
Pages: 847-854
Number of pages: 8