On partial least squares dimension reduction for microarray-based classification: a simulation study

Citations: 53
Authors
Nguyen, DV
Rocke, DM
Affiliations
[1] Univ Calif Davis, Dept Epidemiol & Prevent Med, Sch Med, Div Biostat, Davis, CA 95616 USA
[2] Univ Calif Davis, Dept Appl Sci, Davis, CA 95616 USA
Funding
National Institutes of Health (USA); National Science Foundation (USA)
Keywords
DNA microarray; logistic discrimination; partial least squares; principal components analysis
DOI
10.1016/j.csda.2003.08.001
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
In microarray tumor tissue classification studies, the expression levels of thousands of genes (variables) are measured simultaneously across a few tissue samples. Standard statistical methodologies in classification do not work well when the dimension, p, is greater than the sample size, N. One approach to classification problems when p ≫ N is to first apply a dimension reduction method and then perform the classification in the reduced space. In this paper, we study dimension reduction for classification in high dimension based on partial least squares (PLS) and principal components analysis (PCA). In addition, we propose and explore two hybrid-PLS methods for dimension reduction. PLS components are linear combinations of the original predictors, but the weights are nonlinear functions of both the predictors and the response variable. This makes it difficult to study the PLS classification methodologies analytically, so, in this paper, we turn to a numerical study using simulation. © 2003 Elsevier B.V. All rights reserved.
Pages: 407-425
Page count: 19
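
The two-stage idea in the abstract (reduce the dimension first, then classify in the reduced space) can be illustrated with a minimal sketch. It is not the authors' procedure: it extracts only the first PLS component, whose weight vector is proportional to X^T y on centered data, and it uses a midpoint threshold between the class mean scores as a simple stand-in for logistic discrimination. The simulation settings (`N`, `P`, the number of informative genes, the mean shift) and all function names are assumptions chosen for illustration only.

```python
import random

def column_means(X):
    """Mean of each column of a list-of-rows matrix."""
    n = len(X)
    return [sum(row[j] for row in X) / n for j in range(len(X[0]))]

def center(X, means):
    """Subtract the given column means from every row."""
    return [[x - m for x, m in zip(row, means)] for row in X]

def first_pls_weights(Xc, yc):
    """Unit-norm first PLS direction: w proportional to Xc^T yc.

    The weights depend on both the predictors and the response,
    which is the property the abstract points out.
    """
    w = [sum(row[j] * yi for row, yi in zip(Xc, yc)) for j in range(len(Xc[0]))]
    norm = sum(v * v for v in w) ** 0.5
    return [v / norm for v in w]

def project(Xc, w):
    """Scores t = Xc w (one number per sample)."""
    return [sum(x * wj for x, wj in zip(row, w)) for row in Xc]

def simulate(n, p, n_informative, shift, rng):
    """Toy two-class data: the first n_informative genes are shifted in class 1."""
    X, y = [], []
    for i in range(n):
        label = i % 2  # balanced classes
        X.append([rng.gauss(shift * label if j < n_informative else 0.0, 1.0)
                  for j in range(p)])
        y.append(label)
    return X, y

rng = random.Random(0)
N, P = 20, 200  # hypothetical setting with far more genes than samples
X_train, y_train = simulate(N, P, 20, 1.5, rng)
X_test, y_test = simulate(200, P, 20, 1.5, rng)

# Stage 1: dimension reduction to a single PLS component.
means = column_means(X_train)
y_bar = sum(y_train) / N
w = first_pls_weights(center(X_train, means), [yi - y_bar for yi in y_train])
t_train = project(center(X_train, means), w)

# Stage 2: classify in the reduced space (midpoint rule, an assumption).
m0 = sum(t for t, yi in zip(t_train, y_train) if yi == 0) / y_train.count(0)
m1 = sum(t for t, yi in zip(t_train, y_train) if yi == 1) / y_train.count(1)
threshold = (m0 + m1) / 2

# Test samples are centered with the training means before projection.
t_test = project(center(X_test, means), w)
pred = [1 if t > threshold else 0 for t in t_test]
accuracy = sum(yhat == yi for yhat, yi in zip(pred, y_test)) / len(y_test)
print(f"test accuracy with one PLS component: {accuracy:.2f}")
```

Because the weight vector is built from the training response, class 1 always has the larger mean training score, which is why the rule predicts class 1 above the threshold. Further components would require deflating X before recomputing the weights, which this sketch omits.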