AN INTERPRETATION OF PARTIAL LEAST-SQUARES

Cited by: 341
Author
GARTHWAITE, PH
Keywords
BIASED REGRESSION; DATA REDUCTION; PREDICTION; REGRESSOR CONSTRUCTION
DOI
10.2307/2291207
CLC Classification
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics]
Subject Classification Codes
020208; 070103; 0714
Abstract
Univariate partial least squares (PLS) is a method of modeling relationships between a Y variable and other explanatory variables. It may be used with any number of explanatory variables, even far more than the number of observations. A simple interpretation is given that shows the method to be a straightforward and reasonable way of forming prediction equations. Its relationship to multivariate PLS, in which there are two or more Y variables, is examined, and an example is given in which it is compared by simulation with other methods of forming prediction equations. With univariate PLS, linear combinations of the explanatory variables are formed sequentially and related to Y by ordinary least squares regression. It is shown that these linear combinations, here called components, may be viewed as weighted averages of predictors, where each predictor holds the residual information in an explanatory variable that is not contained in earlier components, and the quantity to be predicted is the vector of residuals from regressing Y against earlier components. A similar strategy is shown to underlie multivariate PLS, except that the quantity to be predicted is a weighted average of the residuals from separately regressing each Y variable against earlier components. This clarifies the differences between univariate and multivariate PLS, and it is argued that in most situations, the univariate method is likely to give the better prediction equations. In the example using simulation, univariate PLS is compared with four other methods of forming prediction equations: ordinary least squares, forward variable selection, principal components regression, and a Stein shrinkage method. Results suggest that PLS is a useful method for forming prediction equations when there are a large number of explanatory variables, particularly when the random error variance is large.
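The sequential construction described in the abstract — components formed as weighted combinations of residualized predictors, then related to Y by ordinary least squares — can be sketched as follows. This is a minimal NIPALS-style PLS1 illustration written for this summary, not code from the paper; the function name and deflation details are illustrative.

```python
import numpy as np

def pls1_fit(X, y, n_components):
    """Univariate PLS sketch: each component is a weighted combination of
    the predictors' residual information (what earlier components did not
    capture), weighted by covariance with the residual part of y."""
    X = X - X.mean(axis=0)          # center predictors
    y = y - y.mean()                # center response
    Xr, yr = X.copy(), y.copy()     # residual predictors and residual y
    scores = []
    for _ in range(n_components):
        w = Xr.T @ yr               # covariance weights with residual y
        w /= np.linalg.norm(w)
        t = Xr @ w                  # new component (weighted combination)
        scores.append(t)
        # deflate: remove the information this component explains
        p = Xr.T @ t / (t @ t)
        Xr = Xr - np.outer(t, p)
        yr = yr - t * (yr @ t) / (t @ t)
    T = np.column_stack(scores)
    # relate y to the components by ordinary least squares
    coef, *_ = np.linalg.lstsq(T, y, rcond=None)
    return T @ coef                 # fitted values on the centered scale
```

With as many components as the rank of X this reproduces the ordinary least squares fit; stopping earlier gives the biased, variance-reducing prediction equations the abstract compares against principal components regression and Stein shrinkage.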
Pages: 122-127
Number of pages: 6
References
9 items in total
[1] COPAS, JB, 1983, J R STAT SOC B, V45, P311
[2] GUNST, RF; MASON, RL. Biased estimation in regression: evaluation using mean squared error [J]. Journal of the American Statistical Association, 1977, 72(359): 616-628
[4] HELLAND, IS, 1990, SCAND J STAT, V17, P97
[5] Hoskuldsson, A., 1988, J CHEMOMETR, V2, P211, DOI 10.1002/CEM.1180020306
[6] SJOSTROM, M; WOLD, S; LINDBERG, W; PERSSON, JA; MARTENS, H. A multivariate calibration problem in analytical chemistry solved by partial least-squares models in latent variables [J]. Analytica Chimica Acta, 1983, 150(01): 61-70
[7] STONE, M, 1990, J ROY STAT SOC B MET, V52, P237
[8] WEBSTER, JT; GUNST, RF; MASON, RL. Latent root regression analysis [J]. Technometrics, 1974, 16(04): 513-522
[9] WOLD, S; RUHE, A; WOLD, H; DUNN, WJ. The collinearity problem in linear regression: the partial least-squares (PLS) approach to generalized inverses [J]. SIAM Journal on Scientific and Statistical Computing, 1984, 5(03): 735-743