Robust methods for partial least squares regression

Cited by: 212
Authors
Hubert, M [1]
Vanden Branden, K [1]
Affiliations
[1] Katholieke Univ Leuven, Dept Math, B-3001 Louvain, Belgium
Keywords
partial least squares regression; SIMPLS; principal component analysis; robust regression
DOI
10.1002/cem.822
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Partial least squares regression (PLSR) is a linear regression technique developed to deal with high-dimensional regressors and one or several response variables. In this paper we introduce robustified versions of the SIMPLS algorithm, this being the leading PLSR algorithm because of its speed and efficiency. Because SIMPLS is based on the empirical cross-covariance matrix between the response variables and the regressors and on linear least squares regression, the results are affected by abnormal observations in the data set. Two robust methods, RSIMCD and RSIMPLS, are constructed from a robust covariance matrix for high-dimensional data and robust linear regression. We introduce robust RMSECV and RMSEP values for model calibration and model validation. Diagnostic plots are constructed to visualize and classify the outliers. Several simulation results and the analysis of real data sets show the effectiveness and robustness of the new approaches. Because RSIMPLS is roughly twice as fast as RSIMCD, it stands out as the overall best method. Copyright © 2003 John Wiley & Sons, Ltd.
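The abstract's point that the empirical cross-covariance is affected by abnormal observations can be illustrated with a small sketch. This is not the paper's RSIMCD or RSIMPLS method; it only shows, on simulated data, how a single contaminated observation can redirect the classical cross-covariance vector, which for a univariate response gives the direction of the first SIMPLS weight vector. The function name `first_pls_direction`, the sample sizes, and the outlier values are assumptions made for illustration.

```python
import numpy as np

def first_pls_direction(X, y):
    """Unit vector along the empirical cross-covariance of X and a univariate y.

    For a single response this is the direction of the first classical
    SIMPLS weight vector (hypothetical helper, for illustration only).
    """
    Xc = X - X.mean(axis=0)          # centre the regressors
    yc = y - y.mean()                # centre the response
    s = Xc.T @ yc / (len(y) - 1)     # empirical cross-covariance vector
    return s / np.linalg.norm(s)

# Simulated clean data: y depends on the first regressor only (assumed setup).
rng = np.random.default_rng(0)
n, p = 50, 5
X = rng.normal(size=(n, p))
beta = np.array([1.0, 0.0, 0.0, 0.0, 0.0])
y = X @ beta + 0.1 * rng.normal(size=n)

w_clean = first_pls_direction(X, y)

# Contaminate one observation: extreme in the second regressor and in y.
X_bad, y_bad = X.copy(), y.copy()
X_bad[0] = np.array([0.0, 100.0, 0.0, 0.0, 0.0])
y_bad[0] = 100.0

w_bad = first_pls_direction(X_bad, y_bad)

# Absolute cosine of the angle between the two directions (1 = unchanged).
cos_angle = abs(w_clean @ w_bad)
print("clean direction:       ", np.round(w_clean, 3))
print("contaminated direction:", np.round(w_bad, 3))
print("abs cosine between them:", round(float(cos_angle), 3))
```

With the clean data the direction lies close to the first coordinate axis, whereas the single outlier pulls it almost entirely onto the second axis. A robust covariance estimate, as used in the paper, is meant to prevent this kind of shift.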
Pages: 537-549
Page count: 13