PoLiSh - smoothed partial least-squares regression

Cited by: 16
Authors
Rutledge, DN
Barros, A
Delgadillo, I
Affiliations
[1] Inst Natl Agron Paris Grignon, Chim Analyt Lab, F-75005 Paris, France
[2] Univ Aveiro, Dept Quim, P-3800 Aveiro, Portugal
Keywords
partial least-squares regression; Savitzky-Golay smoothing; dimensionality; Durbin-Watson
DOI
10.1016/S0003-2670(01)01269-7
Chinese Library Classification (CLC) number
O65 [Analytical Chemistry]
Discipline classification code
070302; 081704
Abstract
Partial least-squares (PLS) regression is a very widely used technique in spectroscopy for calibration/prediction purposes. One of the most important steps in the application of PLS regression is determining the correct number of dimensions to use, in order to avoid over-fitting and thereby obtain a robust predictive model. The "structured" nature of spectroscopic signals may be used in several ways as a guide to improve PLS models. The aim of this work is to propose a new technique for the application of PLS regression to signals (FT-IR, NMR, etc.). This technique is based on Savitzky-Golay (SG) smoothing of the loading weight vectors (w) obtained at each iteration step of the NIPALS procedure. This smoothing progressively "displaces" the random or quasi-random variations from earlier (most important) to later (less important) PLS latent variables. The Durbin-Watson (DW) criterion is calculated for each PLS vector (p, w, b) at each iteration step of the smoothed NIPALS procedure in order to measure the evolution of its "noise" content. PoLiSh has been applied to simulated datasets with different noise levels, and it was found that for those with noise levels higher than 10-20%, an improvement in the predictive ability of the models is observed. This technique is also important as a tool to evaluate the true dimensionality of signal matrices for complex PLS models, by comparing the DW profiles of the PoLiSh vectors at different smoothing degrees with those of the unsmoothed PLS models. © 2001 Elsevier Science B.V. All rights reserved.
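The procedure outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a single response variable (PLS1), uses SciPy's `savgol_filter` for the SG step, picks an arbitrary window of 11 points and polynomial order 2, and takes the DW criterion to be the sum of squared successive differences divided by the sum of squares of the vector (close to 0 for a smooth vector, close to 2 for white noise). The function names `durbin_watson` and `polish_pls1` are hypothetical.

```python
import numpy as np
from scipy.signal import savgol_filter


def durbin_watson(v):
    """DW criterion of a vector: ~0 when smooth, ~2 for white noise (assumed form)."""
    v = np.asarray(v, dtype=float)
    return np.sum(np.diff(v) ** 2) / np.sum(v ** 2)


def polish_pls1(X, y, n_components, window_length=11, polyorder=2):
    """PLS1 by NIPALS, with the weight vector w SG-smoothed at every step.

    Returns the final regression vector b and an (n_components, 3) array
    holding the DW values of w, p and the cumulative b at each step.
    """
    X = np.asarray(X, dtype=float) - np.mean(X, axis=0)  # column-centre X
    y = np.asarray(y, dtype=float) - np.mean(y)          # centre y
    W, P, q, dw = [], [], [], []
    b = None
    for _ in range(n_components):
        w = X.T @ y                                      # raw loading weights
        w = savgol_filter(w, window_length, polyorder)   # smoothing step
        w = w / np.linalg.norm(w)
        t = X @ w                                        # scores
        tt = t @ t
        p = X.T @ t / tt                                 # X loadings
        c = y @ t / tt                                   # y loading (scalar)
        X = X - np.outer(t, p)                           # deflate X
        y = y - c * t                                    # deflate y
        W.append(w)
        P.append(p)
        q.append(c)
        Wm, Pm = np.array(W).T, np.array(P).T
        # Regression vector using the components extracted so far
        b = Wm @ np.linalg.solve(Pm.T @ Wm, np.array(q))
        dw.append((durbin_watson(w), durbin_watson(p), durbin_watson(b)))
    return b, np.array(dw)
```

Under these assumptions, running `polish_pls1` with several `window_length` values and comparing the DW columns against an unsmoothed run (for example `window_length=3, polyorder=2`, which leaves `w` unchanged) gives the kind of DW profile comparison the abstract describes for judging dimensionality.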
Pages: 281-296
Number of pages: 16
Related papers
13 in total
[1]   Genetic algorithm applied to the selection of principal components [J].
Barros, AS ;
Rutledge, DN .
CHEMOMETRICS AND INTELLIGENT LABORATORY SYSTEMS, 1998, 40 (01) :65-81
[2]  
DURBIN J, 1950, BIOMETRIKA, V37, P409, DOI 10.1093/biomet/37.3-4.409
[3]  
GELADI P, 1986, ANAL CHIM ACTA, V198, P1
[4]   GENERAL LEAST-SQUARES SMOOTHING AND DIFFERENTIATION BY THE CONVOLUTION (SAVITZKY-GOLAY) METHOD [J].
GORRY, PA .
ANALYTICAL CHEMISTRY, 1990, 62 (06) :570-573
[5]  
MARDIA KV, 1994, MULTIVARIATE ANAL, P482
[6]  
Marshall A.G., 1990, Fourier Transforms in NMR, Optical, and Mass Spectrometry
[7]  
Rutledge DN, 1997, MAGN RESON CHEM, V35, pS13, DOI 10.1002/(SICI)1097-458X(199712)35:13<S13::AID-OMR199>3.0.CO;2-P
[9]   Method for detecting information in signals: application to two-dimensional time domain NMR data [J].
Rutledge, DN ;
Barros, AS .
ANALYST, 1998, 123 (04) :551-559
[10]   Analysis of Time Domain NMR and other signals [J].
Rutledge, DN ;
Barros, AS ;
Vackier, MC ;
Baumberger, S ;
Lapierre, C .
ADVANCES IN MAGNETIC RESONANCE IN FOOD SCIENCE, 1999, (231) :203-216