Use of the bootstrap and permutation methods for a more robust variable importance in the projection metric for partial least squares regression

Cited by: 74
Authors
Afanador, N. L. [1, 3]
Tran, T. N. [2]
Buydens, L. M. C. [3]
Affiliations
[1] Merck Sharp & Dohme Ltd, Ctr Math Sci, West Point, PA USA
[2] Merck Sharp & Dohme Ltd, Ctr Math Sci, Oss, Netherlands
[3] Radboud Univ Nijmegen, Inst Mol & Mat, Nijmegen, Netherlands
Keywords
Partial least squares; Bootstrapping; Permutation; Variable importance; SELECTION METHODS; PLS-REGRESSION; JACKKNIFE;
DOI
10.1016/j.aca.2013.01.004
Chinese Library Classification number
O65 [Analytical Chemistry]
Discipline classification code
070302; 081704
Abstract
Bio-pharmaceutical manufacturing is a multifaceted and complex process in which hundreds of processing variables and raw materials are monitored during the manufacture of a single batch. In these processes, identifying the candidate variables responsible for any changes in process performance can prove to be extremely challenging. Within this context, partial least squares (PLS) has proven to be an important tool in helping determine the root cause for changes in biological performance, such as cellular growth or viral propagation. In spite of the positive impact PLS has had in helping understand bio-pharmaceutical process data, the high variability in the measured response (Y) and predictor variables (X), together with the weak relationship between X and Y, has at times made root cause determination for process changes difficult. Our goal is to demonstrate how bootstrapping, in conjunction with permutation tests, can improve the selection of variables responsible for manufacturing process changes via the variable importance in the projection (PLS-VIP) statistic. Although applied only to the PLS-VIP in this article, the methods are general: they can be used to improve other variable selection methods and to increase confidence in other estimates obtained from a PLS model. © 2013 Elsevier B.V. All rights reserved.
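The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' exact procedure, which this record does not spell out. The sketch assumes a single response (PLS1 fitted with NIPALS), the commonly used VIP definition, simulated data, and a hypothetical decision rule: keep a variable when the lower bootstrap percentile of its VIP exceeds the upper percentile of its permutation null. All function names, the percentile cut-offs, and the data are assumptions made for illustration.

```python
import numpy as np

def pls1_vip(X, y, n_components):
    """VIP scores from a NIPALS PLS1 fit on mean-centered data."""
    X = X - X.mean(axis=0)
    y = y - y.mean()
    p = X.shape[1]
    W = np.zeros((p, n_components))   # unit-norm weight vectors
    ss = np.zeros(n_components)       # y sum of squares explained per component
    for a in range(n_components):
        w = X.T @ y
        w = w / np.linalg.norm(w)
        t = X @ w                     # scores
        tt = t @ t
        q = (y @ t) / tt              # y loading
        X = X - np.outer(t, X.T @ t / tt)   # deflate X
        y = y - q * t                        # deflate y
        W[:, a] = w
        ss[a] = q ** 2 * tt
    return np.sqrt(p * (W ** 2 @ ss) / ss.sum())

def bootstrap_vip(X, y, n_components, n_boot, rng):
    """VIP recomputed on rows resampled with replacement."""
    n = X.shape[0]
    out = np.empty((n_boot, X.shape[1]))
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)
        out[b] = pls1_vip(X[idx], y[idx], n_components)
    return out

def permutation_vip(X, y, n_components, n_perm, rng):
    """VIP under the null of no X-y relationship (y shuffled)."""
    out = np.empty((n_perm, X.shape[1]))
    for b in range(n_perm):
        out[b] = pls1_vip(X, rng.permutation(y), n_components)
    return out

# Simulated data (assumption): only variable 0 drives the response.
rng = np.random.default_rng(0)
X = rng.normal(size=(80, 8))
y = 3.0 * X[:, 0] + rng.normal(scale=0.5, size=80)

vip = pls1_vip(X, y, n_components=2)
boot = bootstrap_vip(X, y, 2, n_boot=200, rng=rng)
null = permutation_vip(X, y, 2, n_perm=200, rng=rng)

# Hypothetical rule: bootstrap 2.5th percentile must beat the null 95th percentile.
lower = np.percentile(boot, 2.5, axis=0)
threshold = np.percentile(null, 95, axis=0)
selected = [j for j in range(X.shape[1]) if lower[j] > threshold[j]]
print("VIP:", np.round(vip, 2))
print("selected variables:", selected)
```

Because each weight vector has unit norm, the squared VIP scores sum to the number of variables, which is why a fixed cut-off such as VIP > 1 is often used. The sketch replaces that fixed cut-off with a data-driven comparison between the bootstrap distribution and the permutation null.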
Pages: 49-56
Number of pages: 8