How to avoid over-fitting in multivariate calibration: The conventional validation approach and an alternative

Cited by: 191
Authors
Faber, N. M.
Rajko, R.
Affiliations
[1] Chemometry Consultancy, NL-6717 VD Ede, Netherlands
[2] Univ Szeged, Szeged Coll Food Engn, Dept Unit Operat & Food Engn, H-6701 Szeged, Hungary
Funding
Hungarian Scientific Research Fund;
Keywords
multivariate calibration; PLS; component selection; cross-validation; test set validation; randomization test; near-infrared spectroscopy;
DOI
10.1016/j.aca.2007.05.030
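Chinese Library Classification number
O65 [Analytical Chemistry];
Discipline classification code
070302 [Analytical Chemistry]; 081704 [Applied Chemistry];
Abstract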
This paper critically reviews the problem of over-fitting in multivariate calibration and the conventional validation-based approach to avoid it. It proposes a randomization test that enables one to assess the statistical significance of each component that enters the model. This alternative is compared with cross-validation and independent test set validation for the calibration of a near-infrared spectral data set using partial least squares (PLS) regression. The results indicate that the alternative approach is more objective, since, unlike the validation-based approach, it does not require the use of 'soft' decision rules. The alternative approach therefore appears to be a useful addition to the chemometrician's toolbox. (c) 2007 Elsevier B.V. All rights reserved.
Pages: 98-106
Page count: 9