Data augmentation: an alternative approach to the analysis of spectroscopic data

Cited by: 38
Authors
Conlin, AK [1]
Martin, EB [1]
Morris, AJ [1]
Affiliation
[1] Univ Newcastle Upon Tyne, Ctr Proc Anal Chemometr & Control, Newcastle Upon Tyne NE1 7RU, Tyne & Wear, England
Keywords
data augmentation; partial least squares; Gaussian noise
DOI
10.1016/S0169-7439(98)00071-9
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
The need for inferential models capable of accurately predicting product qualities has never been greater than it is today in the chemical process industry. However, due to production limitations and the need to reduce costs, obtaining sufficient relevant data to enable accurate and robust calibration models to be derived is a major challenge. This is due to the intrinsic sparsity of the process data resulting from the small number of objects available, e.g., batches, compared with the large number of process variables (wavelengths) measured. This paper examines a method of applying Partial Least Squares (PLS) to a database, which has been enhanced through the addition of Gaussian noise to the original data, for the development of a robust calibration model. The addition of Gaussian noise to the process variables alone has been shown to lead to a decrease in the error of the predictor as a consequence of the increase in the data density. (C) 1998 Elsevier Science B.V. All rights reserved.
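The augmentation step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' procedure: the function name `augment_with_gaussian_noise`, the number of noisy copies (`n_copies`), the noise standard deviation (`noise_sd`), and the toy data are all assumptions made for the example. Noise is added to the process variables (X) only, and each noisy replicate keeps the quality value (y) of the sample it was derived from. The enlarged data set would then be passed to a PLS routine, which is not shown here.

```python
import random


def augment_with_gaussian_noise(X, y, n_copies=5, noise_sd=0.01, seed=0):
    """Return the original samples followed by noisy replicates.

    X: list of spectra, each a list of floats (one value per wavelength).
    y: list of quality values, one per spectrum.
    Noise is added to X only; every replicate keeps its parent's y.
    n_copies and noise_sd are illustrative assumptions, not values
    taken from the paper.
    """
    rng = random.Random(seed)
    X_aug = [list(row) for row in X]  # originals come first, unchanged
    y_aug = list(y)
    for row, target in zip(X, y):
        for _ in range(n_copies):
            # Independent zero-mean Gaussian noise on every wavelength
            X_aug.append([v + rng.gauss(0.0, noise_sd) for v in row])
            y_aug.append(target)
    return X_aug, y_aug


# Toy data (made up): 3 batches measured at 4 wavelengths
X = [[0.10, 0.20, 0.30, 0.40],
     [0.15, 0.25, 0.35, 0.45],
     [0.05, 0.15, 0.25, 0.35]]
y = [1.0, 1.5, 0.5]

X_aug, y_aug = augment_with_gaussian_noise(X, y, n_copies=5, noise_sd=0.01)
print(len(X_aug), len(y_aug))
```

In practice the noise level would need to be chosen relative to the scale of the measured variables, for example by cross-validation; the fixed value above is only a placeholder.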
Pages: 161-173
Number of pages: 13