Generalisation performance of artificial neural networks for near infrared spectral analysis

Cited by: 15
Authors
Wang, W. [1]
Paliwal, J. [1]
Affiliations
[1] Univ Manitoba, Dept Biosyst Engn, Winnipeg, MB R3T 5V6, Canada
DOI
10.1016/j.biosystemseng.2006.02.001
Chinese Library Classification number
S2 [Agricultural Engineering]
Discipline classification code
0828
Abstract
Generalisation performance of artificial neural networks (ANNs) is very important when a trained network analyses unseen data. It is associated with factors such as the representativeness and number of training samples, model structure and complexity, training procedures, and appropriate data representation. It is, however, difficult to set common benchmarks to assess the generalisation capabilities of different types of networks. This paper discusses methods to improve the generalisation of multi-layer perceptron (MLP) ANNs and provides experimental evidence using near-infrared spectral data. Near-infrared spectra of wheat kernels infested with rice weevil (Sitophilus oryzae) at 11 infestation levels were collected, and MLP networks were trained to quantitatively determine the insect infestation levels. Spectral data were pre-processed with principal component analysis (PCA) to reduce the input dimensionality, and outliers were successfully detected using Hotelling's T² and Q statistics. Optimal network complexity was selected by evaluating the generalisation performance of the neural networks using Schwarz's Bayesian criterion, Akaike's information criterion, and the root mean squared error of cross-validation (RMSECV) derived by 10-fold cross-validation. Model order assessed by RMSECV provided the most economical network complexity. Stacked regression and network committees were shown to overcome the drawbacks of the winner-takes-all strategy and gave prediction performance on the test set with a lowest root mean squared error of prediction (RMSEP) of 3.5% and a coefficient of determination r² ≥ 0.9. Prediction performance on average spectra had a lowest RMSEP of 1.2% with a higher r² ≥ 0.97, and prediction performance for low infestation levels (≤ 10%) had a lowest RMSEP of 0.4%. © 2006 IAgrE. All rights reserved. Published by Elsevier Ltd.
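The outlier screening mentioned in the abstract (PCA followed by Hotelling's T² and Q statistics) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `pca_t2_q`, the synthetic "spectra", the choice of three components, and the injected outlier are all assumptions made for illustration, and no control limits are computed because the record does not state which thresholds were used.

```python
import numpy as np

def pca_t2_q(X, n_components):
    """Hypothetical helper: per-sample Hotelling's T^2 and Q for a PCA model of X."""
    Xc = X - X.mean(axis=0)                        # mean-centre each variable
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    P = Vt[:n_components].T                        # loadings (variables x components)
    T = Xc @ P                                     # scores (samples x components)
    score_var = s[:n_components] ** 2 / (X.shape[0] - 1)  # variance of each score column
    t2 = np.sum(T ** 2 / score_var, axis=1)        # distance inside the model plane
    residual = Xc - T @ P.T                        # part of each sample the model misses
    q = np.sum(residual ** 2, axis=1)              # squared residual norm (Q statistic)
    return t2, q

# Synthetic stand-in for spectra (assumption): 60 samples x 50 wavelengths
# generated from 3 latent factors plus a small amount of noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(60, 3))
loadings = rng.normal(size=(3, 50))
X = latent @ loadings + 0.01 * rng.normal(size=(60, 50))
X[0] += 3.0 * rng.normal(size=50)                  # make sample 0 an artificial outlier

t2, q = pca_t2_q(X, n_components=3)
print("sample with largest Q:", int(np.argmax(q)))
```

In this sketch T² measures how far a sample lies from the centre within the retained components, while Q measures how much of the sample the retained components fail to reconstruct; a sample is usually flagged when either value is unusually large relative to the rest of the set.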
Pages: 7-18
Page count: 12