Classification of GC-MS measurements of wines by combining data dimension reduction and variable selection techniques

Cited by: 55
Authors
Ballabio, Davide [1]
Skov, Thomas [2]
Leardi, Riccardo [3]
Bro, Rasmus [2]
Affiliations
[1] Univ Milano Bicocca, Milano Chemometr & QSAR Res Grp, Dept Environm Sci, I-20126 Milan, Italy
[2] Univ Copenhagen, Dept Food Sci, Fac Life Sci, DK-1958 Frederiksberg C, Denmark
[3] Univ Genoa, Dept Pharmaceut & Food Chem & Technol, I-16147 Genoa, Italy
Keywords
classification; variable selection; data reduction; wine; GC-MS
DOI
10.1002/cem.1173
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Different classification methods (Partial Least Squares Discriminant Analysis, Extended Canonical Variates Analysis and Linear Discriminant Analysis), in combination with variable selection approaches (Forward Selection and Genetic Algorithms), were compared, evaluating their capabilities in the geographical discrimination of wine samples. Sixty-two samples were analysed by means of dynamic headspace gas chromatography mass spectrometry (HS-GC-MS) and the entire chromatographic profile was considered to build the dataset. Since variable selection techniques pose a risk of overfitting when a large number of variables is used, a method for coupling data dimension reduction and variable selection was proposed. This approach compresses windows of the original data by retaining only significant components of local Principal Component Analysis models. The subsequent variable selection is then performed on these locally derived score variables. The results confirmed that the classification models achieved on the reduced data were better than those obtained on the entire chromatographic profile, with the exception of Extended Canonical Variates Analysis, which gave acceptable models in both cases. Copyright (C) 2008 John Wiley & Sons, Ltd.
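
The compression step described in the abstract (local PCA models on windows of the chromatographic profile, keeping only the leading score variables) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function name `windowed_pca_scores`, the fixed window width, the cumulative-variance rule for deciding which components count as "significant", and the synthetic data are all assumptions made here for the example. The abstract does not state the criterion used in the paper.

```python
import numpy as np


def windowed_pca_scores(X, window_size, var_threshold=0.95):
    """Compress consecutive column windows of X with local PCA models.

    Assumption: a component is treated as "significant" while the cumulative
    explained variance of the window is still below var_threshold. The
    original paper may use a different rule.
    Returns the stacked score matrix and, for each score column, the
    (window index, component index) it came from.
    """
    n_samples, n_vars = X.shape
    blocks = []
    origin = []
    for w, start in enumerate(range(0, n_vars, window_size)):
        Xw = X[:, start:start + window_size]
        Xc = Xw - Xw.mean(axis=0)  # mean-centre each variable in the window
        U, s, _ = np.linalg.svd(Xc, full_matrices=False)
        total = np.sum(s ** 2)
        if total == 0.0:
            continue  # constant window: nothing to retain
        explained = np.cumsum(s ** 2) / total
        # smallest number of components reaching the variance threshold
        k = min(int(np.searchsorted(explained, var_threshold)) + 1, len(s))
        blocks.append(U[:, :k] * s[:k])  # PCA scores of the retained components
        origin.extend((w, c) for c in range(k))
    return np.hstack(blocks), origin


# Synthetic stand-in for a data matrix (62 samples x 200 profile points):
# three latent factors plus a small amount of noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(62, 3)) @ rng.normal(size=(3, 200))
X = X + 0.05 * rng.normal(size=(62, 200))

scores, origin = windowed_pca_scores(X, window_size=50)
print(X.shape, "->", scores.shape)
```

A variable selection method such as forward selection or a genetic algorithm would then search over the columns of `scores` rather than over the original profile points, which is what keeps the number of candidate variables small.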
Pages: 457-463
Number of pages: 7