Variable selection in discriminant partial least-squares analysis

Cited by: 99
Authors
Alsberg, BK [1]
Kell, DB [1]
Goodacre, R [1]
Affiliation
[1] Univ Coll Aberystwyth, Inst Biol Sci, Aberystwyth SY23 3DD, Dyfed, Wales
DOI
10.1021/ac980506o
Chinese Library Classification (CLC) number
O65 [Analytical Chemistry]
Discipline classification codes
070302; 081704
Abstract
Variable selection enhances the understanding and interpretability of multivariate classification models. A new chemometric method based on the selection of the most important variables in discriminant partial least-squares (VS-DPLS) analysis is described. The suggested method is a simple extension of DPLS where a small number of elements in the weight vector w is retained for each factor. The optimal number of DPLS factors is determined by cross-validation. The new algorithm is applied to four different high-dimensional spectral data sets with excellent results. Spectral profiles from Fourier transform infrared spectroscopy and pyrolysis mass spectrometry are used. To investigate the uniqueness of the selected variables, an iterative VS-DPLS procedure is performed. At each iteration, the previously selected variables are removed to see if a new VS-DPLS classification model can be constructed using a different set of variables. In this manner, it is possible to determine regions rather than individual variables that are important for a successful classification.
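
The idea of retaining only a few elements of the weight vector w per factor can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how the retained elements are chosen, so the sketch assumes the k largest-magnitude entries are kept and the rest set to zero. The function names (truncate_weights, vs_dpls_fit, vs_dpls_predict, selected_variables), the SVD-based weight computation, and the mean-centring are likewise assumptions made for illustration.

```python
import numpy as np

def truncate_weights(w, k):
    """Keep the k largest-magnitude entries of w (k >= 1), zero the rest, rescale to unit length."""
    w = np.asarray(w, dtype=float)
    keep = np.argsort(np.abs(w))[-k:]
    sparse = np.zeros_like(w)
    sparse[keep] = w[keep]
    norm = np.linalg.norm(sparse)
    return sparse / norm if norm > 0 else sparse

def vs_dpls_fit(X, Y, n_factors, k):
    """Fit a PLS model on dummy-coded classes Y with truncated weight vectors.

    X is (samples, variables); Y is (samples, classes) with one-hot rows.
    Assumption: at most k variables get a nonzero weight in each factor.
    """
    x_mean, y_mean = X.mean(axis=0), Y.mean(axis=0)
    E, F = X - x_mean, Y - y_mean              # centred, deflated in the loop
    W, P, Q = [], [], []
    for _ in range(n_factors):
        # Dominant left singular vector of E'F is the usual PLS2 weight vector.
        u, _, _ = np.linalg.svd(E.T @ F, full_matrices=False)
        w = truncate_weights(u[:, 0], k)
        t = E @ w                              # scores for this factor
        tt = t @ t
        if tt < 1e-12:                         # nothing left to model
            break
        p = E.T @ t / tt                       # X loadings
        q = F.T @ t / tt                       # Y loadings
        E = E - np.outer(t, p)
        F = F - np.outer(t, q)
        W.append(w); P.append(p); Q.append(q)
    if not W:
        raise ValueError("no factor could be extracted")
    W, P, Q = (np.column_stack(m) for m in (W, P, Q))
    # Regression coefficients B = W (P'W)^-1 Q'; rows are zero for unused variables.
    B = W @ np.linalg.solve(P.T @ W, Q.T)
    return B, x_mean, y_mean

def vs_dpls_predict(X, B, x_mean, y_mean):
    """Assign each sample to the class with the largest predicted dummy value."""
    return np.argmax((X - x_mean) @ B + y_mean, axis=1)

def selected_variables(B):
    """Indices of variables that carry a nonzero coefficient for any class."""
    return np.flatnonzero(np.any(B != 0, axis=1))
```

Under these assumptions, the number of factors could be chosen by cross-validating vs_dpls_fit over n_factors, and the iterative check described in the abstract could be approximated by deleting the columns returned by selected_variables and refitting on the remaining variables.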
Pages: 4126-4133
Page count: 8