Selecting significant factors by the noise addition method in principal component analysis

Cited by: 12
Authors
Dable, BK [1]
Booksh, KS [1]
Affiliations
[1] Arizona State Univ, Dept Chem & Biochem, Tempe, AZ 85287 USA
Keywords
rank determination; HPLC; FIA; Monte Carlo simulation; noise addition method
DOI
10.1002/cem.646
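Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline classification code
0812 [Computer Science and Technology]
Abstract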
The noise addition method (NAM) is presented as a tool for determining the number of significant factors in a data set. The NAM is compared to residual standard deviation (RSD), the factor indicator function (IND), chi-squared (χ²) and cross-validation (CV) for establishing the number of significant factors in three types of data. The comparison and validation of the NAM are performed through Monte Carlo simulations with noise distributions of varying standard deviation, HPLC/UV-vis chromatograms of a mixture of aromatic hydrocarbons, and FIA of methyl orange. The NAM correctly identifies the number of significant factors 98% of the time with the simulated data, 99% with the HPLC data sets and 98% with the FIA data. RSD and χ² fail to choose the proper number of factors in all three cases. IND identifies the correct number of factors in the simulated data sets but fails with the HPLC and FIA data sets. Both CV methods fail with the HPLC and FIA data sets. CV also fails for the simulated data sets, while the modified CV chooses the proper number of factors an average of 80% of the time. Copyright © 2001 John Wiley & Sons, Ltd.
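The abstract does not describe how the NAM itself is computed, so no attempt is made to reproduce it here. As an illustration of one of the comparison methods, the following is a minimal sketch of the factor indicator function (IND) as it is commonly defined in the chemometrics literature (Malinowski's real error divided by the squared number of remaining factors), applied to a small Monte Carlo style simulation. The function names, the matrix sizes, the Gaussian factor model and the noise level are all assumptions chosen for illustration; they are not taken from the paper.

```python
import numpy as np


def malinowski_re_ind(X):
    """Return (RE, IND) for n = 1 .. c-1 retained factors.

    Assumed textbook definitions (not taken from the paper):
        RE(n)  = sqrt( sum_{j>n} eig_j / (r * (c - n)) )
        IND(n) = RE(n) / (c - n)**2
    where eig_j are the squared singular values of the r x c matrix X.
    """
    r, c = X.shape
    if r < c:  # the formulas assume at least as many rows as columns
        X = X.T
        r, c = c, r
    eig = np.linalg.svd(X, compute_uv=False) ** 2
    # tail[k] = sum of eig[k:], so tail[n] is the residual after n factors
    tail = np.cumsum(eig[::-1])[::-1]
    n = np.arange(1, c)
    re = np.sqrt(tail[1:] / (r * (c - n)))
    ind = re / (c - n) ** 2
    return re, ind


def estimate_rank_ind(X):
    """Pick the number of factors at which IND reaches its minimum."""
    _, ind = malinowski_re_ind(X)
    return int(np.argmin(ind)) + 1  # +1 because IND starts at n = 1


def simulate_low_rank(rows, cols, rank, noise_sd, seed):
    """Hypothetical test data: random rank-`rank` matrix plus Gaussian noise."""
    rng = np.random.default_rng(seed)
    scores = rng.standard_normal((rows, rank))
    loadings = rng.standard_normal((rank, cols))
    noise = noise_sd * rng.standard_normal((rows, cols))
    return scores @ loadings + noise


if __name__ == "__main__":
    X = simulate_low_rank(rows=50, cols=20, rank=3, noise_sd=0.05, seed=0)
    print("estimated number of factors:", estimate_rank_ind(X))
```

With uniform, uncorrelated noise of this kind the IND minimum tends to fall at the true rank, which is in line with the abstract's report that IND works on simulated data; real HPLC or FIA data rarely satisfy that noise assumption, which may be one reason such indicators can fail there.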
Pages: 591-613
Page count: 23