PARALLEL ANALYSIS - A METHOD FOR DETERMINING SIGNIFICANT PRINCIPAL COMPONENTS

Cited by: 273
Authors
FRANKLIN, SB
GIBSON, DJ
ROBERTSON, PA
POHLMANN, JT
FRALISH, JS
Affiliations
[1] Department of Plant Biology, Southern Illinois University, Carbondale, Illinois
[2] Education Psychology and Special Education Department, Southern Illinois University, Carbondale, Illinois
[3] Department of Forestry, Southern Illinois University, Carbondale, Illinois
Keywords
LITERATURE RESEARCH; OVEREXTRACTION; PRINCIPAL COMPONENTS ANALYSIS; SPURIOUS COMPONENT
DOI
10.2307/3236261
Chinese Library Classification number
Q94 [Botany]
Discipline classification code
071001
Abstract
Numerous ecological studies use Principal Components Analysis (PCA) for exploratory analysis and data reduction. Determination of the number of components to retain is the most crucial problem confronting the researcher when using PCA. An incorrect choice may lead to the underextraction of components, but commonly results in overextraction. Of several methods proposed to determine the significance of principal components, Parallel Analysis (PA) has proven consistently accurate in determining the threshold for significant components, variable loadings, and analytical statistics when decomposing a correlation matrix. In this procedure, eigenvalues from a data set prior to rotation are compared with those from a matrix of random values of the same dimensionality (p variables and n samples). PCA eigenvalues from the data greater than PA eigenvalues from the corresponding random data can be retained. All components with eigenvalues below this threshold value should be considered spurious. We illustrate Parallel Analysis on an environmental data set. We reviewed all articles utilizing PCA or Factor Analysis (FA) from 1987 to 1993 from Ecology, Ecological Monographs, Journal of Vegetation Science and Journal of Ecology. Analyses were first separated into those PCA which decomposed a correlation matrix and those PCA which decomposed a covariance matrix. Parallel Analysis (PA) was applied for each PCA/FA found in the literature. Of 39 analyses (in 22 articles), 29 (74.4%) considered no threshold rule, presumably retaining interpretable components. According to the PA results, 26 (66.7%) overextracted components. This overextraction may have resulted in potentially misleading interpretation of spurious components. It is suggested that the routine use of PA in multivariate ordination will increase confidence in the results and reduce the subjective interpretation of supposedly objective methods.
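The comparison described in the abstract can be sketched in a few lines of Python with NumPy. This is a minimal illustration, not the authors' implementation. The abstract does not specify several details, so the following are assumptions: the function names, the use of standard normal random values, the number of simulated data sets (`n_sims`), and the use of a percentile of the simulated eigenvalues as the threshold. Components are retained in order until the first observed eigenvalue fails to exceed its random-data counterpart.

```python
import numpy as np


def correlation_eigenvalues(data):
    """Eigenvalues of the correlation matrix of `data`
    (n samples x p variables), sorted in descending order."""
    corr = np.corrcoef(data, rowvar=False)
    return np.sort(np.linalg.eigvalsh(corr))[::-1]


def parallel_analysis(data, n_sims=200, percentile=95.0, seed=0):
    """Compare unrotated PCA eigenvalues with those of random data.

    Assumptions (not stated in the abstract): random values are drawn
    from a standard normal distribution, and the threshold for each
    component is a percentile of the simulated eigenvalues.
    """
    data = np.asarray(data, dtype=float)
    n, p = data.shape
    rng = np.random.default_rng(seed)

    observed = correlation_eigenvalues(data)

    # Eigenvalues from random matrices of the same dimensionality (n x p).
    simulated = np.empty((n_sims, p))
    for i in range(n_sims):
        simulated[i] = correlation_eigenvalues(rng.standard_normal((n, p)))
    threshold = np.percentile(simulated, percentile, axis=0)

    # Retain leading components up to the first one that does not
    # exceed its random-data threshold.
    exceeds = observed > threshold
    n_retain = p if exceeds.all() else int(np.argmin(exceeds))
    return observed, threshold, n_retain


# Synthetic example (hypothetical data): 8 variables driven by 2 latent factors.
rng = np.random.default_rng(42)
n, p = 200, 8
latent = rng.standard_normal((n, 2))
loadings = np.zeros((2, p))
loadings[0, :4] = 1.0
loadings[1, 4:] = 1.0
data = latent @ loadings + 0.5 * rng.standard_normal((n, p))

observed, threshold, n_retain = parallel_analysis(data)
print("observed eigenvalues:", np.round(observed, 2))
print("random-data thresholds:", np.round(threshold, 2))
print("components retained:", n_retain)
```

With this two-factor structure, the first two observed eigenvalues should sit well above the random-data thresholds and the rest well below, so two components would be expected to be retained. Setting `percentile=50.0` approximates a comparison against typical random-data eigenvalues, which is a less conservative choice.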
Pages: 99-106
Number of pages: 8