Principal component analysis versus fuzzy principal component analysis - A case study: the quality of Danube water (1985-1996)

Cited by: 141
Authors
Sarbu, C
Pop, HF
Affiliations
[1] Univ Babes Bolyai, Fac Chem & Chem Engn, Dept Analyt Chem, RO-400028 Cluj Napoca, Romania
[2] Univ Babes Bolyai, Dept Informat, RO-400028 Cluj Napoca, Romania
Keywords
water quality; principal component analysis; fuzzy principal component analysis
DOI
10.1016/j.talanta.2004.08.047
Chinese Library Classification number
O65 [Analytical Chemistry]
Discipline classification codes
070302; 081704
Abstract
Principal component analysis (PCA) is a favorite tool in environmetrics for data compression and information extraction. PCA finds linear combinations of the original measurement variables that describe the significant variations in the data. However, it is well known that PCA, like any other multivariate statistical method, is sensitive to outliers, missing data, and poor linear correlation between variables due to poorly distributed variables. As a result, data transformations have a large impact on PCA. In this regard, one of the most powerful approaches to improving PCA appears to be the fuzzification of the matrix data, thus diminishing the influence of the outliers. In this paper we discuss and apply a robust fuzzy PCA algorithm (FPCA). The efficiency of the new algorithm is illustrated on a data set concerning the water quality of the Danube River for a period of 11 consecutive years. Considering, for example, a two-component model, FPCA accounts for 91.7% of the total variance, whereas PCA accounts for only 39.8%. Moreover, PCA showed only a partial separation of the variables and no separation of the scores (samples) on the plane described by the first two principal components, whereas a much sharper differentiation of the variables and scores is observed when FPCA is applied. (c) 2004 Elsevier B.V. All rights reserved.
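The abstract's comparison of variance explained by a two-component model can be illustrated with a minimal sketch. This is not the authors' FPCA algorithm: it is an ordinary sample-weighted PCA in which hand-picked weights stand in for fuzzy membership degrees. The function name `explained_variance_ratio`, the synthetic data, and every parameter value below are assumptions made purely for illustration.

```python
import numpy as np

def explained_variance_ratio(X, weights=None):
    """Fraction of total variance captured by each principal component.

    weights: optional per-sample weights (a hypothetical stand-in for fuzzy
    membership degrees). None gives classical, equally weighted PCA.
    """
    X = np.asarray(X, dtype=float)
    w = np.ones(X.shape[0]) if weights is None else np.asarray(weights, dtype=float)
    mean = (w[:, None] * X).sum(axis=0) / w.sum()      # weighted mean
    Xc = X - mean                                      # centered data
    cov = (w[:, None] * Xc).T @ Xc / w.sum()           # weighted covariance
    eigvals = np.linalg.eigvalsh(cov)[::-1]            # eigenvalues, descending
    return eigvals / eigvals.sum()

# Synthetic data (assumed): 100 samples driven by 2 latent factors plus noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(100, 2))
loadings = np.array([[1.0, 1.0, 1.0, 0.0, 0.0],
                     [0.0, 0.0, 1.0, 1.0, 1.0]])
clean = latent @ loadings + 0.1 * rng.normal(size=(100, 5))

# Five outliers, each extreme on a different variable.
outliers = 15.0 * np.eye(5)
X = np.vstack([clean, outliers])

# Hand-set weights: full weight for clean samples, near zero for outliers.
w = np.concatenate([np.ones(100), np.full(5, 0.01)])

classical = explained_variance_ratio(X)[:2].sum()
weighted = explained_variance_ratio(X, w)[:2].sum()
print(f"two-component variance, classical PCA: {classical:.3f}")
print(f"two-component variance, weighted PCA:  {weighted:.3f}")
```

In this toy setting the outliers spread variance across directions outside the two-factor plane, so the classical two-component model captures a smaller share of the total variance than the down-weighted one. A fuzzy PCA would estimate the membership degrees from the data rather than take them as given.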
Pages: 1215-1220
Page count: 6