Three-way (PARAFAC) factor analysis: examination and comparison of alternative computational methods as applied to ill-conditioned data

Cited by: 56
Authors
Hopke, PK [1]
Paatero, P
Jia, H
Ross, RT
Harshman, RA
Affiliations
[1] Clarkson Univ, Dept Chem, Potsdam, NY 13699 USA
[2] Univ Helsinki, Dept Phys, FIN-00014 Helsinki, Finland
[3] Clarkson Univ, Dept Civil & Environm Engn, Potsdam, NY 13699 USA
[4] Ohio State Univ, Dept Biochem, Columbus, OH 43210 USA
[5] Univ Western Ontario, Dept Psychol, London, ON N6A 5C2, Canada
Keywords
factor analysis; trilinear; PARAFAC; fluorescence spectroscopy; PMF3; TPALS; DTDMR
DOI
10.1016/S0169-7439(98)00077-X
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Four different approaches to solving the trilinear three-way factor analysis problem are compared, and their performance with 'difficult' (i.e., ill-conditioned) data is tested. These approaches are represented by four different computer programs: one using a simple alternating least squares (ALS) algorithm with only minimal extrapolation (HL-PARAFAC), one in which the ALS is supplemented by a sophisticated extrapolation to speed convergence (TPALS), one using a non-linear curve fitting method (PMF3), and one using a non-iterative closed-form approximation (DTDMR). The options provided by these programs (e.g., with regard to missing values, weighted least squares, non-negativity and other constraints) are compared. Criteria for choosing synthesized test data and a method for synthesizing exponential test data are described. A numerical index is introduced to characterize the ill-conditioning of n-way arrays (n > 2). Two well-characterized synthetic data sets serve as 'difficult' (ill-conditioned) test data. Intercomparisons among HL-PARAFAC, TPALS, DTDMR and PMF3 were implemented with these test data. Consequently, their limitations and strengths are determined. In addition, these trilinear analysis approaches are applied to a difficult set of ill-conditioned real data: a set of fluorescence spectroscopy measurements that characterize the steady-state fluorescence of an amino acid in aqueous solution. When converged, the results produced by the three least-squares techniques (but not DTDMR) agree. However, there are large differences in convergence speed when these difficult problems are solved: TPALS is faster than PARAFAC by a factor of ten, and PMF3 is faster than TPALS, again by a factor of ten. The program DTDMR is the fastest, but it only solves half of the problems. © 1998 Elsevier Science B.V. All rights reserved.
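As a rough illustration of the plain ALS idea mentioned in the abstract, the trilinear model x_ijk = sum_r a_ir b_jr c_kr can be fitted by solving for one factor matrix at a time while the other two are held fixed. The NumPy sketch below is not the HL-PARAFAC, TPALS, PMF3, or DTDMR code: it has no extrapolation, weighting, constraints, or missing-value handling, and the names `khatri_rao` and `parafac_als`, the stopping rule, and the small well-conditioned test array are all assumptions made for the example.

```python
import numpy as np

def khatri_rao(U, V):
    """Column-wise Kronecker product: (I x R), (J x R) -> (I*J x R)."""
    R = U.shape[1]
    return np.einsum("ir,jr->ijr", U, V).reshape(-1, R)

def parafac_als(X, rank, n_iter=2000, tol=1e-14, seed=0):
    """Unconstrained ALS fit of X[i,j,k] ~ sum_r A[i,r] B[j,r] C[k,r]."""
    rng = np.random.default_rng(seed)
    I, J, K = X.shape
    A = rng.standard_normal((I, rank))
    B = rng.standard_normal((J, rank))
    C = rng.standard_normal((K, rank))
    # Unfold the array along each mode (C-order, last index varies fastest).
    X1 = X.reshape(I, J * K)
    X2 = np.moveaxis(X, 1, 0).reshape(J, I * K)
    X3 = np.moveaxis(X, 2, 0).reshape(K, I * J)
    norm2 = float(np.sum(X ** 2))
    history = []
    prev = np.inf
    for _ in range(n_iter):
        # Each update is an ordinary linear least-squares problem.
        A = np.linalg.lstsq(khatri_rao(B, C), X1.T, rcond=None)[0].T
        B = np.linalg.lstsq(khatri_rao(A, C), X2.T, rcond=None)[0].T
        C = np.linalg.lstsq(khatri_rao(A, B), X3.T, rcond=None)[0].T
        sse = float(np.sum((X1 - A @ khatri_rao(B, C).T) ** 2))
        history.append(sse)
        # Stop when the improvement is negligible relative to the data norm.
        if prev - sse <= tol * norm2:
            break
        prev = sse
    return A, B, C, history

# Small, well-conditioned synthetic example (assumed sizes, noise-free).
rng = np.random.default_rng(1)
A0 = rng.standard_normal((8, 2))
B0 = rng.standard_normal((7, 2))
C0 = rng.standard_normal((6, 2))
X = np.einsum("ir,jr,kr->ijk", A0, B0, C0)

A, B, C, history = parafac_als(X, rank=2)
X_hat = np.einsum("ir,jr,kr->ijk", A, B, C)
rel_sse = float(np.sum((X - X_hat) ** 2) / np.sum(X ** 2))
print(f"iterations: {len(history)}, relative SSE: {rel_sse:.2e}")
```

On ill-conditioned data of the kind the paper studies, a loop like this can need a very large number of iterations, which is the motivation the abstract gives for extrapolation (TPALS) and non-linear fitting (PMF3).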
Pages: 25-42
Number of pages: 18