PROJECTION PURSUIT EXPLORATORY DATA-ANALYSIS

Cited by: 39
Authors
POSSE, C [1]
Affiliations
[1] STANFORD UNIV,DEPT STAT,STANFORD,CA 94305
Keywords
CLUSTER ANALYSIS; EXPLORATORY DATA ANALYSIS; GLOBAL OPTIMIZATION; INVARIANT CHI-SQUARED TEST STATISTICS; P-VALUES FOR PROJECTIONS; PROJECTION PURSUIT; 2-DIMENSIONAL PROJECTIONS;
DOI
10.1016/0167-9473(95)00002-8
Chinese Library Classification number
TP39 [Computer applications];
Discipline classification code
081203; 0835;
Abstract
Posse (1990) presented a projection pursuit technique, based on a global optimization algorithm and on a chi-squared projection index, for finding the plane in which the data are the most interesting. This paper extends and improves this algorithm, providing an exploratory data analysis by projection pursuit that has important advantages over its competitors. The global optimization algorithm, when combined with a structure removal procedure due to Friedman (1987), allows a sequential identification of interesting bidimensional views of decreasing importance. The modified chi-squared index satisfies the five basic demands for a projection index. It is (1) uniquely minimized at the bivariate normal distribution, (2) approximately affine invariant, (3) consistent, (4) resistant to features in the tail of the distribution, and (5) simple enough to permit quick computation even for large data sets. The paper gives simple rules for judging the significance of a structure found by this algorithm. These rules define a stopping criterion for the search process. They are based on theoretical (asymptotic) arguments and are well supported by simulations. The efficacy of the new algorithm is illustrated through several studies of real and simulated data.
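
The general idea described in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: the cell layout (equal-probability rings and sectors), the function names sphere, chi2_index and random_search, and the plain random search used in place of the paper's global optimizer are all assumptions made for illustration, and Friedman's structure removal step is omitted. The sketch spheres the data, bins a two-dimensional projection into cells that are equally likely under the standard bivariate normal, and scores the plane by a chi-squared distance that is zero when the cell frequencies match the normal exactly.

```python
import numpy as np

def sphere(X):
    """Center and whiten X so that its sample covariance is the identity."""
    Xc = X - X.mean(axis=0)
    vals, vecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
    # Symmetric inverse square root of the covariance matrix.
    return Xc @ vecs @ np.diag(vals ** -0.5) @ vecs.T

def chi2_index(Z, a, b, n_rings=5, n_sectors=8):
    """Illustrative chi-squared index for the plane spanned by a and b.

    Assumed cell layout (not taken from the paper): rings and sectors
    that each have equal probability under the standard bivariate normal.
    """
    u, v = Z @ a, Z @ b
    r2 = u ** 2 + v ** 2
    theta = np.arctan2(v, u)  # angle in [-pi, pi]
    # Under N(0, I), 1 - exp(-r2 / 2) is uniform on [0, 1).
    ring = np.minimum(((1.0 - np.exp(-r2 / 2.0)) * n_rings).astype(int),
                      n_rings - 1)
    sector = np.minimum(((theta + np.pi) / (2.0 * np.pi) * n_sectors).astype(int),
                        n_sectors - 1)
    n_cells = n_rings * n_sectors
    counts = np.bincount(ring * n_sectors + sector, minlength=n_cells)
    p_hat = counts / len(Z)      # observed cell frequencies
    p_norm = 1.0 / n_cells       # cell probability under the normal
    return float(np.sum((p_hat - p_norm) ** 2 / p_norm))

def random_plane(p, rng):
    """Draw a random orthonormal pair of directions in R^p."""
    Q, _ = np.linalg.qr(rng.standard_normal((p, 2)))
    return Q[:, 0], Q[:, 1]

def random_search(Z, n_iter=500, seed=0):
    """Plain random search over planes (a stand-in for a global optimizer)."""
    rng = np.random.default_rng(seed)
    best_val, best_a, best_b = -np.inf, None, None
    for _ in range(n_iter):
        a, b = random_plane(Z.shape[1], rng)
        val = chi2_index(Z, a, b)
        if val > best_val:
            best_val, best_a, best_b = val, a, b
    return best_val, best_a, best_b
```

A typical call would be random_search(sphere(X)) for an n-by-p data matrix X; the returned pair of directions spans the highest-scoring plane among those sampled. Larger index values indicate a stronger departure from normality, so the search maximizes the index.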
Pages: 669-687
Page count: 19