Automatic dimensionality selection from the scree plot via the use of profile likelihood

Cited by: 247
Authors
Zhu, Mu [1]
Ghodsi, Ali [1]
Affiliations
[1] Univ Waterloo, Dept Stat & Actuarial Sci, Waterloo, ON N2L 3G1, Canada
Keywords
data compression; denoising; isomap; latent semantic indexing; manifold learning; principal component analysis (PCA); resampling methods; singular value decomposition (SVD)
DOI
10.1016/j.csda.2005.09.010
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
Most dimension reduction techniques produce ordered coordinates so that only the first few coordinates need be considered in subsequent analyses. The choice of how many coordinates to use is often made with a visual heuristic, i.e., by making a scree plot and looking for a "big gap" or an "elbow." In this article, we present a simple and automatic procedure to accomplish this goal by maximizing a simple profile likelihood function. We give a wide variety of both simulated and real examples. (c) 2005 Elsevier B.V. All rights reserved.
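
The abstract does not spell out the likelihood model, so the following is only a minimal sketch of one way such a procedure could look. It assumes that, for each candidate dimension q, the q largest scree values and the remaining values are modeled as two Gaussian groups with separate means and a shared variance, and that the chosen q maximizes the resulting log-likelihood. The function name `select_dimension`, the maximum-likelihood variance estimate, and the restriction of q to 1..p-1 are illustrative assumptions, not details taken from the article.

```python
import numpy as np


def select_dimension(values):
    """Pick a cut point q in a scree sequence by maximizing a profile log-likelihood.

    Assumed model (hypothetical, for illustration): the q largest values and the
    remaining p - q values are two Gaussian samples with different means and a
    common variance. Means and variance are replaced by their maximum-likelihood
    estimates, leaving a function of q alone.
    """
    d = np.sort(np.asarray(values, dtype=float))[::-1]  # descending order
    p = d.size
    if p < 3:
        raise ValueError("need at least three values to compare splits")

    log_lik = np.empty(p - 1)
    for q in range(1, p):  # q = 1, ..., p - 1 so both groups are non-empty
        left, right = d[:q], d[q:]
        # Residual sum of squares around each group's own mean
        ss = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
        sigma2 = max(ss / p, 1e-12)  # MLE of the shared variance, floored for stability
        # Gaussian log-likelihood evaluated at the estimated means and variance
        log_lik[q - 1] = -0.5 * p * (np.log(2.0 * np.pi * sigma2) + 1.0)

    q_hat = int(np.argmax(log_lik)) + 1
    return q_hat, log_lik


# Example: three large values followed by a flat tail
eigenvalues = [10.0, 9.0, 8.5, 1.0, 0.8, 0.6, 0.5, 0.4]
q_hat, profile = select_dimension(eigenvalues)
print(q_hat)
```

With a shared maximum-likelihood variance, maximizing this log-likelihood amounts to picking the split with the smallest pooled within-group sum of squares, which is one way to formalize looking for the "big gap" in a scree plot.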
Pages: 918-930
Page count: 13
Related papers
14 in total
[1]  
[Anonymous], 1979, Multivariate analysis
[2]  
Cox T., 2001, MULTIDIMENSIONAL SCA
[3]  
DEERWESTER S, 1990, J AM SOC INFORM SCI, V41, P391, DOI 10.1002/(SICI)1097-4571(199009)41:6<391::AID-ASI1>3.0.CO;2-9
[5]   IMPROVING THE RETRIEVAL OF INFORMATION FROM EXTERNAL SOURCES [J].
DUMAIS, ST .
BEHAVIOR RESEARCH METHODS INSTRUMENTS & COMPUTERS, 1991, 23 (02) :229-236
[6]  
Friedman J., 2001, The elements of statistical learning, V1, DOI 10.1007/978-0-387-21606-5
[7]  
Jolliffe I.T., 2002, PRINCIPAL COMPONENTS
[8]  
McCullagh P., 2018, Generalized Linear Models
[9]   Augmenting naive Bayes classifiers with statistical language models [J].
Peng, FC ;
Schuurmans, D ;
Wang, SJ .
INFORMATION RETRIEVAL, 2004, 7 (3-4) :317-345
[10]   Nonlinear dimensionality reduction by locally linear embedding [J].
Roweis, ST ;
Saul, LK .
SCIENCE, 2000, 290 (5500) :2323-+