SPARSE PRINCIPAL COMPONENT ANALYSIS AND ITERATIVE THRESHOLDING

Cited by: 190
Authors
Ma, Zongming [1]
Affiliation
[1] Univ Penn, Wharton Sch, Dept Stat, Philadelphia, PA 19104 USA
Funding
National Science Foundation (USA)
Keywords
Dimension reduction; high-dimensional statistics; principal component analysis; principal subspace; sparsity; spiked covariance model; thresholding; consistency; asymptotics
DOI
10.1214/13-AOS1097
Chinese Library Classification
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Subject classification codes
020208; 070103; 0714
Abstract
Principal component analysis (PCA) is a classical dimension reduction method which projects data onto the principal subspace spanned by the leading eigenvectors of the covariance matrix. However, it behaves poorly when the number of features p is comparable to, or even much larger than, the sample size n. In this paper, we propose a new iterative thresholding approach for estimating principal subspaces in the setting where the leading eigenvectors are sparse. Under a spiked covariance model, we find that the new approach recovers the principal subspace and leading eigenvectors consistently, and even optimally, in a range of high-dimensional sparse settings. Simulated examples also demonstrate its competitive performance.
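The abstract names the idea only at a high level, so the following is a minimal sketch of one way iterative thresholding can be combined with orthogonal (power) iteration to estimate a sparse principal subspace. It is not the paper's exact algorithm. The function names, the soft-thresholding rule, the eigenvector-based initialization, the fixed iteration count, and the tuning parameter `thresh` are all assumptions made for illustration.

```python
import numpy as np


def soft_threshold(x, t):
    """Shrink every entry of x toward zero by t; entries below t become 0."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)


def thresholded_orthogonal_iteration(S, k, thresh, n_iter=50):
    """Estimate a sparse k-dimensional principal subspace of S (illustrative).

    S      : (p, p) symmetric sample covariance matrix
    k      : target subspace dimension
    thresh : hypothetical threshold level (a tuning parameter here)
    Returns a (p, k) matrix with orthonormal columns.
    """
    # Assumed initialization: the k leading eigenvectors of S (ordinary PCA).
    _, eigvecs = np.linalg.eigh(S)      # eigenvalues come back in ascending order
    Q = eigvecs[:, -k:]
    for _ in range(n_iter):
        T = S @ Q                       # multiplication step of power iteration
        T = soft_threshold(T, thresh)   # thresholding step: kill small entries
        if not np.any(T):
            raise ValueError("thresh is too large: every entry was set to zero")
        Q, _ = np.linalg.qr(T)          # re-orthonormalize the columns
    return Q
```

Calling `thresholded_orthogonal_iteration(S, k=1, thresh=1.0)` on a sample covariance matrix `S` returns a unit vector. In this sketch, coordinates whose entries stay below the threshold end up exactly zero, which is what makes the estimate sparse. A sensible threshold depends on the noise level and sample size and would need to be chosen for the data at hand.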
Pages: 772-801
Number of pages: 30
Related papers
36 in total
[1]   HIGH-DIMENSIONAL ANALYSIS OF SEMIDEFINITE RELAXATIONS FOR SPARSE PRINCIPAL COMPONENTS [J].
Amini, Arash A. ;
Wainwright, Martin J. .
ANNALS OF STATISTICS, 2009, 37 (5B) :2877-2921
[2]   ASYMPTOTIC THEORY FOR PRINCIPAL COMPONENT ANALYSIS [J].
ANDERSON, TW .
ANNALS OF MATHEMATICAL STATISTICS, 1963, 34 (01) :122-&
[3]   [Anonymous], 2009, APPL SPECTROSC, DOI 10.1366/000370210791114185
[4]   [Anonymous], 1990, Matrix perturbation theory, Computer Science and Scientific Computing
[5]   [Anonymous], 2009, WAVELET TOUR SIGNAL
[6]   A direct formulation for sparse PCA using semidefinite programming [J].
d'Aspremont, Alexandre ;
El Ghaoui, Laurent ;
Jordan, Michael I. ;
Lanckriet, Gert R. G. .
SIAM REVIEW, 2007, 49 (03) :434-448
[7]   ROTATION OF EIGENVECTORS BY A PERTURBATION .3. [J].
DAVIS, C ;
KAHAN, WM .
SIAM JOURNAL ON NUMERICAL ANALYSIS, 1970, 7 (01) :1-&
[8]   Donoho D. L., 1993, Applied and Computational Harmonic Analysis, V1, P100, DOI 10.1006/acha.1993.1008
[9]   Variable selection via nonconcave penalized likelihood and its oracle properties [J].
Fan, JQ ;
Li, RZ .
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2001, 96 (456) :1348-1360
[10]   Golub G. H., 1996, MATRIX COMPUTATIONS