A direct formulation for sparse PCA using semidefinite programming

被引:449
作者
d'Aspremont, Alexandre [1 ]
El Ghaoui, Laurent
Jordan, Michael I.
Lanckriet, Gert R. G.
机构
[1] Princeton Univ, Dept Operat Res & Financial Engn, Princeton, NJ 08544 USA
[2] Univ Calif Berkeley, Dept Elect Engn & Comp Sci, Berkeley, CA 94720 USA
[3] Univ Calif Berkeley, Dept EECS & Stat, Berkeley, CA 94720 USA
[4] Univ Calif San Diego, Dept Elect & Comp Engn, La Jolla, CA 92093 USA
关键词
principal component analysis; Karhunen-Loeve transform; factor analysis; semidefinite relaxation; Moreau-Yosida regularization; semidefinite programming;
D O I
10.1137/050645506
中图分类号
O29 [应用数学];
学科分类号
070104 ;
摘要
Given a covariance matrix, we consider the problem of maximizing the variance explained by a particular linear combination of the input variables while constraining the number of nonzero coefficients in this combination. This problem arises in the decomposition of a covariance matrix into sparse factors or sparse principal component analysis (PCA), and has wide applications ranging from biology to finance. We use a modification of the classical variational representation of the largest eigenvalue of a symmetric matrix, where cardinality is constrained, and derive a semidefinite programming-based relaxation for our problem. We also discuss Nesterov's smooth minimization technique applied to the semidefinite program arising in the semidefinite relaxation of the sparse PCA problem. The method has complexity O(n(4) root log(n)/epsilon), where n is the size of the underlying covariance matrix and e is the desired absolute accuracy on the optimal value of the problem.
引用
收藏
页码:434 / 448
页数:15
相关论文
共 30 条