NEARLY UNBIASED VARIABLE SELECTION UNDER MINIMAX CONCAVE PENALTY

Cited by: 2733
Authors
Zhang, Cun-Hui [1]
Affiliations
[1] Rutgers State Univ, Dept Stat & Biostat, Piscataway, NJ 08854 USA
Funding
National Science Foundation (USA)
Keywords
Variable selection; model selection; penalized estimation; least squares; correct selection; minimax; unbiasedness; mean squared error; nonconvex minimization; risk estimation; degrees of freedom; selection consistency; sign consistency; NONCONCAVE PENALIZED LIKELIHOOD; STATISTICAL ESTIMATION; DANTZIG SELECTOR; ADAPTIVE LASSO; REGRESSION; SPARSITY; LARGER; RISK;
DOI
10.1214/09-AOS729
Chinese Library Classification
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics];
Discipline classification codes
020208 ; 070103 ; 0714 ;
Abstract
We propose MC+, a fast, continuous, nearly unbiased and accurate method of penalized variable selection in high-dimensional linear regression. The LASSO is fast and continuous, but biased. The bias of the LASSO may prevent consistent variable selection. Subset selection is unbiased but computationally costly. The MC+ has two elements: a minimax concave penalty (MCP) and a penalized linear unbiased selection (PLUS) algorithm. The MCP provides the convexity of the penalized loss in sparse regions to the greatest extent given certain thresholds for variable selection and unbiasedness. The PLUS computes multiple exact local minimizers of a possibly nonconvex penalized loss function in a certain main branch of the graph of critical points of the penalized loss. Its output is a continuous piecewise linear path encompassing from the origin for infinite penalty to a least squares solution for zero penalty. We prove that at a universal penalty level, the MC+ has high probability of matching the signs of the unknowns, and thus correct selection, without assuming the strong irrepresentable condition required by the LASSO. This selection consistency applies to the case of p ≫ n, and is proved to hold for exactly the MC+ solution among possibly many local minimizers. We prove that the MC+ attains certain minimax convergence rates in probability for the estimation of regression coefficients in ℓ_r balls. We use the SURE method to derive degrees of freedom and C_p-type risk estimates for general penalized LSE, including the LASSO and MC+ estimators, and prove their unbiasedness. Based on the estimated degrees of freedom, we propose an estimator of the noise level for proper choice of the penalty level. For full rank designs and general sub-quadratic penalties, we provide necessary and sufficient conditions for the continuity of the penalized LSE. Simulation results overwhelmingly support our claim of superior variable selection properties and demonstrate the computational efficiency of the proposed method.
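The abstract contrasts the bias of the LASSO with the near-unbiasedness of the MCP but does not state the penalty itself. The sketch below is illustrative only and assumes the commonly stated form of the MCP, ρ(t; λ) = λ ∫₀^|t| (1 − x/(γλ))₊ dx, together with its closed-form univariate solution for a standardized (orthonormal) design with γ > 1. The function names and the parameter names `lam` and `gamma` are hypothetical, and this is not the PLUS path algorithm.

```python
import math


def mcp_penalty(t: float, lam: float, gamma: float) -> float:
    """MCP value: lam*|t| - t^2/(2*gamma) up to |t| = gamma*lam, constant beyond."""
    a = abs(t)
    if a <= gamma * lam:
        return lam * a - a * a / (2.0 * gamma)
    return 0.5 * gamma * lam * lam


def mcp_threshold(z: float, lam: float, gamma: float) -> float:
    """Univariate MCP minimizer of 0.5*(z - b)^2 + mcp_penalty(b) (assumes gamma > 1)."""
    if gamma <= 1.0:
        raise ValueError("this closed form assumes gamma > 1")
    a = abs(z)
    if a <= lam:
        return 0.0  # small inputs are set exactly to zero (selection)
    if a <= gamma * lam:
        # intermediate inputs are shrunk, with shrinkage fading as |z| grows
        return math.copysign((a - lam) / (1.0 - 1.0 / gamma), z)
    return z  # large inputs are left untouched (no bias)


def soft_threshold(z: float, lam: float) -> float:
    """LASSO univariate solution: every nonzero estimate is shifted by lam."""
    return math.copysign(max(abs(z) - lam, 0.0), z)


if __name__ == "__main__":
    for z in (0.5, 1.5, 5.0):
        print(z, soft_threshold(z, 1.0), mcp_threshold(z, 1.0, 3.0))
```

Under these assumptions, both rules zero out inputs below the threshold λ, but the soft threshold keeps subtracting λ from large inputs while the MCP rule returns them unchanged once |z| exceeds γλ, which is the sense in which the abstract calls the method nearly unbiased.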
Pages: 894-942
Page count: 49