VARIABLE SELECTION IN NONPARAMETRIC ADDITIVE MODELS

Cited by: 391
Authors
Huang, Jian [1]
Horowitz, Joel L. [2]
Wei, Fengrong [3]
Affiliations
[1] Univ Iowa, Dept Stat & Actuarial Sci, Iowa City, IA 52242 USA
[2] Northwestern Univ, Dept Econ, Evanston, IL 60208 USA
[3] Univ W Georgia, Dept Math, Carrollton, GA 30118 USA
Funding
U.S. National Science Foundation;
Keywords
Adaptive group Lasso; component selection; high-dimensional data; nonparametric regression; selection consistency; NONCONCAVE PENALIZED LIKELIHOOD; COMPONENT SELECTION; GENE-EXPRESSION; ADAPTIVE LASSO; REGRESSION;
DOI
10.1214/09-AOS781
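Chinese Library Classification (CLC)
O21 [Probability theory and mathematical statistics]; C8 [Statistics];
Discipline classification codes
020208 ; 070103 ; 0714 ;
Abstract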
We consider a nonparametric additive model of a conditional mean function in which the number of variables and additive components may be larger than the sample size but the number of nonzero additive components is "small" relative to the sample size. The statistical problem is to determine which additive components are nonzero. The additive components are approximated by truncated series expansions with B-spline bases. With this approximation, the problem of component selection becomes that of selecting the groups of coefficients in the expansion. We apply the adaptive group Lasso to select nonzero components, using the group Lasso to obtain an initial estimator and reduce the dimension of the problem. We give conditions under which the group Lasso selects a model whose number of components is comparable with the underlying model, and the adaptive group Lasso selects the nonzero components correctly with probability approaching one as the sample size increases and achieves the optimal rate of convergence. The results of Monte Carlo experiments show that the adaptive group Lasso procedure works well with samples of moderate size. A data example is used to illustrate the application of the proposed method.
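The two-stage procedure described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: a centered polynomial basis stands in for the B-spline basis, the objective is written as 0.5*||y - Z b||^2 + lam * sum_j w_j ||b_j||_2 (the factor 0.5 only rescales the penalty level), the solver is a plain proximal-gradient loop, and all function names (`basis_expand`, `group_lasso`, `adaptive_weights`) are hypothetical.

```python
import numpy as np

def basis_expand(X, degree=3):
    """Expand each covariate into a centered polynomial basis.
    Assumption: this replaces the B-spline basis used in the paper."""
    blocks, groups, start = [], [], 0
    for j in range(X.shape[1]):
        B = np.column_stack([X[:, j] ** k for k in range(1, degree + 1)])
        blocks.append(B - B.mean(axis=0))          # center each basis column
        groups.append(np.arange(start, start + degree))
        start += degree
    return np.hstack(blocks), groups

def group_soft_threshold(v, t):
    """Shrink v toward zero by t in Euclidean norm; zero if ||v|| <= t."""
    norm = np.linalg.norm(v)
    if norm <= t:
        return np.zeros_like(v)
    return (1.0 - t / norm) * v

def group_lasso(Z, y, groups, lam, weights=None, n_iter=500):
    """Proximal gradient for 0.5*||y - Z b||^2 + lam * sum_j w_j ||b_j||_2."""
    if weights is None:
        weights = np.ones(len(groups))
    step = 1.0 / np.linalg.norm(Z, 2) ** 2         # 1 / Lipschitz constant
    b = np.zeros(Z.shape[1])
    for _ in range(n_iter):
        u = b - step * (Z.T @ (Z @ b - y))         # gradient step
        for j, idx in enumerate(groups):
            if np.isinf(weights[j]):               # infinite weight: group excluded
                b[idx] = 0.0
            else:
                b[idx] = group_soft_threshold(u[idx], step * lam * weights[j])
    return b

def adaptive_weights(b_init, groups):
    """Weight 1/||b_j|| for groups kept by the initial fit, infinity otherwise."""
    norms = [np.linalg.norm(b_init[idx]) for idx in groups]
    return np.array([1.0 / n if n > 0 else np.inf for n in norms])
```

A typical use would be to call `basis_expand` on the covariate matrix, center the response, fit `group_lasso` once to obtain an initial estimate, and fit it again with `weights=adaptive_weights(b_init, groups)`; components whose coefficient group is exactly zero after the second fit are treated as not selected. The penalty levels would have to be tuned, for example by an information criterion.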
Pages: 2282-2313
Number of pages: 32