Asymptotic properties of bridge estimators in sparse high-dimensional regression models

Cited by: 337
Authors
Huang, Jian [1]
Horowitz, Joel L. [2]
Ma, Shuangge [3]
Affiliations
[1] Univ Iowa, Dept Stat & Actuarial Sci, Iowa City, IA 52242 USA
[2] Northwestern Univ, Dept Econ, Evanston, IL 60208 USA
[3] Yale Univ, Dept Epidemiol & Publ Hlth, Div Biostat, New Haven, CT 06520 USA
Funding
National Science Foundation (United States)
Keywords
penalized regression; high-dimensional data; variable selection; asymptotic normality; oracle property;
DOI
10.1214/009053607000000875
Chinese Library Classification
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Subject classification codes
020208; 070103; 0714
Abstract
We study the asymptotic properties of bridge estimators in sparse, high-dimensional, linear regression models when the number of covariates may increase to infinity with the sample size. We are particularly interested in the use of bridge estimators to distinguish between covariates whose coefficients are zero and covariates whose coefficients are nonzero. We show that under appropriate conditions, bridge estimators correctly select covariates with nonzero coefficients with probability converging to one and that the estimators of nonzero coefficients have the same asymptotic distribution that they would have if the zero coefficients were known in advance. Thus, bridge estimators have an oracle property in the sense of Fan and Li [J. Amer. Statist. Assoc. 96 (2001) 1348-1360] and Fan and Peng [Ann. Statist. 32 (2004) 928-961]. In general, the oracle property holds only if the number of covariates is smaller than the sample size. However, under a partial orthogonality condition in which the covariates of the zero coefficients are uncorrelated or weakly correlated with the covariates of nonzero coefficients, we show that marginal bridge estimators can correctly distinguish between covariates with nonzero and zero coefficients with probability converging to one even when the number of covariates is greater than the sample size.
Pages: 587-613
Page count: 27
Related papers
24 in total
[1]   Prediction by supervised principal components [J].
Bair, E ;
Hastie, T ;
Paul, D ;
Tibshirani, R .
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2006, 101 (473) :119-137
[2]   Boosting for high-dimensional linear models [J].
Buhlmann, Peter .
ANNALS OF STATISTICS, 2006, 34 (02) :559-583
[3]   Fan J., 2006, SURE INDEPENDENCE SC
[4]   Fan J., 2006, INT C MATHEMATICIANS, VIII, P595, DOI 10.4171/022-3/31
[5]   Fan J., 1997, J Ital Stat Assoc, V6, P131, DOI 10.1007/BF03178906
[6]   Semilinear high-dimensional model for normalization of microarray data: A theoretical analysis and partial consistency [J].
Fan, JQ ;
Peng, H ;
Huang, T .
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2005, 100 (471) :781-796
[7]   Nonconcave penalized likelihood with a diverging number of parameters [J].
Fan, JQ ;
Peng, H .
ANNALS OF STATISTICS, 2004, 32 (03) :928-961
[8]   Variable selection via nonconcave penalized likelihood and its oracle properties [J].
Fan, JQ ;
Li, RZ .
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2001, 96 (456) :1348-1360
[9]   A STATISTICAL VIEW OF SOME CHEMOMETRICS REGRESSION TOOLS [J].
FRANK, IE ;
FRIEDMAN, JH .
TECHNOMETRICS, 1993, 35 (02) :109-135
[10]   Penalized regressions: The bridge versus the lasso [J].
Fu, WJJ .
JOURNAL OF COMPUTATIONAL AND GRAPHICAL STATISTICS, 1998, 7 (03) :397-416