Nonparametric Independence Screening in Sparse Ultra-High-Dimensional Additive Models

Citations: 438
Authors
Fan, Jianqing [1]
Feng, Yang [2]
Song, Rui [3]
Affiliations
[1] Princeton Univ, Dept Operat Res & Financial Engn, Princeton, NJ 08544 USA
[2] Columbia Univ, Dept Stat, New York, NY 10027 USA
[3] Colorado State Univ, Dept Stat, Ft Collins, CO 80523 USA
Funding
National Science Foundation (USA); National Institutes of Health (USA)
Keywords
Additive model; Independent learning; Nonparametric independence screening; Nonparametric regression; Sparsity; Sure independence screening; Variable selection; NONCONCAVE PENALIZED LIKELIHOOD; VARIABLE SELECTION; REGRESSION;
DOI
10.1198/jasa.2011.tm09779
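Chinese Library Classification number
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics]
Discipline classification code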
020208 ; 070103 ; 0714 ;
Abstract
A variable screening procedure via correlation learning was proposed by Fan and Lv (2008) to reduce dimensionality in sparse ultra-high-dimensional models. Even when the true model is linear, the marginal regression can be highly nonlinear. To address this issue, we further extend the correlation learning to marginal nonparametric learning. Our nonparametric independence screening (NIS) is a specific type of sure independence screening. We propose several closely related variable screening procedures. We show that with general nonparametric models, under some mild technical conditions, the proposed independence screening methods have a sure screening property. The extent to which the dimensionality can be reduced by independence screening is also explicitly quantified. As a methodological extension, we also propose a data-driven thresholding and an iterative nonparametric independence screening (INIS) method to enhance the finite-sample performance for fitting sparse additive models. The simulation results and a real data analysis demonstrate that the proposed procedure works well with moderate sample size and large dimension and performs better than competing methods.
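The idea of marginal nonparametric learning described in the abstract can be illustrated with a minimal sketch. The function name `nis_rank`, the cubic polynomial basis (a simple stand-in for a spline basis), the simulated data, and the "keep the top 20" rule are all assumptions made for illustration; this is not the authors' implementation, and it covers neither the data-driven threshold nor the iterative INIS variant.

```python
import numpy as np

def nis_rank(X, y, degree=3):
    """Rank covariates by the strength of their marginal nonparametric fit.

    Hypothetical helper: each covariate is regressed on separately using a
    small polynomial basis, and the mean squared fitted value is its utility.
    """
    n, p = X.shape
    y_centered = y - y.mean()
    utilities = np.empty(p)
    for j in range(p):
        xj = X[:, j]
        # Standardize so the polynomial basis is reasonably conditioned.
        z = (xj - xj.mean()) / xj.std()
        # Basis z, z^2, ..., z^degree with each column centered (no intercept).
        B = np.column_stack([z ** k for k in range(1, degree + 1)])
        B -= B.mean(axis=0)
        # Marginal least-squares fit of the centered response on the basis.
        coef, *_ = np.linalg.lstsq(B, y_centered, rcond=None)
        fitted = B @ coef
        # Marginal utility: empirical squared norm of the fitted component.
        utilities[j] = np.mean(fitted ** 2)
    order = np.argsort(utilities)[::-1]  # strongest marginal signal first
    return order, utilities

# Simulated sparse additive model: only covariates 0 and 1 are active.
rng = np.random.default_rng(0)
n, p = 200, 1000
X = rng.uniform(-1.0, 1.0, size=(n, p))
y = 3.0 * np.sin(np.pi * X[:, 0]) + 4.0 * X[:, 1] ** 2 + 0.5 * rng.normal(size=n)

order, utilities = nis_rank(X, y)
selected = order[:20]  # illustrative cutoff, not the paper's threshold
```

Note that covariate 1 enters only through a quadratic term, so its linear correlation with the response is close to zero in this design; a purely correlation-based ranking could miss it, whereas the marginal nonparametric fit can pick it up.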
Pages: 544-557
Number of pages: 14
Related papers
39 in total
[1] [Anonymous], 1966, APPL REGRESSION ANAL
[2] [Anonymous], 2006, Journal of the Royal Statistical Society, Series B
[3] Regularization of wavelet approximations - Rejoinder [J].
Antoniadis, A; Fan, J.
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2001, 96 (455): 964-967
[4] Candes E, 2007, ANN STAT, V35, P2313, DOI 10.1214/009053606000001523
[5] Homozygosity mapping with SNP arrays identifies TRIM32, an E3 ubiquitin ligase, as a Bardet-Biedl syndrome gene (BBS11) [J].
Chiang, AP; Beck, JS; Yen, HJ; Tayeh, MK; Scheetz, TE; Swiderski, RE; Nishimura, DY; Braun, TA; Kim, KYA; Huang, J; Elbedour, K; Carmi, R; Slusarski, DC; Casavant, TL; Stone, EM; Sheffield, VC.
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2006, 103 (16): 6287-6292
[6] Efroymson M.A., 1960, MATH METHODS DIGITAL, P191
[7] Fan J., 1997, J. Italian Stat. Soc, V6, P131, DOI 10.1007/BF03178906
[8] Fan J., 2009, NONCONCAVE PEN UNPUB
[9] Sure independence screening for ultrahigh dimensional feature space [J].
Fan, Jianqing; Lv, Jinchi.
JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B-STATISTICAL METHODOLOGY, 2008, 70: 849-883
[10] SURE INDEPENDENCE SCREENING IN GENERALIZED LINEAR MODELS WITH NP-DIMENSIONALITY [J].
Fan, Jianqing; Song, Rui.
ANNALS OF STATISTICS, 2010, 38 (6): 3567-3604