A SELECTIVE OVERVIEW OF VARIABLE SELECTION IN HIGH DIMENSIONAL FEATURE SPACE

Cited by: 150
Authors
Fan, Jianqing [1]
Lv, Jinchi [2]
Affiliations
[1] Princeton Univ, Dept Operat Res & Financial Engn, Princeton, NJ 08544 USA
[2] Univ So Calif, Informat & Operat Management Dept, Marshall Sch Business, Los Angeles, CA 90089 USA
Funding
U.S. National Science Foundation;
Keywords
Dimensionality reduction; folded-concave penalty; high dimensionality; LASSO; model selection; oracle property; penalized least squares; penalized likelihood; SCAD; sure independence screening; sure screening; variable selection; NONCONCAVE PENALIZED LIKELIHOOD; CLIPPED ABSOLUTE DEVIATION; FALSE DISCOVERY RATE; MODEL SELECTION; DANTZIG SELECTOR; STATISTICAL ESTIMATION; UNCERTAINTY PRINCIPLES; SMALLEST EIGENVALUE; SHRUNKEN CENTROIDS; CLASS PREDICTION;
DOI
Not available
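Chinese Library Classification number
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics];
Discipline classification code
070103 [Probability Theory and Mathematical Statistics]; 140311 [Social Design and Social Innovation];
Abstract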
High dimensional statistical problems arise from diverse fields of scientific research and technological development. Variable selection plays a pivotal role in contemporary statistical learning and scientific discoveries. The traditional idea of best subset selection methods, which can be regarded as a specific form of penalized likelihood, is computationally too expensive for many modern statistical applications. Other forms of penalized likelihood methods have been successfully developed over the last decade to cope with high dimensionality. They have been widely applied for simultaneously selecting important variables and estimating their effects in high dimensional statistical inference. In this article, we present a brief account of the recent developments of theory, methods, and implementations for high dimensional variable selection. What limits of the dimensionality such methods can handle, what the role of penalty functions is, and what the statistical properties are rapidly drive the advances of the field. The properties of non-concave penalized likelihood and its roles in high dimensional statistical modeling are emphasized. We also review some recent advances in ultra-high dimensional variable selection, with emphasis on independence screening and two-scale methods.
Pages: 101-148
Page count: 48