Principled sure independence screening for Cox models with ultra-high-dimensional covariates

Cited by: 171
Authors
Zhao, Sihai Dave [1]
Li, Yi [1]
Affiliations
[1] Harvard Univ, Sch Publ Hlth, Dept Biostat, Boston, MA 02115 USA
Keywords
Cox model; Multiple myeloma; Sure independence screening; Ultra-high-dimensional covariates; Variable selection; NONCONCAVE PENALIZED LIKELIHOOD; FALSE DISCOVERY RATE; VARIABLE SELECTION; GENE-EXPRESSION; MULTIPLE-MYELOMA; ADAPTIVE LASSO; REGRESSION; SHRINKAGE;
DOI
10.1016/j.jmva.2011.08.002
Chinese Library Classification
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Subject classification codes
020208; 070103; 0714
Abstract
It is rather challenging for current variable selectors to handle situations where the number of covariates under consideration is ultra-high. Consider a motivating clinical trial of the drug bortezomib for the treatment of multiple myeloma, where overall survival and expression levels of 44760 probesets were measured for each of 80 patients with the goal of identifying genes that predict survival after treatment. This dataset defies analysis even with regularized regression. Some remedies have been proposed for the linear model and for generalized linear models, but there are few solutions in the survival setting and, to our knowledge, no theoretical support. Furthermore, existing strategies often involve tuning parameters that are difficult to interpret. In this paper, we propose and theoretically justify a principled method for reducing dimensionality in the analysis of censored data by selecting only the important covariates. Our procedure involves a tuning parameter that has a simple interpretation as the desired false positive rate of this selection. We present simulation results and apply the proposed procedure to analyze the aforementioned myeloma study. (C) 2011 Elsevier Inc. All rights reserved.
Pages: 397-411
Number of pages: 15