Simplifying a prognostic model: a simulation study based on clinical data

Cited by: 103
Authors
Ambler, G
Brady, AR
Royston, P
Affiliations
[1] UCL, Dept Stat Sci, London WC1E 7HB, England
[2] Intens Care Natl Audit & Res Ctr, London WC1H 9HR, England
[3] MRC, Clin Trials Unit, London NW1 2DA, England
Keywords
prognostic models; variable selection; penalisation; lassos; ROC; AIC
DOI
10.1002/sim.1422
Chinese Library Classification number
Q [Biological Sciences]
Discipline classification code
07; 0710; 09
Abstract
Prognostic models are designed to predict a clinical outcome in individuals or groups of individuals with a particular disease or condition. To avoid bias, many researchers advocate the use of full models developed by prespecifying predictors. Variable selection is not employed and the resulting models may be large and complicated. In practice, more parsimonious models that retain most of the prognostic information may be preferred. We investigate the effect on various performance measures, including mean square error and prognostic classification, of three methods for estimating full models (including penalized estimation and Tibshirani's lasso) and consider two methods (backwards elimination and a new proposal called stepdown) for simplifying full models. Simulation studies based on two medical data sets suggest that simplified models can be found that perform nearly as well as, or sometimes even better than, full models. Optimizing the Akaike information criterion appears to be appropriate for choosing the degree of simplification. Copyright © 2002 John Wiley & Sons, Ltd.
Pages: 3803-3822
Page count: 20