Substantial effective sample sizes were required for external validation studies of predictive logistic regression models

Cited by: 492
Authors
Vergouwe, Y [1]
Steyerberg, EW [1]
Eijkemans, MJC [1]
Habbema, JDF [1]
Affiliations
[1] Erasmus MC, Dept Publ Hlth, NL-3000 DR Rotterdam, Netherlands
Keywords
external validation; performance; prediction models; sample size; simulations
DOI
10.1016/j.jclinepi.2004.06.017
Chinese Library Classification (CLC) number
R19 [Health care organizations and services (health services administration)]
Abstract
Background and Objectives: The performance of a prediction model is usually worse in external validation data compared to the development data. We aimed to determine at which effective sample sizes (i.e., number of events) relevant differences in model performance can be detected with adequate power.
Methods: We used a logistic regression model to predict the probability that residual masses of patients treated for metastatic testicular cancer contained only benign tissue. We performed standard power calculations and Monte Carlo simulations to estimate the numbers of events that are required to detect several types of model invalidity with 80% power at the 5% significance level.
Results: A validation sample with 111 events was required to detect that a model predicted too high probabilities, when predictions were on average 1.5 times too high on the odds scale. A decrease in discriminative ability of the model, indicated by a decrease in the c-statistic from 0.83 to 0.73, required 81 to 106 events, depending on the specific scenario.
Conclusion: We suggest a minimum of 100 events and 100 nonevents for external validation samples. Specific hypotheses may, however, require substantially higher effective sample sizes to obtain adequate power.
© 2005 Elsevier Inc. All rights reserved.
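The Monte Carlo approach described in the Methods can be illustrated with a minimal sketch. It is not the authors' code or data. It assumes a hypothetical validation population whose predicted log-odds are normally distributed (`lp_mean`, `lp_sd`), generates outcomes from true odds that are lower than predicted by a factor `odds_ratio` (1.5 in the abstract's calibration scenario), and tests the calibration intercept with a Wald test at the 5% level. The choice of test, the distribution of predictions, and all function names are assumptions made for illustration; the paper's own procedure may differ.

```python
import math
import random


def expit(x):
    """Numerically stable inverse logit."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def calibration_intercept(lp, y, max_iter=25, tol=1e-8):
    """Fit a in logit(P(y=1)) = a + lp (lp as fixed offset) by Newton-Raphson.

    Returns the estimate of a and its approximate standard error.
    """
    a = 0.0
    info = 0.0
    for _ in range(max_iter):
        p = [expit(a + v) for v in lp]
        score = sum(yi - pi for yi, pi in zip(y, p))   # first derivative
        info = sum(pi * (1.0 - pi) for pi in p)        # Fisher information
        if info < 1e-12:                               # degenerate sample
            break
        step = score / info
        a += step
        if abs(step) < tol:
            break
    se = 1.0 / math.sqrt(info) if info >= 1e-12 else float("inf")
    return a, se


def estimate_power(n, odds_ratio=1.5, n_sim=200,
                   lp_mean=0.0, lp_sd=1.5, seed=1):
    """Share of simulated validation samples of size n in which the
    calibration intercept differs from 0 (two-sided Wald test, 5% level).

    lp_mean and lp_sd describe a hypothetical distribution of predicted
    log-odds; they are illustrative assumptions, not values from the paper.
    """
    rng = random.Random(seed)
    shift = math.log(odds_ratio)  # predicted odds = odds_ratio * true odds
    rejections = 0
    for _ in range(n_sim):
        lp = [rng.gauss(lp_mean, lp_sd) for _ in range(n)]
        y = [1 if rng.random() < expit(v - shift) else 0 for v in lp]
        a, se = calibration_intercept(lp, y)
        if abs(a / se) > 1.96:
            rejections += 1
    return rejections / n_sim


if __name__ == "__main__":
    for n in (40, 100, 200, 400):
        print(n, estimate_power(n))
```

Setting `odds_ratio=1.0` simulates a perfectly calibrated model, so the returned rejection rate should then sit near the nominal 5% level. Repeating the run over a grid of `n` and recording the number of events per sample gives a rough picture of how power grows with effective sample size under these assumed settings.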
Pages: 475-483
Number of pages: 9