Predicting the cost of illness: A comparison of alternative models applied to stroke

Cited by: 77
Authors
Lipscomb, J
Ancukiewicz, M
Parmigiani, G
Hasselblad, V
Samsa, G
Matchar, DB
Affiliations
[1] Duke Univ, Sanford Inst Publ Policy, Durham, NC 27708 USA
[2] Duke Univ, Ctr Clin Hlth Policy Res, Durham, NC USA
[3] Duke Univ, Dept Community & Family Med, Durham, NC USA
[4] Duke Univ, Ctr Hlth Policy Law & Management, Durham, NC USA
[5] Harvard Univ, Sch Med, Dept Radiat Oncol, Boston, MA USA
[6] Duke Univ, Inst Stat & Decis Sci, Durham, NC 27706 USA
[7] Duke Univ, Dept Med, Durham, NC USA
[8] VAMC, Ctr Hlth Serv Res Primary Care, Durham, NC USA
Keywords
cost analysis; cost of illness; statistical models; econometric models; stroke; cerebrovascular disease;
DOI
10.1177/0272989X98018002S07
Chinese Library Classification (CLC) number
R19 [Health care organization and services (health services administration)];
Abstract
Predictions of cost over well-defined time horizons are frequently required in the analysis of clinical trials and social experiments, for decision models investigating the cost-effectiveness of interventions, and for macro-level estimates of the resource impact of disease. With rare exceptions, cost predictions used in such applications continue to take the form of deterministic point estimates. However, the growing availability of large administrative and clinical data sets offers new opportunities for a more general approach to disease cost forecasting: the estimation of multivariable cost functions that yield predictions at the individual level, conditional on intervention(s), patient characteristics, and other factors. This raises the fundamental question of how to choose the "best" cost model for a given application. The central purpose of this paper is to demonstrate how to evaluate competing models on the basis of predictive validity. This concept is operationalized according to three alternative criteria: 1) root mean square error (RMSE), for evaluating predicted mean cost; 2) mean absolute error (MAE), for evaluating predicted median cost; and 3) a logarithmic scoring rule (log score), an information-theoretic index for evaluating the entire predictive distribution of cost. To illustrate these concepts, the authors conducted a split-sample analysis of data from a national sample of Medicare-covered patients hospitalized for ischemic stroke in 1991 and followed to the end of 1993. Using test and training samples of about 500,000 observations each, they investigated five models: single-equation linear models, with and without log transform of cost; two-part (mixture) models, with and without log transform, to directly address the problem of zero-cost observations; and a Cox proportional-hazards model stratified by time interval. For deriving the predictive distribution of cost, the log transformed two-part and proportional-hazards models are superior. 
For deriving the predicted mean or median cost, these two models and the commonly used log-transformed linear model all perform about the same. The untransformed models are dominated in every instance. The approaches to model selection illustrated here can be applied across a wide range of settings.
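The three predictive-validity criteria named in the abstract can be sketched in a few lines. The following is a minimal illustration only, not the authors' implementation. It assumes a log-transformed two-part model in which each patient has a probability `p_pos` of incurring any cost and, given positive cost, a lognormal distribution with parameters `mu` and `sigma`. All function names and the toy numbers are hypothetical, and the log score is taken here as the mean log predictive probability or density on the test sample (higher is better), which may differ in sign or scaling from the paper's definition.

```python
import math
from statistics import NormalDist


def rmse(y, pred_mean):
    """Root mean square error of predicted mean costs."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y, pred_mean)) / len(y))


def mae(y, pred_median):
    """Mean absolute error of predicted median costs."""
    return sum(abs(a - b) for a, b in zip(y, pred_median)) / len(y)


def lognormal_pdf(y, mu, sigma):
    """Density of a lognormal distribution at y > 0."""
    z = (math.log(y) - mu) / sigma
    return math.exp(-0.5 * z * z) / (y * sigma * math.sqrt(2 * math.pi))


def two_part_log_density(y, p_pos, mu, sigma):
    """Log predictive probability (y == 0) or density (y > 0).

    Assumes 0 < p_pos < 1 whenever zero costs can occur.
    """
    if y == 0:
        return math.log(1 - p_pos)
    return math.log(p_pos) + math.log(lognormal_pdf(y, mu, sigma))


def two_part_mean(p_pos, mu, sigma):
    """Predicted mean: P(cost > 0) times the lognormal mean."""
    return p_pos * math.exp(mu + 0.5 * sigma ** 2)


def two_part_median(p_pos, mu, sigma):
    """Predicted median of the zero/lognormal mixture."""
    if p_pos <= 0.5:
        return 0.0  # at least half of the probability mass sits at zero
    # Solve (1 - p_pos) + p_pos * Phi(z) = 0.5 for z.
    z = NormalDist().inv_cdf((p_pos - 0.5) / p_pos)
    return math.exp(mu + sigma * z)


def log_score(y, params):
    """Mean log predictive density over the test sample."""
    return sum(two_part_log_density(c, *p) for c, p in zip(y, params)) / len(y)


if __name__ == "__main__":
    # Toy test sample: observed costs and hypothetical fitted parameters
    # (p_pos, mu, sigma) per patient. These numbers are invented.
    costs = [0.0, 1200.0, 8500.0, 300.0]
    params = [(0.7, 7.0, 1.0), (0.9, 7.5, 1.1), (0.95, 8.8, 0.9), (0.8, 6.0, 1.2)]

    means = [two_part_mean(*p) for p in params]
    medians = [two_part_median(*p) for p in params]

    print("RMSE (mean prediction):  ", rmse(costs, means))
    print("MAE (median prediction): ", mae(costs, medians))
    print("Log score (distribution):", log_score(costs, params))
```

In a split-sample design such as the one described, the parameters would be estimated on the training sample and the three criteria computed on the held-out test sample for each competing model.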
Pages: S39-S56
Page count: 18