Predicting hospital mortality for patients in the intensive care unit: A comparison of artificial neural networks with logistic regression models

Cited by: 112
Authors
Clermont, G [1]
Angus, DC
DiRusso, SM
Griffin, M
Linde-Zwirble, WT
Affiliations
[1] Univ Pittsburgh, Dept Anesthesiol & Crit Care Med, Crit Care Med Div, Pittsburgh, PA 15260 USA
[2] Univ Pittsburgh, Ctr Res Hlth Care, Pittsburgh, PA USA
[3] New York Med Coll, Dept Surg, Valhalla, NY 10595 USA
[4] Hlth Proc Management, Doylestown, PA USA
Keywords
intensive care unit; mortality; Acute Physiology and Chronic Health Evaluation; outcome prediction; modeling; severity scoring; artificial neural network; goodness-of-fit; discrimination; calibration;
DOI
10.1097/00003246-200102000-00012
Chinese Library Classification (CLC) number
R4 [Clinical Medicine];
Subject classification code
1002; 100602;
Abstract
Objective: Logistic regression (LR), commonly used for hospital mortality prediction, has limitations. Artificial neural networks (ANNs) have been proposed as an alternative. We compared the performance of these approaches by using stepwise reductions in sample size.
Design: Prospective cohort study.
Setting: Seven intensive care units (ICUs) at one tertiary care center.
Patients: Patients were 1,647 ICU admissions for whom first-day Acute Physiology and Chronic Health Evaluation III variables were collected.
Interventions: None.
Measurements and Main Results: We constructed LR and ANN models on a random set of 1,200 admissions (development set) and used the remaining 447 as the validation set. We repeated model construction on progressively smaller development sets (800, 400, and 200 admissions) and retested on the original validation set (n = 447). For each development set, we constructed models from two LR and two ANN architectures, organizing the independent variables differently. With the 1,200-admission development set, all models had good fit and discrimination on the validation set, where fit was assessed by the Hosmer-Lemeshow C statistic (range, 10.6-15.3; p ≥ .05) and standardized mortality ratio (SMR) (range, 0.93 [95% confidence interval, 0.79-1.15] to 1.09 [95% confidence interval, 0.89-1.38]), and discrimination was assessed by the area under the receiver operating characteristic curve (range, 0.80-0.84). As development set sample size decreased, model performance on the validation set deteriorated rapidly, although the ANNs retained marginally better fit at 800 (best C statistic was 26.3 [p = .0009] and 13.1 [p = .11] for the LR and ANN models, respectively). Below 800, fit was poor with both approaches, with high C statistics ranging from 22.8 [p < .004] to 633 [p < .0001] and highly biased SMRs (seven of the eight models below 800 had SMRs of <0.85, with an upper confidence interval of <1). Discrimination ranged from 0.74 to 0.84 below 800.
Conclusions: When sample size is adequate, LR and ANN models have similar performance. However, development sets of ≤800 were generally inadequate. This is concerning, given typical sample sizes used for individual ICU mortality prediction.
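The three validation measures named in the abstract (Hosmer-Lemeshow C statistic, SMR, and area under the receiver operating characteristic curve) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the equal-size risk grouping, and the toy data are assumptions chosen for illustration, and the p value for the C statistic (which requires a chi-square distribution) is deliberately left out.

```python
# Illustrative sketch only; function names and grouping scheme are assumptions,
# not taken from the paper.

def auc(y_true, y_prob):
    """Area under the ROC curve as the probability that a randomly chosen
    death receives a higher predicted risk than a randomly chosen survivor
    (ties count as one half)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def smr(y_true, y_prob):
    """Standardized mortality ratio: observed deaths / expected deaths,
    where expected deaths is the sum of predicted probabilities."""
    return sum(y_true) / sum(y_prob)


def hosmer_lemeshow_c(y_true, y_prob, groups=10):
    """Hosmer-Lemeshow C statistic: sort admissions by predicted risk, split
    them into equal-size groups, and sum (O - E)^2 / (n * p * (1 - p)) over
    groups, where p is the mean predicted risk in the group."""
    pairs = sorted(zip(y_prob, y_true))
    n = len(pairs)
    stat = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        n_g = len(chunk)
        expected = sum(p for p, _ in chunk)
        observed = sum(y for _, y in chunk)
        mean_p = expected / n_g
        denom = n_g * mean_p * (1.0 - mean_p)
        if denom > 0:
            stat += (observed - expected) ** 2 / denom
    return stat


# Toy validation set (hypothetical values): 1 = died in hospital, 0 = survived.
outcomes = [0, 0, 1, 1]
risks = [0.1, 0.4, 0.35, 0.8]
print(auc(outcomes, risks))
print(smr(outcomes, risks))
print(hosmer_lemeshow_c(outcomes, risks, groups=2))
```

In this framing, an SMR near 1 and a small C statistic indicate good fit (calibration), while the area under the curve measures discrimination independently of calibration, which is why the abstract reports them separately.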
Pages: 291-296
Page count: 6