Decision tree classifiers for automated medical diagnosis

Cited by: 150
Authors
Azar, Ahmad Taher [1]
El-Metwally, Shereen M. [2]
Affiliations
[1] MUST, Fac Engn, 6th Of October City, Egypt
[2] Cairo Univ, Syst & Biomed Engn Dept, Giza, Egypt
Keywords
Computer-aided diagnosis (CAD); Decision support systems (DSS); Decision tree classification; Single decision tree; Boosted decision tree; Decision tree forest; k-fold cross-validation; COMPUTER-AIDED DIAGNOSIS; SCREEN-FILM MAMMOGRAPHY; FIELD DIGITAL MAMMOGRAPHY; BREAST-CANCER; STATISTICS; ACCURACY; SURVIVAL; CURVE;
DOI
10.1007/s00521-012-1196-7
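Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
140502 [Artificial intelligence];
Abstract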
Decision support systems assist physicians and play an important role in medical decision-making. They are based on different models, and the best of them provide an explanation together with an accurate, reliable and quick response. This paper presents a decision support tool for the detection of breast cancer based on three types of decision tree classifiers: single decision tree (SDT), boosted decision tree (BDT) and decision tree forest (DTF). Decision tree classification provides a rapid and effective method of categorizing data sets. Decision-making is performed in two stages: training the classifiers with features from the Wisconsin breast cancer data set, and then testing. The performance of the proposed structure is evaluated in terms of accuracy, sensitivity, specificity, confusion matrix and receiver operating characteristic (ROC) curves. In the training phase, SDT reached an overall accuracy of 97.07% (429 correct classifications) and BDT reached 98.83% (437 correct classifications). BDT outperformed SDT on all performance indices. The ROC value and Matthews correlation coefficient (MCC) for BDT in the training phase were 0.99971 and 0.9746, respectively, both superior to those of the SDT classifier. During the validation phase, DTF achieved an accuracy of 97.51%, which was superior to SDT (95.75%) and BDT (97.07%). The ROC value and MCC for DTF were 0.99382 and 0.9462, respectively. BDT showed the best performance in terms of sensitivity, and SDT was the best only in terms of speed.
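The scalar metrics named in the abstract (accuracy, sensitivity, specificity and MCC) all follow from the four cells of a binary confusion matrix. The sketch below shows the standard textbook definitions in plain Python. It is not the authors' implementation: the function names, the `ConfusionMatrix` type and the toy labels are all hypothetical, and treating label 1 as the positive (malignant) class is an assumption made only for illustration.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionMatrix:
    """Counts for a binary classifier (hypothetical helper type)."""
    tp: int  # positives predicted positive
    fn: int  # positives predicted negative
    fp: int  # negatives predicted positive
    tn: int  # negatives predicted negative


def confusion_matrix(y_true, y_pred, positive=1):
    """Tally the four cells; `positive` marks the class of interest (assumed 1)."""
    tp = fn = fp = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1
            else:
                fn += 1
        else:
            if pred == positive:
                fp += 1
            else:
                tn += 1
    return ConfusionMatrix(tp, fn, fp, tn)


def accuracy(cm):
    """Fraction of all cases classified correctly."""
    return (cm.tp + cm.tn) / (cm.tp + cm.fn + cm.fp + cm.tn)


def sensitivity(cm):
    """True positive rate; assumes at least one positive case exists."""
    return cm.tp / (cm.tp + cm.fn)


def specificity(cm):
    """True negative rate; assumes at least one negative case exists."""
    return cm.tn / (cm.tn + cm.fp)


def mcc(cm):
    """Matthews correlation coefficient; returns 0.0 when the denominator is zero."""
    numerator = cm.tp * cm.tn - cm.fp * cm.fn
    denominator = math.sqrt(
        (cm.tp + cm.fp) * (cm.tp + cm.fn) * (cm.tn + cm.fp) * (cm.tn + cm.fn)
    )
    return numerator / denominator if denominator else 0.0


# Toy labels invented for illustration; they are not taken from the paper.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
cm = confusion_matrix(y_true, y_pred)
print(cm, accuracy(cm), sensitivity(cm), specificity(cm), mcc(cm))
```

MCC uses all four cells, so it stays informative when the two classes are unbalanced, which may be why the abstract reports it alongside accuracy.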
Pages: 2387-2403
Page count: 17