Type 2 Machine Learning: An Effective Hybrid Prediction Model for Early Type 2 Diabetes Detection

Cited by: 15
Authors
Albahli, Saleh [1]
Affiliation
[1] Qassim Univ, Coll Comp, Dept Informat Technol, Buraydah, Saudi Arabia
Keywords
Predictive Model; Machine Learning; Classification; Logistic Regression; K-Means; Feature Selection; Diabetes Identification;
DOI
10.1166/jmihi.2020.3000
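Chinese Library Classification number
Q [Biological Sciences];
Discipline classification code
090105 [Crop Production Systems and Ecological Engineering];
Abstract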
Importance: Diabetes is a chronic disease that can cause long-term damage to various parts of the body. To prevent diabetic complications, various attempts have been made to integrate machine learning with medicine and build models that predict whether a patient has diabetes, but prediction of this disease still has room for improvement. Hybrid prediction models offer a novel approach and in most cases achieve better outcomes than single classical machine learning algorithms. Objective: To develop a high-accuracy model for predicting different onsets of type 2 diabetes. To this end, the integration of clustering and classification techniques is improved to help detect diabetes at an earlier stage without deleting observations with missing values, and to discard insignificant features so that only the most relevant features are retained during data collection. Methods: We implement a noise-reduction technique based on K-means clustering, followed by Random Forest and XGBoost classifiers, to extract the unknown hidden features of the dataset and to obtain more accurate results. Results: Prediction accuracy is assessed by benchmarking our model against up-to-date predictive models and common classification algorithms. With an accuracy of 97.53% under 10-fold cross-validation, our T2ML model outperforms both the results reported by other researchers in the literature and various conventional classification algorithms.
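The abstract does not spell out how the K-means step removes noise. As a minimal sketch of one common reading (an assumption, not necessarily the author's procedure), the records can be clustered with k = 2, each cluster mapped to its majority class, and any record whose label disagrees with its cluster's majority dropped before classification. The function names `kmeans` and `filter_noise`, the farthest-first initialisation, and the synthetic two-feature data are all hypothetical and chosen only for illustration; the classifier stage is omitted to keep the example dependency-free.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iters=100):
    """Plain Lloyd's k-means; returns the cluster index of each point.

    Initialisation is deterministic farthest-first (an illustrative choice).
    """
    centroids = [list(points[0])]
    while len(centroids) < k:
        far = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(list(far))
    assign = None
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        new_assign = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            for p in points
        ]
        if new_assign == assign:
            break  # converged: assignments no longer change
        assign = new_assign
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign

def filter_noise(X, y, k=2):
    """Keep only rows whose label matches the majority label of their cluster.

    Assumed interpretation of 'K-means noise reduction'; not taken from the paper.
    """
    assign = kmeans(X, k)
    majority = {}
    for c in set(assign):
        labels = [lab for lab, a in zip(y, assign) if a == c]
        majority[c] = max(set(labels), key=labels.count)
    keep = [i for i, (lab, a) in enumerate(zip(y, assign)) if majority[a] == lab]
    return [X[i] for i in keep], [y[i] for i in keep]

# Synthetic data (not from the paper): two well-separated groups of 20 rows,
# with two labels flipped on purpose to play the role of noisy records.
rng = random.Random(42)
X = [[rng.gauss(0, 0.3), rng.gauss(0, 0.3)] for _ in range(20)]
X += [[rng.gauss(5, 0.3), rng.gauss(5, 0.3)] for _ in range(20)]
y = [0] * 20 + [1] * 20
y[3], y[25] = 1, 0

X_clean, y_clean = filter_noise(X, y)
print(len(X), "rows before filtering,", len(X_clean), "after")
```

In a pipeline of the kind the abstract describes, the surviving rows would then be passed to the Random Forest and XGBoost classifiers and evaluated with 10-fold cross-validation.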
Pages: 1069-1075
Page count: 7