Machine learning models predicting multidrug resistant urinary tract infections using "DSaaS"

Cited by: 32
Authors
Mancini, Alessio [1, 2]
Vito, Leonardo [1, 3]
Marcelli, Elisa [4]
Piangerelli, Marco [3, 4]
De Leone, Renato [4]
Pucciarelli, Sandra [1]
Merelli, Emanuela [3]
Affiliations
[1] Univ Camerino, Sch Biosci & Vet Med, Camerino, Italy
[2] ASUR Marche AV2, Operat Unit Clin Pathol, Senigallia, Italy
[3] Univ Camerino, Sch Sci & Technol, Comp Sci Div, Camerino, Italy
[4] Univ Camerino, Sch Sci & Technol, Math Div, Camerino, Italy
Funding
EU Horizon 2020
Keywords
Machine learning; Classification; Regression; Data science pipeline; Antibiotic stewardship; Multi drug resistance; Nosocomial infection; EPIDEMIOLOGY; MANAGEMENT; GUIDELINES
DOI
10.1186/s12859-020-03566-7
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification code
070307 [Chemical Biology]
Abstract
Background: The aim of this work is to build a machine learning model able to predict a patient's risk of contracting a multidrug-resistant urinary tract infection (MDR UTI) after hospitalization. To achieve this goal, we used several popular machine learning tools. We also integrated an easy-to-use cloud platform called DSaaS (Data Science as a Service), which is well suited to hospital settings, where healthcare operators may lack specific programming skills but still need to analyze data as a continuous process. In addition, DSaaS allows the validation of data analysis models based on supervised machine learning regression and classification algorithms.
Results: We used DSaaS on a real antibiotic stewardship dataset to make predictions about antibiotic resistance in the Clinical Pathology Operative Unit of the Principe di Piemonte Hospital in Senigallia, Marche, Italy. The data relate to a total of 1486 hospitalized patients with nosocomial urinary tract infection (UTI). Sex, age, age class, ward and time period were used to predict the onset of an MDR UTI. Machine learning methods such as Catboost, Support Vector Machine and Neural Networks were used to build predictive models. Among the performance evaluators already implemented in DSaaS, we used accuracy (ACC), area under the receiver operating characteristic curve (AUC-ROC), area under the Precision-Recall curve (AUC-PRC), F1 score, sensitivity (SEN), specificity and Matthews correlation coefficient (MCC). Catboost exhibited the best predictive results (MCC 0.909; SEN 0.904; F1 score 0.809; AUC-PRC 0.853; AUC-ROC 0.739; ACC 0.717), with the highest value in every metric.
Conclusions: The predictive model built with DSaaS may serve as a useful support tool for physicians treating hospitalized patients at high risk of acquiring MDR UTIs. We obtained these results using only five predictors that are easy and fast to obtain for each patient hospitalization. In future, DSaaS will be enriched with further features such as unsupervised machine learning techniques, streaming data analysis, distributed computation, and big data storage and management, to allow researchers to perform a complete data analysis pipeline. The DSaaS prototype is available as a demo at the following address: https://dsaas-demo.shinyapps.io/Server/
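For readers unfamiliar with the threshold-based evaluators named in the abstract (ACC, SEN, specificity, F1 score, MCC), the following is a minimal sketch of how they follow from a binary confusion matrix. It is not the DSaaS implementation: the function name `binary_metrics` and the example counts are hypothetical and chosen only for illustration, not taken from the study. AUC-ROC and AUC-PRC are omitted because they require per-patient prediction scores rather than counts.

```python
from math import sqrt


def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute threshold-based metrics from confusion-matrix counts.

    tp/fp/tn/fn are true positives, false positives, true negatives
    and false negatives. Undefined ratios (zero denominator) are
    reported as 0.0, which is a convention assumed for this sketch.
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # recall
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    pr_sum = precision + sensitivity
    f1 = 2 * precision * sensitivity / pr_sum if pr_sum else 0.0
    # Matthews correlation coefficient uses all four cells of the matrix.
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "ACC": accuracy,
        "SEN": sensitivity,
        "SPE": specificity,
        "F1": f1,
        "MCC": mcc,
    }


# Hypothetical counts for illustration only (not data from the paper).
metrics = binary_metrics(tp=40, fp=10, tn=45, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Because MCC draws on all four cells of the confusion matrix, it is less sensitive to class imbalance than accuracy or F1, which is one reason it is often reported alongside them.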
Pages: 12