Identification of immune correlates of protection in Shigella infection by application of machine learning

Cited by: 22
Authors
Arevalillo, Jorge M. [1]
Sztein, Marcelo B. [2, 3]
Kotloff, Karen L. [2, 3]
Levine, Myron M. [2, 3]
Simon, Jakub K. [4]
Affiliations
[1] Univ Nacl Educ Distancia, Dept Stat & Operat Res, Paseo Senda del Rey 9, E-28040 Madrid, Spain
[2] Univ Maryland, Sch Med, Ctr Vaccine Dev, Dept Pediat, Baltimore, MD 21201 USA
[3] Univ Maryland, Sch Med, Ctr Vaccine Dev, Dept Med, Baltimore, MD 21201 USA
[4] Merck & Co Inc, Kenilworth, NJ USA
Funding
National Institutes of Health (USA)
Keywords
Classification and Regression Trees; Random Forests algorithm; Logistic regression; Correlate of protection; Shigella; CLASSIFICATION TREE; RANDOM FORESTS; VACCINE; VIRUS; MODEL; ANTIBODY; IMMUNOGENICITY; STABILITY; EFFICACY; IMPROVE;
DOI
10.1016/j.jbi.2017.08.005
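Chinese Library Classification number
TP39 [Computer applications]
Discipline classification code
080201 [Mechanical manufacturing and automation]
Abstract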
Background: Immunologic correlates of protection are important in vaccine development because they give insight into mechanisms of protection, assist in the identification of promising vaccine candidates, and serve as endpoints in bridging clinical vaccine studies. Our goal is the development of a methodology to identify immunologic correlates of protection using the Shigella challenge as a model.
Methods: The proposed methodology utilizes the Random Forests (RF) machine learning algorithm as well as Classification and Regression Trees (CART) to detect immune markers that predict protection, identify interactions between variables, and define optimal cutoffs. Logistic regression modeling is applied to estimate the probability of protection, and the confidence interval (CI) for that probability is computed by bootstrapping the logistic regression models.
Results: The results demonstrate that the combination of Classification and Regression Trees and Random Forests complements the standard logistic regression and uncovers subtle immune interactions. Specific levels of immunoglobulin IgG antibody in blood on the day of challenge predicted protection in 75% (95% CI 67-86). Of those subjects that did not have blood IgG at or above a defined threshold, 100% were protected if they had IgA antibody secreting cells above a defined threshold. Comparison with the results obtained by applying only logistic regression modeling with standard Akaike Information Criterion for model selection shows the usefulness of the proposed method.
Conclusion: Given the complexity of the immune system, the use of machine learning methods may enhance traditional statistical approaches. When applied together, they offer a novel way to quantify important immune correlates of protection that may help the development of vaccines.
© 2017 Elsevier Inc. All rights reserved.
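The Methods paragraph of the abstract mentions two steps that a small example may make more concrete: choosing a cutoff for an immune marker with a tree-style split, and attaching a bootstrap confidence interval to the probability of protection. The sketch below is only an illustration under stated assumptions, not the authors' implementation. It uses a single Gini-impurity split (a one-level stand-in for CART) and a percentile bootstrap of the observed proportion protected at or above the cutoff, which is a simplification of bootstrapping fitted logistic regression models. The function names (`best_cutoff`, `bootstrap_ci`) and the titer data are hypothetical and made up for this example.

```python
import random


def gini(labels):
    """Gini impurity of a list of 0/1 labels (0.0 for an empty list)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def best_cutoff(values, labels):
    """One-level CART-style split: threshold minimising weighted Gini impurity.

    Candidate thresholds are midpoints between consecutive distinct values.
    """
    pairs = list(zip(values, labels))
    xs = sorted(set(values))
    best_t, best_score = None, float("inf")
    for lo, hi in zip(xs, xs[1:]):
        t = (lo + hi) / 2.0
        left = [y for x, y in pairs if x < t]
        right = [y for x, y in pairs if x >= t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(pairs)
        if score < best_score:
            best_t, best_score = t, score
    return best_t


def bootstrap_ci(values, labels, cutoff, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the proportion protected at or above cutoff."""
    rng = random.Random(seed)
    n = len(values)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # resample subjects with replacement
        above = [labels[i] for i in idx if values[i] >= cutoff]
        if above:  # skip resamples with nobody at or above the cutoff
            stats.append(sum(above) / len(above))
    if not stats:
        raise ValueError("no bootstrap resample contained subjects above the cutoff")
    stats.sort()
    lo = stats[int((alpha / 2) * len(stats))]
    hi = stats[min(len(stats) - 1, int((1 - alpha / 2) * len(stats)))]
    return lo, hi


# Synthetic, made-up data: marker level per subject and 1 = protected, 0 = not protected.
titers = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120]
protected = [0, 0, 0, 1, 0, 0, 1, 1, 1, 0, 1, 1]

cutoff = best_cutoff(titers, protected)
above = [y for x, y in zip(titers, protected) if x >= cutoff]
point = sum(above) / len(above)
low, high = bootstrap_ci(titers, protected, cutoff)
print(f"cutoff={cutoff}, protected above cutoff={point:.2f}, 95% CI=({low:.2f}, {high:.2f})")
```

In this sketch the cutoff is fixed before resampling. Re-estimating the cutoff inside each bootstrap replicate would also reflect the uncertainty of the split itself, at the cost of a wider interval.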
Pages: 1-9
Page count: 9