Computational methods for prediction of in vitro effects of new chemical structures

Cited by: 41
Authors
Banerjee, Priyanka [1, 3]
Siramshetty, Vishal B. [2, 4]
Drwal, Malgorzata N. [1, 5]
Preissner, Robert [1, 2, 4]
Affiliations
[1] Charite Univ Med Berlin, Inst Physiol, Struct Bioinformat Grp, Berlin, Germany
[2] Charite Univ Med Berlin, ECRC, Struct Bioinformat Grp, Berlin, Germany
[3] Humboldt Univ, Grad Sch Computat Syst Biol, Berlin, Germany
[4] Free Univ Berlin, Berlin Brandenburg Grad Sch BB3R 3R, Berlin, Germany
[5] Univ Strasbourg, Lab Innovat Therapeut, Illkirch Graffenstaden, France
Keywords
Similarity searching; Machine learning; Toxicity prediction; Tox21 challenge; Molecular fingerprints; DRUG DISCOVERY; TOXICITY; MUTAGENICITY; COMBINATION; TOXICOLOGY
DOI
10.1186/s13321-016-0162-2
Chinese Library Classification number
O6 [Chemistry]
Discipline classification code
070301 [Inorganic Chemistry]
Abstract
Background: With a constant increase in the number of new chemicals synthesized every year, it becomes important to employ the most reliable and fast in silico screening methods to predict their safety and activity profiles. In recent years, in silico prediction methods have received great attention as a way to reduce animal experiments for the evaluation of various toxicological endpoints, complementing the theme of replace, reduce and refine. Various computational approaches have been proposed for the prediction of compound toxicity, ranging from quantitative structure-activity relationship modeling to molecular similarity-based methods and machine learning. Within the "Toxicology in the 21st Century" screening initiative, a crowd-sourcing platform was established for the development and validation of computational models to predict the interference of chemical compounds with nuclear receptor and stress response pathways, based on a training set containing more than 10,000 compounds tested in high-throughput screening assays.
Results: Here, we present the results of various molecular similarity-based and machine-learning-based methods on an independent evaluation set of 647 compounds provided by the Tox21 Data Challenge 2014. The Random Forest approach based on MACCS molecular fingerprints and a subset of 13 molecular descriptors, selected through statistical and literature analysis, performed best in terms of the area under the receiver operating characteristic curve. Further, we compared the individual and combined performance of different methods. In retrospect, we also discuss why an ensemble approach that combines a similarity search method with the Random Forest algorithm outperforms the individual methods, and we explain the intrinsic limitations of the latter.
Conclusions: Our results suggest that, although prediction methods were optimized individually for each modelled target, an ensemble of similarity and machine-learning approaches provides promising performance, indicating its broad applicability in toxicity prediction.
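The ideas named in the abstract (similarity searching on molecular fingerprints, an ensemble of a similarity score with a machine-learning score, and evaluation by the area under the ROC curve) can be illustrated with a minimal sketch. It is not the authors' implementation. Fingerprints are assumed to be plain sets of "on" bit indices rather than real MACCS keys, the model scores stand in for the class probabilities of a classifier such as a Random Forest, and the averaging rule in `ensemble` is an assumption, since the abstract does not say how the two methods were combined. All function names and toy values are hypothetical.

```python
from typing import List, Sequence, Set


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices."""
    if not a and not b:
        return 0.0
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def similarity_score(query: Set[int], actives: Sequence[Set[int]]) -> float:
    """Nearest-neighbour score: highest similarity of the query to any known active."""
    return max((tanimoto(query, ref) for ref in actives), default=0.0)


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (active, inactive) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one active and one inactive compound")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def ensemble(sim_scores: Sequence[float], model_scores: Sequence[float]) -> List[float]:
    """Assumed combination rule: unweighted mean of the two per-compound scores."""
    return [(s + m) / 2 for s, m in zip(sim_scores, model_scores)]


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    known_actives = [{1, 2, 3, 4}, {2, 3, 5}]
    queries = [{1, 2, 3}, {6, 7}, {2, 3, 5, 8}, {1, 9}]
    labels = [1, 0, 1, 0]
    model_scores = [0.6, 0.7, 0.9, 0.1]  # placeholder classifier probabilities

    sim_scores = [similarity_score(q, known_actives) for q in queries]
    print("similarity AUC:", roc_auc(labels, sim_scores))
    print("model AUC:     ", roc_auc(labels, model_scores))
    print("ensemble AUC:  ", roc_auc(labels, ensemble(sim_scores, model_scores)))
```

The pairwise AUC used here is quadratic in the number of compounds, which is acceptable for an evaluation set of a few hundred compounds; a rank-based formulation would scale better for larger sets.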
Pages: 1-11
Number of pages: 11