Badapple: promiscuity patterns from noisy evidence

Cited by: 82
Authors
Yang, Jeremy J. [1]
Ursu, Oleg [1]
Lipinski, Christopher A. [2]
Sklar, Larry A. [3]
Oprea, Tudor I. [1]
Bologa, Cristian G. [1]
Affiliations
[1] Univ New Mexico, Sch Med, Dept Internal Med, Translat Informat Div, Albuquerque, NM 87131 USA
[2] 10 Connshire Dr, Waterford, CT 06385 USA
[3] Univ New Mexico, Sch Med, Dept Pathol, Ctr Mol Discovery, Albuquerque, NM 87131 USA
Source
JOURNAL OF CHEMINFORMATICS | 2016 / Vol. 8
Funding
National Institutes of Health (USA);
Keywords
Drug discovery informatics; High-throughput screening (HTS); Compound promiscuity; Molecular scaffolds; Statistical learning; REACTIVE COMPOUNDS; FALSE POSITIVES; DRUG DISCOVERY; LIBRARIES; IDENTIFICATION; MOLECULES; BIOLOGY; DESIGN; LEADS; ASSAY;
DOI
10.1186/s13321-016-0137-3
Chinese Library Classification (CLC) number
O6 [Chemistry];
Discipline classification code
0703;
Abstract
Background: Bioassay data analysis continues to be an essential, routine, yet challenging task in modern drug discovery and chemical biology research. The challenge is to infer reliable knowledge from large, noisy data. Some aspects of this problem are general, with solutions informed by existing and emerging data science best practices. Other aspects are domain specific and rely on expertise in bioassay methodology and chemical biology. Testing compounds for biological activity requires complex and innovative methodology, producing results that vary widely in accuracy, precision, and information content. Hit selection criteria are optimized so that the overall probability of success in a project is maximized and resource-wasteful "false trails" are avoided. This "fail-early" approach is embraced in both pharmaceutical and academic drug discovery, since follow-up capacity is resource-limited. Thus, early identification of likely promiscuous compounds has practical value.

Results: Here we describe an algorithm for identifying likely promiscuous compounds via associated scaffolds, which combines general and domain-specific features to assist and accelerate drug discovery informatics, called Badapple: bioassay-data associative promiscuity pattern learning engine. Results are described from an analysis using data from MLP assays via the BioAssay Research Database (BARD, http://bard.nih.gov). Specific examples are analyzed in the context of medicinal chemistry to illustrate associations with mechanisms of promiscuity. Badapple has been developed at UNM, released, and deployed for public use in two ways: (1) as a BARD plugin, integrated into the public BARD REST API and BARD web client; and (2) as a public web app hosted at UNM.

Conclusions: Badapple is a method for rapidly identifying likely promiscuous compounds via associated scaffolds. Badapple generates a score associated with a pragmatic, empirical definition of promiscuity, with the overall goal of identifying "false trails" and streamlining workflows. Unlike methods reliant on expert curation of chemical substructure patterns, Badapple is fully evidence-driven, automated, self-improving via integration of additional data, and focused on scaffolds. Badapple is robust with respect to noise and errors, and skeptical of scanty evidence.
Pages: 14