Quantifying structure and performance diversity for sets of small molecules comprising small-molecule screening collections

Cited by: 78
Authors
Clemons, Paul A. [1]
Wilson, J. Anthony [1]
Dancik, Vlado [1]
Muller, Sandrine [1]
Carrinski, Hyman A. [1]
Wagner, Bridget K. [1]
Koehler, Angela N. [1]
Schreiber, Stuart L. [1, 2, 3]
Affiliations
[1] Broad Inst Harvard & MIT, Cambridge, MA 02142 USA
[2] MIT, Howard Hughes Med Inst, Cambridge, MA 02142 USA
[3] Harvard Univ, Dept Chem & Chem Biol, Cambridge, MA 02138 USA
Funding
National Institutes of Health (USA);
Keywords
DIFFERENTIAL SHANNON ENTROPY; STATISTICAL LEARNING-METHODS; DRUG-LIKE; NATURAL-PRODUCTS; DATA-FUSION; SKELETAL DIVERSITY; COMPOUND DATABASES; LIBRARY DESIGN; COMBINATORIAL LIBRARIES; SUBSTITUENT CONSTANTS;
DOI
10.1073/pnas.1015024108
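Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification code
07; 0710; 09;
Abstract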
Using a diverse collection of small molecules we recently found that compound sets from different sources (commercial; academic; natural) have different protein-binding behaviors, and these behaviors correlate with trends in stereochemical complexity for these compound sets. These results lend insight into structural features that synthetic chemists might target when synthesizing screening collections for biological discovery. We report extensive characterization of structural properties and diversity of biological performance for these compounds and expand comparative analyses to include physicochemical properties and three-dimensional shapes of predicted conformers. The results highlight additional similarities and differences between the sets, but also the dependence of such comparisons on the choice of molecular descriptors. Using a protein-binding dataset, we introduce an information-theoretic measure to assess diversity of performance with a constraint on specificity. Rather than relying on finding individual active compounds, this measure allows rational judgment of compound subsets as groups. We also apply this measure to publicly available data from ChemBank for the same compound sets across a diverse group of functional assays. We find that performance diversity of compound sets is relatively stable across a range of property values as judged by this measure, both in protein-binding studies and functional assays. Because building screening collections with improved performance depends on efficient use of synthetic organic chemistry resources, these studies illustrate an important quantitative framework to help prioritize choices made in building such collections.
Pages: 6817-6822
Page count: 6
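
The abstract describes an information-theoretic measure that scores the performance diversity of a compound set as a group, with a constraint on specificity. This record does not give the formula, so the following is only a minimal sketch under stated assumptions, not the authors' method. It assumes that each compound maps to the set of proteins it binds, that "specific" means binding at most `max_targets` proteins, and that diversity is the Shannon entropy (in bits) of how the hits from specific compounds are spread across proteins. The function name `performance_entropy`, the parameter `max_targets`, and the toy data are hypothetical.

```python
import math
from collections import Counter


def performance_entropy(hits, max_targets=3):
    """Shannon entropy (bits) of hits across proteins for one compound set.

    hits: dict mapping compound id -> set of protein ids it binds.
    max_targets: assumed specificity cutoff; compounds binding more
        proteins than this (or none) are ignored. This cutoff is an
        illustrative assumption, not a value taken from the paper.
    """
    counts = Counter()
    for proteins in hits.values():
        # Keep only compounds that are active but not promiscuous.
        if 1 <= len(proteins) <= max_targets:
            counts.update(proteins)

    total = sum(counts.values())
    if total == 0:
        return 0.0  # no specific hits: no measurable diversity

    # H = -sum(p * log2(p)) over the per-protein hit fractions.
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Toy data (hypothetical): compound "c" binds four proteins and is
# excluded by the specificity cutoff of three.
toy_hits = {
    "a": {"P1"},
    "b": {"P2"},
    "c": {"P1", "P2", "P3", "P4"},
}
entropy_bits = performance_entropy(toy_hits, max_targets=3)
```

In this sketch, a set whose specific hits all fall on one protein scores zero, while hits spread evenly over many proteins score higher, which is one way to judge a subset as a group rather than by individual actives.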