Rational combinatorial library design. 3. Simulated annealing guided evaluation (SAGE) of molecular diversity: A novel computational tool for universal library design and database mining

Cited by: 43
Authors
Zheng, WF
Cho, SJ
Waller, CL
Tropsha, A [1]
Affiliations
[1] Univ N Carolina, Sch Pharm, Div Med Chem, Lab Mol Modeling, Chapel Hill, NC 27599 USA
[2] OSI Pharmaceut Inc, Durham, NC 27707 USA
Source
JOURNAL OF CHEMICAL INFORMATION AND COMPUTER SCIENCES | 1999 / Vol. 39 / Issue 04
DOI
10.1021/ci980103p
Chinese Library Classification number
O6 [Chemistry];
Discipline classification code
0703;
Abstract
We have developed a novel method for molecular diversity sampling called SAGE (simulated annealing guided evaluation of molecular diversity). Compounds in chemical databases or virtual combinatorial libraries are conventionally represented as points in multidimensional descriptor space. The SAGE algorithm selects a desired number of optimally diverse points (compounds) from a database. The diversity of a subset of points is measured by a specially designed diversity function, and the most diverse subset is selected using simulated annealing (SA) as the optimization tool. Application of SAGE to two simulated data sets of randomly distributed points in two-dimensional space afforded diverse and representative selections as judged by visual inspection. SAGE was also applied, in comparison with random sampling, to two other simulated data sets with points distributed among many clusters. We found that SAGE sampling covered significantly more clusters than random sampling. By defining a fraction of data points as active, we also compared SAGE with random sampling in terms of hit rates. We showed that when the percentage of active points was low, the hit rates obtained by SAGE were always higher than those obtained by random sampling. When the percentage of active points was high, the performance of SAGE, in terms of individual hit rates, depended upon the data structure. However, in all cases, SAGE performed better than random sampling when cluster hit rates were used as the criterion.
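The selection step described in the abstract can be illustrated with a minimal sketch, assuming points in Euclidean descriptor space. The abstract does not define the "specially designed diversity function", so the `energy` function below (sum of inverse pairwise distances, lower meaning more diverse) is a hypothetical stand-in, not the SAGE function. The names `anneal_select`, `t_start`, `t_end`, `cooling`, and `moves_per_temp`, the single-swap move, and the geometric cooling schedule are likewise assumptions chosen for illustration.

```python
import math
import random


def energy(points, subset):
    """Hypothetical diversity penalty: sum of inverse pairwise distances.

    Lower values mean the selected points are more spread out. This is a
    stand-in; the abstract does not specify the actual diversity function.
    """
    total = 0.0
    for a in range(len(subset)):
        for b in range(a + 1, len(subset)):
            d = math.dist(points[subset[a]], points[subset[b]])
            total += 1.0 / (d + 1e-9)  # small offset avoids division by zero
    return total


def anneal_select(points, k, t_start=1.0, t_end=1e-3, cooling=0.95,
                  moves_per_temp=200, seed=0):
    """Pick k point indices by simulated annealing over single-swap moves."""
    if not 0 < k < len(points):
        raise ValueError("k must satisfy 0 < k < len(points)")
    rng = random.Random(seed)
    selected = rng.sample(range(len(points)), k)
    chosen = set(selected)
    pool = [i for i in range(len(points)) if i not in chosen]
    current = energy(points, selected)
    best, best_energy = list(selected), current
    t = t_start
    while t > t_end:
        for _ in range(moves_per_temp):
            # Propose swapping one selected point with one unselected point.
            i = rng.randrange(k)
            j = rng.randrange(len(pool))
            selected[i], pool[j] = pool[j], selected[i]
            candidate = energy(points, selected)
            delta = candidate - current
            # Metropolis rule: always accept improvements, sometimes accept
            # worse subsets so the search can leave local optima.
            if delta <= 0 or rng.random() < math.exp(-delta / t):
                current = candidate
                if current < best_energy:
                    best, best_energy = list(selected), current
            else:
                selected[i], pool[j] = pool[j], selected[i]  # undo the swap
        t *= cooling  # geometric cooling (assumed schedule)
    return best, best_energy
```

Calling `anneal_select(points, k)` on a list of coordinate tuples returns the indices of the chosen subset and its penalty. A random baseline for comparison, as in the abstract's experiments, would simply be `random.sample(range(len(points)), k)`.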
Pages: 738-746
Page count: 9