OptiSim: An extended dissimilarity selection method for finding diverse representative subsets

Cited by: 121
Authors
Clark, RD
Affiliation
[1] Tripos, Inc., St. Louis, MO 63144
Source
JOURNAL OF CHEMICAL INFORMATION AND COMPUTER SCIENCES | 1997, Vol. 37, No. 6
Keywords
molecular diversity; dissimilarity selection; clustering; combinatorial chemistry; compound selection; diversity analysis; chi square; representative
DOI
10.1021/ci970282v
Chinese Library Classification number
O6 [Chemistry]
Discipline classification code
0703
Abstract
Compound selection methods currently available to chemists are based on maximum or minimum dissimilarity selection or on hierarchical clustering. Optimizable K-Dissimilarity Selection (OptiSim) is a novel and efficient stochastic selection algorithm which includes maximum and minimum dissimilarity-based selection as special cases. By adjusting the subsample size parameter K, it is possible to adjust the balance between representativeness and diversity in the compounds selected. The OptiSim algorithm is described, along with some analytical tools for comparing it to other selection methods. Such comparisons indicate that OptiSim can mimic the representativeness of selections based on hierarchical clustering and, at least in some cases, improve upon them.
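The abstract does not spell out the individual steps, so the following is only a minimal sketch of how a K-subsample dissimilarity selection of this kind might look, assuming the commonly described loop: draw up to K candidates, keep the one farthest from the compounds already selected, and recycle the rest. The function name `optisim_sketch`, the `radius` exclusion threshold, and the use of Euclidean distance in place of a molecular dissimilarity measure are all illustrative assumptions, not details taken from the paper.

```python
import random


def optisim_sketch(points, n_select, k, radius=0.0, seed=0):
    """Pick up to n_select indices with a simplified K-subsample loop (assumed)."""
    rng = random.Random(seed)

    def dist(a, b):
        # Euclidean distance stands in for a molecular dissimilarity measure.
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    pool = list(range(len(points)))
    rng.shuffle(pool)
    selected = [pool.pop()]  # arbitrary starting compound
    recycle = []

    while len(selected) < n_select and (pool or recycle):
        if not pool:
            # Candidates that lost an earlier round get another chance.
            pool, recycle = recycle, []
            rng.shuffle(pool)
        # Draw up to k candidates, discarding any within `radius` of a pick.
        subsample = []
        while pool and len(subsample) < k:
            c = pool.pop()
            score = min(dist(points[c], points[s]) for s in selected)
            if score > radius:
                subsample.append((score, c))
        if not subsample:
            continue
        # Keep the candidate farthest from everything already selected.
        best = max(subsample)
        selected.append(best[1])
        recycle.extend(c for _, c in subsample if c != best[1])
    return selected
```

In this sketch, setting `k` to the size of the data set makes every round consider the whole pool, which behaves like maximum dissimilarity selection, while `k=1` accepts the first random candidate outside the exclusion radius, which behaves like minimum dissimilarity selection. Intermediate values trade diversity against representativeness, as the abstract describes.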
Pages: 1181-1188
Page count: 8