Combinatorial preferences affect molecular similarity/diversity calculations using binary fingerprints and Tanimoto coefficients

Cited by: 106
Authors
Godden, JW
Xue, L
Bajorath, J
Affiliations
[1] New Chem Ent Inc, Bothell, WA 98011 USA
[2] Univ Washington, Dept Biol Struct, Seattle, WA 98195 USA
Source
JOURNAL OF CHEMICAL INFORMATION AND COMPUTER SCIENCES | 2000 / Vol. 40 / Issue 01
DOI
10.1021/ci990316u
Chinese Library Classification
O6 [Chemistry];
Discipline classification code
0703;
Abstract
A combinatorial method was developed to calculate complete distributions of the Tanimoto coefficient (Tc) for binary fingerprint (FP) representations of specified length, regardless of the chemical parameters they reflect. Theoretical Tc distributions were calculated for FPs consisting of up to 67 bit positions, which revealed significant statistical preferences for certain Tc values. Calculation of Tc distributions in a large compound database using different FPs mirrored the effects identified by our general analysis. On the basis of these findings, an average Tc is biased by statistically preferred values.
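The kind of enumeration the abstract describes can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes every bit pattern is equally likely, counts ordered fingerprint pairs, and skips the pair of all-zero fingerprints, for which Tc is undefined. The function name `tc_distribution` and its parameter `n_bits` are hypothetical. With c bits set in both fingerprints and d bits set in exactly one, Tc = c / (c + d).

```python
from collections import Counter
from fractions import Fraction
from math import comb


def tc_distribution(n_bits):
    """Count ordered fingerprint pairs of length n_bits per exact Tc value.

    Assumption (not from the source): all bit patterns are equally likely.
    """
    counts = Counter()
    for common in range(n_bits + 1):
        for differing in range(n_bits - common + 1):
            union = common + differing
            if union == 0:
                continue  # both fingerprints empty: Tc is undefined
            # Choose the shared positions, then the differing positions;
            # each differing bit may belong to either fingerprint.
            ways = (comb(n_bits, common)
                    * comb(n_bits - common, differing)
                    * 2 ** differing)
            counts[Fraction(common, union)] += ways
    return counts


if __name__ == "__main__":
    dist = tc_distribution(8)
    total = sum(dist.values())
    for tc, ways in sorted(dist.items()):
        print(f"Tc = {str(tc):>5}  pairs = {ways:>6}  share = {ways / total:.4f}")
```

Using exact fractions keeps values such as 1/2 and 2/4 in the same bin, so any Tc value reachable through many (c, d) combinations shows up with a correspondingly larger count.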
Pages: 163-166
Page count: 4