Mini-fingerprints for virtual screening: Design principles and generation of novel prototypes based on information theory

Cited by: 24
Authors
Xue, L
Godden, JW
Bajorath, J
Affiliations
[1] Albany Mol Res Inc, Dept Comp Aided Drug Discovery, Bothell Res Ctr, Bothell, WA 98011 USA
[2] Univ Washington, Dept Biol Struct, Seattle, WA 98195 USA
Keywords
biological activity; fingerprint design; information theory; molecular descriptors; molecular similarity; virtual screening
DOI
10.1080/1062936021000058764
Chinese Library Classification (CLC) number
O6 [Chemistry]
Discipline classification code
0703
Abstract
Binary fingerprint representations of molecular structure and properties are convenient computational tools for similarity searching in compound databases and virtual screening (VS). We are investigating the design of relatively simple fingerprints for the identification of molecules having similar biological activity and recognition of remote similarity relationships. Since our designs are considerably shorter than other fingerprints used in VS, we have previously termed them "mini-fingerprints" (MFPs). A key aspect of the design strategy is the identification of suitable molecular descriptors. Whereas our initial fingerprint designs have relied on descriptor combinations that performed well in compound classification according to biological activity, second generation MFPs encode combinations of descriptors with high information content in large compound databases and high frequency of occurrence in drug-like molecules. Thus, the design of these new fingerprints does not depend on the analysis of specific classes of bioactive compounds, but rather on descriptor information content in large compound databases. Systematic evaluation of fingerprint performance in VS test calculations demonstrates that these new prototypes perform better than previously generated MFPs. The analysis described herein provides an example for the development of search tools for VS.
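The two ideas the abstract leans on, descriptor information content and similarity searching with binary fingerprints, can be illustrated with a minimal sketch. It rests on assumptions the abstract does not confirm: information content is taken to be the Shannon entropy of a descriptor's values over equal-width bins, and fingerprints are compared with the Tanimoto coefficient. The function names `shannon_entropy` and `tanimoto` and the default of 10 bins are hypothetical choices for illustration, not the authors' implementation.

```python
import math
from collections import Counter


def shannon_entropy(values, n_bins=10):
    """Shannon entropy (in bits) of a descriptor's value distribution.

    Assumption: values are histogrammed into n_bins equal-width bins
    spanning the observed range; the abstract does not state a binning scheme.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        return 0.0  # a constant descriptor carries no information
    width = (hi - lo) / n_bins
    # Map each value to a bin index; the maximum value is clamped into the last bin.
    counts = Counter(min(int((v - lo) / width), n_bins - 1) for v in values)
    total = len(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient between two equal-length binary fingerprints.

    Fingerprints are sequences of 0/1 integers. Assumption: Tanimoto is used
    here as a common similarity metric; the abstract does not name one.
    """
    if len(fp_a) != len(fp_b):
        raise ValueError("fingerprints must have the same length")
    both = sum(a & b for a, b in zip(fp_a, fp_b))    # bits set in both
    either = sum(a | b for a, b in zip(fp_a, fp_b))  # bits set in at least one
    return both / either if either else 0.0
```

Under these assumptions, descriptors whose values spread evenly across a large compound database score a high entropy and would be candidates for encoding, while a database search would rank compounds by their `tanimoto` score against a query fingerprint.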
Pages: 27-40
Page count: 14