Distinguishing between natural products and synthetic molecules by descriptor Shannon entropy analysis and binary QSAR calculations

Cited by: 83
Authors
Stahura, FL
Godden, JW
Xue, L
Bajorath, J
Affiliations
[1] New Chem Entities Inc, Comp Aided Drug Discovery, Bothell, WA 98011 USA
[2] Univ Washington, Dept Biol Struct, Seattle, WA 98195 USA
Source
JOURNAL OF CHEMICAL INFORMATION AND COMPUTER SCIENCES | 2000 / Vol. 40 / Issue 05
DOI
10.1021/ci0003303
Chinese Library Classification number
O6 [Chemistry];
Discipline classification code
0703 ;
Abstract
Molecular descriptors were identified by Shannon entropy analysis that, in binary QSAR calculations, correctly distinguished between naturally occurring molecules and synthetic compounds. The Shannon entropy concept was first used in digital communication theory and has only very recently been applied to descriptor analysis. Binary QSAR methodology was originally developed to correlate structural features and properties of compounds with a binary formulation of biological activity (i.e., active or inactive) and has here been adapted to correlate molecular features with chemical source (i.e., natural or synthetic). We have identified a number of molecular descriptors with significantly different Shannon entropy and/or "entropic separation" in natural and synthetic compound databases. Different combinations of such descriptors and variably distributed structural keys were applied to learning sets consisting of natural and synthetic molecules and used to derive predictive binary QSAR models. These models were then applied to predict the source of compounds in different test sets consisting of randomly collected natural and synthetic molecules or, alternatively, sets of natural and synthetic molecules with specific biological activities. On average, greater than 80% prediction accuracy was achieved with our best models. For the test case consisting of molecules with specific activities, greater than 90% accuracy was achieved. From our analysis, some chemical features were identified that systematically differ in many naturally occurring versus synthetic molecules.
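As a rough illustration of the descriptor Shannon entropy idea mentioned in the abstract, the sketch below bins the values of a single descriptor into a histogram and computes SE = -sum(p_i * log2(p_i)) over the occupied bins. The function name `shannon_entropy`, the equal-width binning, and the toy values are assumptions made for illustration; the abstract does not specify the authors' binning scheme or how "entropic separation" is defined, so neither is reproduced here.

```python
import math


def shannon_entropy(values, n_bins, lo, hi):
    """Shannon entropy (in bits) of an equal-width histogram of descriptor values.

    Hypothetical helper: values are assumed to lie in [lo, hi]; anything
    outside that range is clamped into the first or last bin.
    """
    counts = [0] * n_bins
    width = (hi - lo) / n_bins
    for v in values:
        idx = int((v - lo) / width)
        idx = max(0, min(idx, n_bins - 1))  # clamp into a valid bin
        counts[idx] += 1

    total = len(values)
    entropy = 0.0
    for c in counts:
        if c:  # empty bins contribute nothing
            p = c / total
            entropy -= p * math.log2(p)
    return entropy


# Toy descriptor values (made up): one evenly spread set, one concentrated set.
spread = [0.5, 1.5, 2.5, 3.5]
concentrated = [1.5, 1.5, 1.5, 1.5]

print(shannon_entropy(spread, n_bins=4, lo=0.0, hi=4.0))
print(shannon_entropy(concentrated, n_bins=4, lo=0.0, hi=4.0))
```

To compare two compound databases in this way, the same bin boundaries would presumably have to be used for both, so that the entropy values of a given descriptor are computed on a common scale.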
Pages: 1245-1252
Page count: 8