Feature trees: A new molecular similarity measure based on tree matching

Cited by: 230
Authors
Rarey, M
Dixon, JS
Affiliations
[1] SmithKline Beecham Pharmaceut, Phys & Struct Chem, King Of Prussia, PA 19406 USA
[2] German Natl Res Ctr Informat Technol, GMD, Inst Algorithms & Sci Comp, SCAI, D-53754 St Augustin, Germany
Keywords
database screening; molecular descriptors; molecular similarity; molecular superposition; structural alignment;
DOI
10.1023/A:1008068904628
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification codes
071010 ; 081704 ;
Abstract
In this paper we present a new method for evaluating molecular similarity between small organic compounds. Instead of a linear representation like fingerprints, a more complex description, a feature tree, is calculated for a molecule. A feature tree represents hydrophobic fragments and functional groups of the molecule and the way these groups are linked together. Each node in the tree is labeled with a set of features representing chemical properties of the part of the molecule corresponding to the node. The comparison of feature trees is based on matching subtrees of two feature trees onto each other. Two algorithms for tackling the matching problem are described throughout this paper. On a dataset of about 1000 molecules, we demonstrate the ability of our approach to identify molecules belonging to the same class of inhibitors. With a second dataset of 58 molecules with known binding modes taken from the Brookhaven Protein Data Bank, we show that the matchings produced by our algorithms are compatible with the relative orientation of the molecules in the active site in 61% of the test cases. The average computation time for a pair comparison is about 50 ms on a current workstation.
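The tree representation described in the abstract can be pictured with a small sketch. The code below is illustrative only and is not the authors' method. The class `FeatureNode`, the feature names (`hydrophobic`, `hbond_donor`, `hbond_acceptor`), the min/max node score, and the greedy child pairing are all assumptions made for this example. The two matching algorithms mentioned in the abstract are not reproduced here.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class FeatureNode:
    """Hypothetical feature-tree node: one fragment plus its feature labels."""
    name: str
    features: Dict[str, float]
    children: List["FeatureNode"] = field(default_factory=list)


def size(node: FeatureNode) -> int:
    """Number of nodes in the subtree rooted at `node`."""
    return 1 + sum(size(child) for child in node.children)


def node_similarity(a: FeatureNode, b: FeatureNode) -> float:
    """Toy score in [0, 1]: sum of per-feature minima over sum of maxima."""
    keys = set(a.features) | set(b.features)
    num = sum(min(a.features.get(k, 0.0), b.features.get(k, 0.0)) for k in keys)
    den = sum(max(a.features.get(k, 0.0), b.features.get(k, 0.0)) for k in keys)
    return num / den if den else 1.0


def matched_score(a: FeatureNode, b: FeatureNode) -> float:
    """Score the two roots, then pair child subtrees greedily, best pair first.

    This greedy pairing is a stand-in chosen for brevity (an assumption),
    not one of the matching algorithms from the paper.
    """
    score = node_similarity(a, b)
    candidates = sorted(
        (
            (matched_score(ca, cb), i, j)
            for i, ca in enumerate(a.children)
            for j, cb in enumerate(b.children)
        ),
        reverse=True,
    )
    used_a, used_b = set(), set()
    for pair_score, i, j in candidates:
        if i not in used_a and j not in used_b:
            used_a.add(i)
            used_b.add(j)
            score += pair_score
    return score


def tree_similarity(a: FeatureNode, b: FeatureNode) -> float:
    """Normalise the matched score by the size of the larger tree."""
    return matched_score(a, b) / max(size(a), size(b))


# Two made-up trees: a ring with two substituents, and the same ring with one.
amide = {"hbond_donor": 1.0, "hbond_acceptor": 1.0}
mol_a = FeatureNode("ring", {"hydrophobic": 6.0}, [
    FeatureNode("amide", dict(amide)),
    FeatureNode("methyl", {"hydrophobic": 1.0}),
])
mol_b = FeatureNode("ring", {"hydrophobic": 6.0}, [
    FeatureNode("amide", dict(amide)),
])

print(tree_similarity(mol_a, mol_a))
print(tree_similarity(mol_a, mol_b))
```

In this sketch a tree compared with itself scores 1, and an unmatched substituent lowers the score because the normalisation uses the larger tree. A real implementation would derive the nodes and features from the molecular structure rather than writing them by hand.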
Pages: 471-490
Page count: 20