Training similarity measures for specific activities: Application to reduced graphs

Cited by: 23
Authors
Birchall, K
Gillet, VJ
Harper, G
Pickett, SD
Affiliations
[1] Univ Sheffield, Dept Informat Studies, Sheffield S10 2TN, S Yorkshire, England
[2] GlaxoSmithKline Inc, Stevenage SG1 2NY, Herts, England
DOI
10.1021/ci050465e
Chinese Library Classification (CLC) number
R914 [Medicinal chemistry]
Subject classification code
100701
Abstract
Reduced graph representations of chemical structures have been shown to be effective in similarity searching applications, where they offer comparable performance to other 2D descriptors in terms of recall experiments. They have also been shown to complement existing descriptors and to offer potential to scaffold hop from one chemical series to another. Various methods have been developed for quantifying the similarity between reduced graphs, including fingerprint approaches, graph matching, and an edit distance method. The edit distance approach quantifies the degree of similarity of two reduced graphs based on the number and type of operations required to convert one graph to the other. An attractive feature of the edit distance method is the ability to assign different weights to different operations. For example, the mutation of an aromatic ring node to an acyclic node may be assigned a higher weight than the mutation of an aromatic ring node to an aliphatic ring node. In this paper, we describe a genetic algorithm (GA) for training the weights of the different edit distance operations. The method is applied to specific activity classes extracted from the MDDR database to derive activity-class-specific weights. The GA-derived weights give substantially improved results in recall experiments as compared to using weights assigned on intuition. Furthermore, such activity-specific weights may provide useful structure-activity information for subsequent design efforts. In a virtual screening setting when few active compounds are known, it may be more useful to have weights that perform well across a variety of different activity classes. Thus, the GA is also trained on multiple activity classes simultaneously to derive a generalized set of weights. These more generally applicable weights also represent a substantial improvement on previous work.
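The role of operation weights in the edit distance can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes, purely for illustration, that a reduced graph has been flattened to a linear sequence of node types, and it uses the classic dynamic-programming string edit distance in place of a true graph edit distance. The node labels, the function names `pair` and `weighted_edit_distance`, and all weight values are hypothetical. The weights in `intuitive` are the kind of parameters a GA could tune against recall.

```python
from typing import Dict, Sequence, Tuple

# Hypothetical node-type labels (illustrative only, not taken from the paper).
AROMATIC, ALIPHATIC, ACYCLIC = "Ar", "Al", "Ac"

Weights = Dict[Tuple[str, str], float]


def pair(x: str, y: str) -> Tuple[str, str]:
    """Order-independent key for a mutation between two node types."""
    return tuple(sorted((x, y)))


def weighted_edit_distance(
    a: Sequence[str],
    b: Sequence[str],
    mutation_cost: Weights,
    indel_cost: float = 1.0,
) -> float:
    """Edit distance between two node-type sequences with weighted operations.

    Assumption: sequences stand in for reduced graphs; a real reduced graph
    is a graph, so this is a simplification for illustration.
    """
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0.0] * cols for _ in range(rows)]
    for i in range(1, rows):
        d[i][0] = i * indel_cost  # delete all of a[:i]
    for j in range(1, cols):
        d[0][j] = j * indel_cost  # insert all of b[:j]
    for i in range(1, rows):
        for j in range(1, cols):
            if a[i - 1] == b[j - 1]:
                mutate = 0.0
            else:
                mutate = mutation_cost[pair(a[i - 1], b[j - 1])]
            d[i][j] = min(
                d[i - 1][j - 1] + mutate,   # mutate (or keep) a node
                d[i - 1][j] + indel_cost,   # delete a node from a
                d[i][j - 1] + indel_cost,   # insert a node from b
            )
    return d[-1][-1]


# Hypothetical "intuitive" weights: aromatic -> acyclic is penalised more
# heavily than aromatic -> aliphatic, as in the example in the abstract.
intuitive: Weights = {
    pair(AROMATIC, ALIPHATIC): 0.5,
    pair(AROMATIC, ACYCLIC): 2.0,
    pair(ALIPHATIC, ACYCLIC): 1.0,
}
# Uniform weights treat every mutation alike.
uniform: Weights = {key: 1.0 for key in intuitive}

query = [AROMATIC, ACYCLIC, AROMATIC]
ring_swap = [ALIPHATIC, ACYCLIC, AROMATIC]   # aromatic ring -> aliphatic ring
ring_loss = [ACYCLIC, ACYCLIC, AROMATIC]     # aromatic ring -> acyclic node
```

Under the uniform weights both candidates are equally distant from the query, whereas the intuitive weights rank the ring-swapped candidate as closer. Training the weights amounts to searching for the values that rank known actives highest for a given activity class.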
Pages: 577-586
Page count: 10