Combination of Similarity Rankings Using Data Fusion

Times cited: 112
Authors
Willett, Peter [1]
Affiliations
[1] Univ Sheffield, Informat Sch, Sheffield S1 4DP, S Yorkshire, England
Keywords
MOLECULAR SIMILARITY; POWER-LAWS; REPRESENTATIONS; DESCRIPTORS; DOCKING; PROBABILITY; ALGORITHM; SEARCHES; FILES; SHAPE;
DOI
10.1021/ci300547g
Chinese Library Classification (CLC) number
R914 [Medicinal chemistry];
Discipline classification code
100701;
Abstract
The use of data fusion in similarity-based virtual screening is studied. Data fusion is the name given to a body of techniques that combine multiple sources of data into a single source, with the expectation that the resulting fused source will be more informative than the individual input sources. The scores merged by the fusion rule can be of two types: either the structure's actual similarity, as computed using some particular similarity measure, or the rank of the structure when all N computed similarities are sorted in decreasing order of score for the chosen similarity measure. Data fusion first found application in similarity fusion, where a single reference structure is searched using different similarity measures; it has since been extended to encompass multiple reference structures. Both approaches can benefit from the availability of training data linking similarity scores and probabilities of activity, but unsupervised fusion rules are available that enable effective searches to be carried out even in the absence of such data.
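The two score types described in the abstract can be illustrated with a minimal Python sketch. The abstract does not name specific fusion rules, so the two shown here (maximum similarity score and sum of ranks) are assumptions chosen because they are simple unsupervised rules; the function names, molecule identifiers, and similarity values are all hypothetical.

```python
from typing import Dict, List

# Hypothetical similarities of four database structures to one reference
# structure, computed with two different (unnamed) similarity measures.
measure_a: Dict[str, float] = {"mol1": 0.90, "mol2": 0.40, "mol3": 0.70, "mol4": 0.20}
measure_b: Dict[str, float] = {"mol1": 0.05, "mol2": 0.80, "mol3": 0.75, "mol4": 0.10}


def ranks_from_scores(scores: Dict[str, float]) -> Dict[str, int]:
    """Convert similarities to ranks (1 = most similar).

    Ties are broken by structure id so the result is deterministic.
    """
    ordered = sorted(scores, key=lambda s: (-scores[s], s))
    return {s: i + 1 for i, s in enumerate(ordered)}


def fuse_max_score(score_lists: List[Dict[str, float]]) -> List[str]:
    """Score-based fusion (assumed rule): keep each structure's maximum similarity."""
    ids = score_lists[0].keys()
    fused = {s: max(sl[s] for sl in score_lists) for s in ids}
    # Higher fused similarity is better.
    return sorted(fused, key=lambda s: (-fused[s], s))


def fuse_rank_sum(score_lists: List[Dict[str, float]]) -> List[str]:
    """Rank-based fusion (assumed rule): sum each structure's ranks across lists."""
    rank_lists = [ranks_from_scores(sl) for sl in score_lists]
    ids = score_lists[0].keys()
    fused = {s: sum(rl[s] for rl in rank_lists) for s in ids}
    # Lower summed rank is better.
    return sorted(fused, key=lambda s: (fused[s], s))


max_order = fuse_max_score([measure_a, measure_b])
rank_order = fuse_rank_sum([measure_a, measure_b])
```

With these made-up values the two rules disagree on the top-ranked structure, which shows why the choice between scores and ranks matters. The same functions would apply unchanged to the multiple-reference case, with one score list per reference structure instead of one per similarity measure.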
Pages: 1-10
Page count: 10