Semantic Similarity for Automatic Classification of Chemical Compounds

Cited by: 27
Authors
Ferreira, Joao D. [1 ]
Couto, Francisco M. [1 ]
Affiliations
[1] Univ Lisbon, LaSIGE, P-1699 Lisbon, Portugal
Keywords
P-GLYCOPROTEIN; FLAVONOIDS; PREDICTION; MOLECULES; ESTROGEN; DOCKING; FOREST; BRAIN; DRUGS; CELLS;
DOI
10.1371/journal.pcbi.1000937
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification codes
071010 ; 081704 ;
Abstract
With the increasing amount of data made available in the chemical field, there is a strong need for systems capable of comparing and classifying chemical compounds in an efficient and effective way. The best approaches existing today are based on the structure-activity relationship premise, which states that biological activity of a molecule is strongly related to its structural or physicochemical properties. This work presents a novel approach to the automatic classification of chemical compounds by integrating semantic similarity with existing structural comparison methods. Our approach was assessed based on the Matthews Correlation Coefficient for the prediction, and achieved values of 0.810 when used as a prediction of blood-brain barrier permeability, 0.694 for P-glycoprotein substrate, and 0.673 for estrogen receptor binding activity. These results expose a significant improvement over the currently existing methods, whose best performances were 0.628, 0.591, and 0.647 respectively. It was demonstrated that the integration of semantic similarity is a feasible and effective way to improve existing chemical compound classification systems. Among other possible uses, this tool helps the study of the evolution of metabolic pathways, the study of the correlation of metabolic networks with properties of those networks, or the improvement of ontologies that represent chemical information.
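The abstract reports its results as Matthews Correlation Coefficient (MCC) values. As a reminder of what that metric measures, here is a minimal sketch of the standard MCC formula for a binary classifier, computed from the four confusion-matrix counts. The function name and the example counts are illustrative assumptions; they are not taken from the paper, and this is not the authors' evaluation code.

```python
import math


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """Return the Matthews Correlation Coefficient for a binary confusion matrix.

    tp, tn, fp, fn are the counts of true positives, true negatives,
    false positives, and false negatives. The result lies in [-1, 1]:
    1 is perfect prediction, 0 is no better than chance, -1 is total
    disagreement.
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Common convention: when any marginal total is zero the coefficient
    # is undefined, so report 0.0 instead of dividing by zero.
    if denominator == 0:
        return 0.0
    return numerator / denominator


if __name__ == "__main__":
    # Hypothetical counts, for illustration only (not data from the paper).
    print(matthews_corrcoef(tp=45, tn=40, fp=10, fn=5))
```

Unlike plain accuracy, MCC uses all four cells of the confusion matrix, so it stays informative when the two classes are of very different sizes.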
Pages: 11