Genome scale enzyme-metabolite and drug-target interaction predictions using the signature molecular descriptor

Cited by: 119
Authors
Faulon, Jean-Loup [1]
Misra, Milind [1]
Martin, Shawn [2]
Sale, Ken [3]
Sapra, Rajat [3]
Affiliations
[1] Sandia Natl Labs, Computat Biosci Dept, Albuquerque, NM 87185 USA
[2] Sandia Natl Labs, Dept Informat & Comp Sci, Albuquerque, NM 87185 USA
[3] Sandia Natl Labs, Livermore, CA 94551 USA
Keywords
DOI
10.1093/bioinformatics/btm580
Chinese Library Classification number
Q5 [Biochemistry];
Subject classification code
071010 ; 081704 ;
Abstract
Motivation: Identifying protein enzymatic or pharmacological activities are important areas of research in biology and chemistry. Biological and chemical databases are increasingly being populated with linkages between protein sequences and chemical structures. There is now sufficient information to apply machine-learning techniques to predict interactions between chemicals and proteins at a genome scale. Current machine-learning techniques use as input either protein sequences and structures or chemical information. We propose here a method to infer protein-chemical interactions using heterogeneous input consisting of both protein sequence and chemical information.
Results: Our method relies on expressing proteins and chemicals with a common cheminformatics representation. We demonstrate our approach by predicting whether proteins can catalyze reactions not present in training sets. We also predict whether a given drug can bind a target, in the absence of prior binding information for that drug and target. Such predictions cannot be made with current machine-learning techniques requiring binding information for individual reactions or individual targets.
Pages: 225-233
Page count: 9
Related papers
35 in total
[1]   Gapped BLAST and PSI-BLAST: a new generation of protein database search programs [J].
Altschul, SF ;
Madden, TL ;
Schaffer, AA ;
Zhang, JH ;
Zhang, Z ;
Miller, W ;
Lipman, DJ .
NUCLEIC ACIDS RESEARCH, 1997, 25 (17) :3389-3402
[2]   Solving the protein sequence metric problem [J].
Atchley, WR ;
Zhao, JP ;
Fernandes, AD ;
Drüke, T .
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2005, 102 (18) :6395-6400
[3]   NIH Molecular Libraries Initiative [J].
Austin, CP ;
Brady, LS ;
Insel, TR ;
Collins, FS .
SCIENCE, 2004, 306 (5699) :1138-1139
[4]   Kernel methods for predicting protein-protein interactions [J].
Ben-Hur, A ;
Noble, WS .
BIOINFORMATICS, 2005, 21 :I38-I46
[5]   Similarity searching of chemical databases using atom environment descriptors (MOLPRINT 2D): Evaluation of performance [J].
Bender, A ;
Mussa, HY ;
Glen, RC ;
Reiling, S .
JOURNAL OF CHEMICAL INFORMATION AND COMPUTER SCIENCES, 2004, 44 (05) :1708-1718
[6]   Predicting protein-protein interactions from primary structure [J].
Bock, JR ;
Gough, DA .
BIOINFORMATICS, 2001, 17 (05) :455-460
[7]   Protein function prediction via graph kernels [J].
Borgwardt, KM ;
Ong, CS ;
Schönauer, S ;
Vishwanathan, SVN ;
Smola, AJ ;
Kriegel, HP .
BIOINFORMATICS, 2005, 21 :I47-I56
[8]   The European Bioinformatics Institute's data resources: towards systems biology [J].
Brooksbank, C ;
Cameron, G ;
Thornton, J .
NUCLEIC ACIDS RESEARCH, 2005, 33 :D46-D53
[9]   SVM-Prot: web-based support vector machine software for functional classification of a protein from its primary sequence [J].
Cai, CZ ;
Han, LY ;
Ji, ZL ;
Chen, X ;
Chen, YZ .
NUCLEIC ACIDS RESEARCH, 2003, 31 (13) :3692-3697
[10]   The signature molecular descriptor - 3. Inverse-quantitative structure-activity relationship of ICAM-1 inhibitory peptides [J].
Churchwell, CJ ;
Rintoul, MD ;
Martin, S ;
Visco, DP ;
Kotu, A ;
Larson, RS ;
Sillerud, LO ;
Brown, DC ;
Faulon, JL .
JOURNAL OF MOLECULAR GRAPHICS & MODELLING, 2004, 22 (04) :263-273