Annotation-based inference of transporter function

Cited by: 34
Authors
Lee, Thomas J. [1]
Paulsen, Ian [2]
Karp, Peter [1]
Affiliations
[1] SRI Int, Ctr Artificial Intelligence, Menlo Pk, CA 94025 USA
[2] Macquarie Univ, Dept Chem & Biomol Sci, Sydney, NSW 2109, Australia
DOI
10.1093/bioinformatics/btn180
Chinese Library Classification
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
Motivation: We present a method for inferring and constructing transport reactions for transporter proteins, based primarily on analysis of the names of individual proteins in the genome annotation of an organism. Transport reactions are declarative descriptions of transporter activities and, unlike free-text protein names, can be manipulated computationally. Once transporter activities are encoded as transport reactions, a number of computational analyses become possible: database queries by transporter activity; inclusion of transporters in an automatically generated metabolic-map diagram that can be painted with omics data to aid in interpreting those data; detection of anomalies in the metabolic and transport networks, such as substrates that are transported into the cell but are not inputs to any metabolic reaction or pathway; and comparative analyses of the transport capabilities of different organisms.
Results: On randomly selected organisms, the method achieves precision and recall of 0.93 and 0.90, respectively, in identifying transporter proteins by name within the complete genome. The method obtains 67.5% accuracy in predicting complete transport reactions; if allowance is made for predictions that are overly general yet not incorrect, reaction prediction accuracy is 82.5%.
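The precision and recall figures reported above can be illustrated with a small sketch. This is not the authors' method: the keyword pattern, the function names `looks_like_transporter` and `precision_recall`, and the six-protein toy annotation are all hypothetical, chosen only to show how name-based identification might be scored against a curated answer.

```python
import re

# Hypothetical keyword list (an assumption for illustration only);
# the abstract does not specify the rules the actual method uses.
TRANSPORTER_PATTERN = re.compile(
    r"\b(transporter|permease|symporter|antiporter|uniporter|porin|channel)\b",
    re.IGNORECASE,
)


def looks_like_transporter(name: str) -> bool:
    """Return True if the free-text protein name contains a transporter keyword."""
    return TRANSPORTER_PATTERN.search(name) is not None


def precision_recall(predicted: set, actual: set) -> tuple:
    """Precision = TP / |predicted|; recall = TP / |actual|."""
    true_positives = len(predicted & actual)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    return precision, recall


# Toy annotation (made-up data): protein id -> (name, curated "is a transporter" label)
annotation = {
    "p1": ("lactose permease", True),
    "p2": ("ABC transporter ATP-binding protein", True),
    "p3": ("DNA polymerase III subunit alpha", False),
    "p4": ("potassium channel regulator", False),  # keyword hit, but not a transporter
    "p5": ("sodium/proline symporter", True),
    "p6": ("PTS system glucose-specific EIIA component", True),  # no keyword in name
}

predicted = {pid for pid, (name, _) in annotation.items() if looks_like_transporter(name)}
actual = {pid for pid, (_, is_transporter) in annotation.items() if is_transporter}
precision, recall = precision_recall(predicted, actual)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this toy data, one false positive (`p4`) lowers precision and one missed protein (`p6`) lowers recall, which are the two kinds of error the reported 0.93 and 0.90 summarize.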
Pages: I259-I267
Page count: 9