Identifying metabolic enzymes with multiple types of association evidence

Cited by: 70
Authors
Kharchenko, P
Chen, LF
Freund, Y
Vitkup, D
Church, GM
Affiliations
[1] Department of Genetics, Harvard Medical School, Boston, MA 02115
[2] Center for Computational Biology and Bioinformatics, Department of Biomedical Informatics, Columbia University, New York, NY 10032
[3] Department of Computer Science and Engineering, University of California San Diego, La Jolla, CA 92093
DOI
10.1186/1471-2105-7-177
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
Background: Existing large-scale metabolic models of sequenced organisms commonly include enzymatic functions that cannot be attributed to any gene in that organism. Existing computational strategies for identifying such missing genes rely primarily on sequence homology to known enzyme-encoding genes.
Results: We present a novel method for identifying genes encoding a specific metabolic function based on the local structure of the metabolic network and multiple types of functional association evidence, including clustering of genes on the chromosome, similarity of phylogenetic profiles, gene expression, protein fusion events and others. Using E. coli and S. cerevisiae metabolic networks, we illustrate the predictive ability of each individual type of association evidence and show that significantly better predictions can be obtained from the combination of all data. In this way, our method predicts 60% of enzyme-encoding genes of E. coli metabolism within the top 10 (out of 3551) candidates for their enzymatic function, and as the top candidate in 43% of cases.
Conclusion: We illustrate that a combination of genome context and other functional association evidence is effective in predicting genes encoding metabolic enzymes. Our approach does not rely on direct sequence homology to known enzyme-encoding genes, and can be used in conjunction with traditional homology-based metabolic reconstruction methods. The method can also be used to target orphan metabolic activities.
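The ranking idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' method: the abstract does not say how the evidence types are combined, so the equal-weight sum below, the function names `rank_candidates` and `fraction_in_top_k`, the evidence-table layout, and all gene names and scores are assumptions made for illustration only.

```python
from statistics import mean


def rank_candidates(candidates, neighbor_genes, evidence):
    """Rank candidate genes for one unassigned enzymatic function.

    candidates:     list of candidate gene names (hypothetical).
    neighbor_genes: non-empty list of genes already assigned to reactions
                    adjacent to the target function in the metabolic network.
    evidence:       dict mapping an evidence type to a dict of
                    (gene_a, gene_b) -> association score in [0, 1].
    Assumption: evidence types are combined by an unweighted sum.
    """
    def pair_score(table, a, b):
        # Association scores are treated as symmetric; missing pairs score 0.
        return table.get((a, b), table.get((b, a), 0.0))

    scores = {}
    for gene in candidates:
        # Average association with the network neighbours, per evidence type.
        per_type = [
            mean(pair_score(table, gene, n) for n in neighbor_genes)
            for table in evidence.values()
        ]
        scores[gene] = sum(per_type)
    # Highest combined score first.
    return sorted(candidates, key=lambda g: scores[g], reverse=True)


def fraction_in_top_k(true_gene_ranks, k):
    """Fraction of known enzyme-encoding genes ranked within the top k."""
    return sum(1 for r in true_gene_ranks if r <= k) / len(true_gene_ranks)


# Toy data: two evidence types, three candidates, two network neighbours.
evidence = {
    "chromosomal_clustering": {("g1", "n1"): 0.9, ("g2", "n1"): 0.1},
    "phylogenetic_profile": {("g1", "n2"): 0.8, ("g3", "n2"): 0.4},
}
ranking = rank_candidates(["g1", "g2", "g3"], ["n1", "n2"], evidence)
top10_rate = fraction_in_top_k([1, 3, 12, 7], k=10)
```

A figure such as "60% within the top 10" corresponds to `fraction_in_top_k` evaluated over the ranks that known enzyme-encoding genes receive for their own function when each is held out in turn.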
Pages: 16