Discovering novel subsystems using comparative genomics

Cited by: 5
Authors
Ferrer, Luciana [1]
Shearer, Alexander G. [1]
Karp, Peter D. [1]
Affiliations
[1] SRI Int, Ctr Artificial Intelligence, Menlo Pk, CA 94025 USA
Keywords
PROTEIN-PROTEIN INTERACTIONS; ESCHERICHIA-COLI; MOLYBDOPTERIN SYNTHASE; BIOCYC COLLECTION; CELL-DIVISION; DCW CLUSTER; DATABASE; PATHWAY; OPERON; GENE;
DOI
10.1093/bioinformatics/btr428
Chinese Library Classification
Q5 [Biochemistry];
Discipline classification codes
071010; 081704;
Abstract
Motivation: Key problems for computational genomics include discovering novel pathways in genome data, and discovering functional interaction partners for genes to define new members of partially elucidated pathways.

Results: We propose a novel method for the discovery of subsystems from annotated genomes. For each gene pair, a score measuring the likelihood that the two genes belong to the same subsystem is computed using genome context methods. Genes are then grouped based on these scores, and the resulting groups are filtered to keep only high-confidence groups. Since the method is based on genome context analysis, it relies solely on structural annotation of the genomes. The method can be used to discover new pathways, find missing genes from a known pathway, find new protein complexes or other kinds of functional groups, and assign function to genes. We tested the accuracy of our method in Escherichia coli K-12. In one configuration of the system, we find that 31.6% of the candidate groups generated by our method closely match a known pathway or protein complex, and that we rediscover 31.2% of all known pathways and protein complexes of at least 4 genes. We believe that a significant proportion of the candidates that do not match any known group in E. coli K-12 correspond to novel subsystems that may represent promising leads for future laboratory research. We discuss in-depth examples of these findings.
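The score, group, and filter steps described in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the abstract does not say how genes are grouped or how groups are filtered, so this sketch substitutes a simple approach (thresholding the pairwise scores, taking connected components, and keeping groups with a high mean pairwise score). The function name `group_genes`, its parameters, the gene labels, and the score values are all hypothetical and are not taken from the paper.

```python
from itertools import combinations


def group_genes(pair_scores, link_threshold=0.8, min_size=2, min_mean_score=0.88):
    """Group genes from pairwise scores and keep only high-confidence groups.

    pair_scores maps frozenset({gene_a, gene_b}) to a score in [0, 1].
    Assumption: grouping is done with connected components, which is a
    stand-in for whatever grouping the actual method uses.
    """
    # Step 1: link two genes when their pairwise score reaches the threshold.
    neighbors = {}
    for pair, score in pair_scores.items():
        gene_a, gene_b = tuple(pair)
        if score >= link_threshold:
            neighbors.setdefault(gene_a, set()).add(gene_b)
            neighbors.setdefault(gene_b, set()).add(gene_a)

    # Step 2: collect connected components of the thresholded graph.
    seen = set()
    groups = []
    for start in sorted(neighbors):
        if start in seen:
            continue
        component = set()
        stack = [start]
        while stack:
            gene = stack.pop()
            if gene in component:
                continue
            component.add(gene)
            stack.extend(neighbors[gene] - component)
        seen |= component
        groups.append(component)

    # Step 3: keep groups that are large enough and whose mean pairwise
    # score (missing pairs count as 0.0) reaches the confidence cutoff.
    kept = []
    for group in groups:
        if len(group) < min_size:
            continue
        scores = [
            pair_scores.get(frozenset(pair), 0.0)
            for pair in combinations(sorted(group), 2)
        ]
        if sum(scores) / len(scores) >= min_mean_score:
            kept.append(sorted(group))
    return kept


# Toy, made-up scores for five hypothetical genes.
toy_scores = {
    frozenset({"geneA", "geneB"}): 0.90,
    frozenset({"geneB", "geneC"}): 0.95,
    frozenset({"geneA", "geneC"}): 0.90,
    frozenset({"geneC", "geneD"}): 0.10,
    frozenset({"geneD", "geneE"}): 0.85,
}
candidate_groups = group_genes(toy_scores)
print(candidate_groups)
```

In this toy input, genes A, B, and C form one tightly linked group, while the D and E pair is linked but falls below the mean-score cutoff and is filtered out. A real pipeline would derive the scores from genome context evidence, as the abstract states, and could use a stricter grouping criterion than connected components.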
Pages: 2478-2485
Number of pages: 8