SNAPping up functionally related genes based on context information: A colinearity-free approach

Cited by: 32
Authors
Kolesov, G [1]
Mewes, HW [1]
Frishman, D [1]
Affiliations
[1] GSF, Natl Res Ctr Environm & Hlth, Inst Bioinformat, D-85764 Neuherberg, Germany
Keywords
genome analysis; gene function prediction; gene cluster; functional coupling; metabolic pathway;
DOI
10.1006/jmbi.2001.4701
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010 ; 081704 ;
Abstract
We describe a computational approach for finding genes that are functionally related but do not possess any noticeable sequence similarity. Our method, which we call SNAP (similarity-neighborhood approach), reveals the conservation of gene order on bacterial chromosomes based on both cross-genome comparison and context information. The novel feature of this method is that it does not rely on detection of conserved colinear gene strings. Instead, we introduce the notion of a similarity-neighborhood graph (SN-graph), which is constructed from the chains of similarity and neighborhood relationships between orthologous genes in different genomes and adjacent genes in the same genome, respectively. An SN-cycle is defined as a closed path on the SN-graph and is postulated to preferentially join functionally related gene products that participate in the same biochemical or regulatory process. We demonstrate the substantial non-randomness and functional significance of SN-cycles derived from real genome data and estimate the prediction accuracy of SNAP in assigning broad function to uncharacterized proteins. Examples of practical application of SNAP for improving the quality of genome annotation are described. © 2001 Academic Press.
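The SN-graph idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not give the construction details, scoring, or cycle criteria, so everything below is an assumption made for illustration. Genomes are modeled as linear ordered lists of hypothetical gene identifiers, orthology is supplied as ready-made pairs, the function names `build_sn_graph` and `find_sn_cycles` are invented here, and a cycle is kept only if it mixes similarity (S) and neighborhood (N) edges.

```python
from collections import defaultdict


def build_sn_graph(genomes, orthologs):
    """Build an undirected graph with edges labeled "N" or "S".

    genomes:   dict mapping genome name -> ordered list of gene ids
               (assumed linear; chromosome wrap-around is ignored).
    orthologs: iterable of (gene, gene) pairs from different genomes.
    """
    graph = defaultdict(dict)
    # Neighborhood edges: genes adjacent in the same genome.
    for genes in genomes.values():
        for left, right in zip(genes, genes[1:]):
            graph[left][right] = "N"
            graph[right][left] = "N"
    # Similarity edges: orthologous genes in different genomes.
    for g1, g2 in orthologs:
        graph[g1][g2] = "S"
        graph[g2][g1] = "S"
    return graph


def find_sn_cycles(graph, max_len=6):
    """Enumerate simple cycles of 3..max_len genes by depth-first search.

    Cycles are reported as frozensets of gene ids, so two cycles over
    the same genes count once. Requiring both edge types in a cycle is
    an assumption of this sketch.
    """
    cycles = set()

    def extend(path):
        start, last = path[0], path[-1]
        for nxt in graph[last]:
            if nxt == start and len(path) >= 3:
                edges = zip(path, path[1:] + [start])
                labels = {graph[a][b] for a, b in edges}
                if labels == {"S", "N"}:
                    cycles.add(frozenset(path))
            elif nxt not in path and len(path) < max_len:
                extend(path + [nxt])

    for node in list(graph):
        extend([node])
    return cycles


# Toy data: no gene pair is adjacent in more than one genome, so there
# is no conserved colinear string, yet the six genes close into a cycle.
genomes = {"A": ["a1", "a2"], "B": ["b2", "b3"], "C": ["c3", "c1"]}
orthologs = [("a2", "b2"), ("b3", "c3"), ("c1", "a1")]

graph = build_sn_graph(genomes, orthologs)
cycles = find_sn_cycles(graph, max_len=6)
```

In this toy case the only cycle found joins all six genes across three genomes, which is the kind of grouping a colinearity-based search would miss. The exhaustive search is exponential in `max_len` and is meant for small examples only.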
Pages: 639-656
Page count: 18
Related papers
28 in total
[1]   Gapped BLAST and PSI-BLAST: a new generation of protein database search programs [J].
Altschul, SF ;
Madden, TL ;
Schaffer, AA ;
Zhang, JH ;
Zhang, Z ;
Miller, W ;
Lipman, DJ .
NUCLEIC ACIDS RESEARCH, 1997, 25 (17) :3389-3402
[2]   Automated extraction of information in molecular biology [J].
Andrade, MA ;
Bork, P .
FEBS LETTERS, 2000, 476 (1-2) :12-17
[3]  
Bansal AK, 1999, BIOINFORMATICS, V15, P900
[4]   The complete genome sequence of Escherichia coli K-12 [J].
Blattner, FR ;
Plunkett, G ;
Bloch, CA ;
Perna, NT ;
Burland, V ;
Riley, M ;
Collado-Vides, J ;
Glasner, JD ;
Rode, CK ;
Mayhew, GF ;
Gregor, J ;
Davis, NW ;
Kirkpatrick, HA ;
Goeden, MA ;
Rose, DJ ;
Mau, B ;
Shao, Y .
SCIENCE, 1997, 277 (5331) :1453-+
[5]   Gene structure and organization in Caenorhabditis elegans [J].
Blumenthal, T ;
Spieth, J .
CURRENT OPINION IN GENETICS & DEVELOPMENT, 1996, 6 (06) :692-698
[6]  
CRAVEN M, 2000, ISMB, V8, P116
[7]   Conservation of gene order: a fingerprint of proteins that physically interact [J].
Dandekar, T ;
Snel, B ;
Huynen, M ;
Bork, P .
TRENDS IN BIOCHEMICAL SCIENCES, 1998, 23 (09) :324-328
[8]   Horizontal gene transfer and the origin of species: lessons from bacteria [J].
de la Cruz, F ;
Davies, J .
TRENDS IN MICROBIOLOGY, 2000, 8 (03) :128-133
[9]  
desJardins M, 1997, ISMB-97 - FIFTH INTERNATIONAL CONFERENCE ON INTELLIGENT SYSTEMS FOR MOLECULAR BIOLOGY, PROCEEDINGS, P92
[10]   Protein interaction maps for complete genomes based on gene fusion events [J].
Enright, AJ ;
Iliopoulos, I ;
Kyrpides, NC ;
Ouzounis, CA .
NATURE, 1999, 402 (6757) :86-90