Connected gene neighborhoods in prokaryotic genomes

Cited by: 148
Authors
Rogozin, IB
Makarova, KS
Murvai, J
Czabarka, E
Wolf, YI
Tatusov, RL
Szekely, LA
Koonin, EV [1]
Affiliations
[1] National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
[2] Department of Mathematics, University of South Carolina, Columbia, SC 29208, USA
DOI
10.1093/nar/30.10.2212
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010 ; 081704 ;
Abstract
A computational method was developed for delineating connected gene neighborhoods in bacterial and archaeal genomes. These gene neighborhoods are not typically present, in their entirety, in any single genome, but are held together by overlapping, partially conserved gene arrays. The procedure was applied to comparing the orders of orthologous genes, which were extracted from the database of Clusters of Orthologous Groups of proteins (COGs), in 31 prokaryotic genomes and resulted in the identification of 188 clusters of gene arrays, which included 1001 of 2890 COGs. These clusters were projected onto actual genomes to produce extended neighborhoods including additional genes, which are adjacent to the genes from the clusters and are transcribed in the same direction, which resulted in a total of 2387 COGs being included in the neighborhoods. Most of the neighborhoods consist predominantly of genes united by a coherent functional theme, but also include a minority of genes without an obvious functional connection to the main theme. We hypothesize that although some of the latter genes might have unsuspected roles, others are maintained within gene arrays because of the advantage of expression at a level that is typical of the given neighborhood. We designate this phenomenon 'genomic hitchhiking'. The largest neighborhood includes 79 genes (COGs) and consists of overlapping, rearranged ribosomal protein superoperons; apparent genome hitchhiking is particularly typical of this neighborhood and other neighborhoods that consist of genes coding for translation machinery components. Several neighborhoods involve previously undetected connections between genes, allowing new functional predictions. Gene neighborhoods appear to evolve via complex rearrangement, with different combinations of genes from a neighborhood fixed in different lineages.
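The idea that a neighborhood is "held together by overlapping, partially conserved gene arrays" can be illustrated with a small sketch. This is not the authors' procedure: it assumes each conserved array is given as a list of COG identifiers, treats two arrays as linked when they share at least `min_shared` COGs (a hypothetical threshold), and merges linked arrays transitively with a union-find structure. The function name and the COG identifiers in the example are made up for illustration.

```python
from collections import defaultdict


def connected_neighborhoods(arrays, min_shared=2):
    """Merge gene arrays that overlap, directly or through a chain of arrays.

    `min_shared` is an assumed overlap threshold, not a value from the paper.
    """
    parent = list(range(len(arrays)))

    def find(i):
        # Walk to the root of i's component, shortening the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    def union(i, j):
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[max(ri, rj)] = min(ri, rj)

    sets = [set(a) for a in arrays]
    for i in range(len(sets)):
        for j in range(i + 1, len(sets)):
            if len(sets[i] & sets[j]) >= min_shared:
                union(i, j)

    # Collect the union of COGs in each connected component.
    groups = defaultdict(set)
    for i, s in enumerate(sets):
        groups[find(i)] |= s
    return sorted((sorted(g) for g in groups.values()), key=lambda g: g[0])


# Hypothetical arrays: the first three overlap pairwise in a chain,
# so they form one neighborhood that no single array contains in full.
arrays = [
    ["COG0001", "COG0002", "COG0003"],
    ["COG0002", "COG0003", "COG0004"],
    ["COG0003", "COG0004", "COG0005"],
    ["COG0100", "COG0101"],
]
neighborhoods = connected_neighborhoods(arrays)
print(neighborhoods)
```

In this toy input the first and third arrays share only one COG, yet they end up in the same neighborhood because both overlap the second array, which mirrors the abstract's point that a neighborhood need not be present in its entirety in any single genome.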
Pages: 2212-2223
Page count: 12