Correlation between sequence conservation and the genomic context after gene duplication

Cited by: 37
Authors
Notebaart, RA
Huynen, MA
Teusink, B
Siezen, RJ
Snel, B [1]
Affiliations
[1] Radboud Univ Nijmegen, Ctr Mol & Biomol Informat, Nijmegen, Netherlands
[2] Radboud Univ Nijmegen, Ctr Mol Life Sci, Med Ctr, Nijmegen, Netherlands
[3] NIZO Food Res, Ede, Netherlands
[4] Wageningen Ctr Food Sci, Wageningen, Netherlands
DOI
10.1093/nar/gki913
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
A key complication in comparative genomics for reliable gene function prediction is the existence of duplicated genes. To study the effect of gene duplication on function prediction, we analyze orthologs between pairs of genomes where in one genome the orthologous gene has duplicated after the speciation of the two genomes (i.e. inparalogs). For these duplicated genes we investigate whether the gene that is most similar on the sequence level is also the gene that has retained the ancestral gene-neighborhood. Although the majority of investigated cases show a consistent pattern between sequence similarity and gene-neighborhood conservation, a substantial fraction, 29-38%, is inconsistent. The observation of inconsistency is not the result of a chance outcome owing to a lack of divergence time between inparalogs, but rather it seems to be the result of a chance outcome caused by very similar rates of sequence evolution of both inparalogs relative to their ortholog. If one-to-one orthologous relationships are required, it is advisable to combine contextual information (i.e. gene-neighborhood in prokaryotes and co-expression in eukaryotes) with protein sequence information to predict the most probable functional equivalent ortholog in the presence of inparalogs.
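The consistency test described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the `Inparalog` record, the use of fractional sequence identity as the similarity measure, the count of shared orthologous-group IDs as the neighborhood score, and the "undecided" outcome for ties are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Inparalog:
    """Hypothetical record for one duplicated gene in the second genome."""
    name: str
    identity: float       # assumed: fractional sequence identity to the ortholog
    neighbors: frozenset  # assumed: orthologous-group IDs of the flanking genes


def shared_neighbors(ortholog_neighbors, paralog):
    """Count flanking genes shared between the ortholog and one inparalog."""
    return len(ortholog_neighbors & paralog.neighbors)


def classify(ortholog_neighbors, a, b):
    """Compare two inparalogs against their single-copy ortholog.

    Returns 'consistent' when the inparalog with the higher sequence
    identity also shares more neighbors with the ortholog, 'inconsistent'
    when the two signals disagree, and 'undecided' when either is tied.
    """
    seq_diff = a.identity - b.identity
    ctx_diff = (shared_neighbors(ortholog_neighbors, a)
                - shared_neighbors(ortholog_neighbors, b))
    if seq_diff == 0 or ctx_diff == 0:
        return "undecided"
    return "consistent" if (seq_diff > 0) == (ctx_diff > 0) else "inconsistent"
```

Applied to every ortholog with a pair of inparalogs, the share of "inconsistent" calls would correspond to the kind of fraction the abstract reports (29-38%), although the real analysis may define similarity and neighborhood conservation differently.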
Pages: 6164-6171
Page count: 8
Related papers
34 entries in total
  • [1] Gene Ontology: tool for the unification of biology
    Ashburner, M
    Ball, CA
    Blake, JA
    Botstein, D
    Butler, H
    Cherry, JM
    Davis, AP
    Dolinski, K
    Dwight, SS
    Eppig, JT
    Harris, MA
    Hill, DP
    Issel-Tarver, L
    Kasarskis, A
    Lewis, S
    Matese, JC
    Richardson, JE
    Ringwald, M
    Rubin, GM
    Sherlock, G
    [J]. NATURE GENETICS, 2000, 25 (01) : 25 - 29
  • [2] MOLECULAR ANALYSIS OF 2 GENES OF THE ESCHERICHIA-COLI GAB CLUSTER - NUCLEOTIDE-SEQUENCE OF THE GLUTAMATE SUCCINIC SEMIALDEHYDE TRANSAMINASE GENE (GABT) AND CHARACTERIZATION OF THE SUCCINIC SEMIALDEHYDE DEHYDROGENASE GENE (GABD)
    BARTSCH, K
    VONJOHNNMARTEVILLE, A
    SCHULZ, A
    [J]. JOURNAL OF BACTERIOLOGY, 1990, 172 (12) : 7035 - 7042
  • [3] Conservation of gene order: a fingerprint of proteins that physically interact
    Dandekar, T
    Snel, B
    Huynen, M
    Bork, P
    [J]. TRENDS IN BIOCHEMICAL SCIENCES, 1998, 23 (09) : 324 - 328
  • [4] MUSCLE: multiple sequence alignment with high accuracy and high throughput
    Edgar, RC
    [J]. NUCLEIC ACIDS RESEARCH, 2004, 32 (05) : 1792 - 1797
  • [5] Protein interaction maps for complete genomes based on gene fusion events
    Enright, AJ
    Iliopoulos, I
    Kyrpides, NC
    Ouzounis, CA
    [J]. NATURE, 1999, 402 (6757) : 86 - 90
  • [6] DISTINGUISHING HOMOLOGOUS FROM ANALOGOUS PROTEINS
    FITCH, WM
    [J]. SYSTEMATIC ZOOLOGY, 1970, 19 (02) : 99 - &
  • [7] Citrate synthase and 2-methylcitrate synthase: structural, functional and evolutionary relationships
    Gerike, U
    Hough, DW
    Russell, NJ
    Dyall-Smith, ML
    Danson, MJ
    [J]. MICROBIOLOGY-UK, 1998, 144 : 929 - 935
  • [8] The structural basis of molecular adaptation
    Golding, GB
    Dean, AM
    [J]. MOLECULAR BIOLOGY AND EVOLUTION, 1998, 15 (04) : 355 - 369
  • [9] Rapid evolution of expression and regulatory divergences after yeast gene duplication
    Gu, X
    Zhang, ZQ
    Huang, W
    [J]. PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2005, 102 (03) : 707 - 712
  • [10] A simple, fast, and accurate algorithm to estimate large phylogenies by maximum likelihood
    Guindon, S
    Gascuel, O
    [J]. SYSTEMATIC BIOLOGY, 2003, 52 (05) : 696 - 704