From complete genomes to measures of substitution rate variability within and between proteins

Cited by: 61
Authors
Grishin, NV
Wolf, YI
Koonin, EV
Affiliations
[1] Natl Lib Med, Natl Ctr Biotechnol Informat, NIH, Bethesda, MD 20894 USA
[2] Russian Acad Sci, Inst Cytol & Genet, Novosibirsk 630090, Russia
DOI
10.1101/gr.10.7.991
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010 ; 081704 ;
Abstract
Accumulation of complete genome sequences of diverse organisms creates new possibilities for evolutionary inferences from whole-genome comparisons. In the present study, we analyze the distributions of substitution rates among proteins encoded in 19 complete genomes (the interprotein rate distribution). To estimate these rates, it is necessary to employ another fundamental distribution, that of the substitution rates among sites in proteins (the intraprotein distribution). Using two independent approaches, we show that intraprotein substitution rate variability appears to be significantly greater than generally accepted. This yields more realistic estimates of evolutionary distances from amino-acid sequences, which is critical for evolutionary-tree construction. We demonstrate that the interprotein rate distributions inferred from the genome-to-genome comparisons are similar to each other and can be approximated by a single distribution with a long exponential shoulder. This suggests that a generalized version of the molecular clock hypothesis may be valid on genome scale. We also use the scaling parameter of the obtained interprotein rate distribution to construct a rooted whole-genome phylogeny. The topology of the resulting tree is largely compatible with those of global rRNA-based trees and trees produced by other approaches to genome-wide comparison.
Pages: 991-1000
Page count: 10
Related papers
44 in total
  • [1] Gapped BLAST and PSI-BLAST: a new generation of protein database search programs
    Altschul, SF
    Madden, TL
    Schaffer, AA
    Zhang, JH
    Zhang, Z
    Miller, W
    Lipman, DJ
    [J]. NUCLEIC ACIDS RESEARCH, 1997, 25 (17) : 3389 - 3402
  • [2] The genome sequence of Rickettsia prowazekii and the origin of mitochondria
    Andersson, SGE
    Zomorodipour, A
    Andersson, JO
    Sicheritz-Pontén, T
    Alsmark, UCM
    Podowski, RM
    Näslund, AK
    Eriksson, AS
    Winkler, HH
    Kurland, CG
    [J]. NATURE, 1998, 396 (6707) : 133 - 140
  • [3] [Anonymous], 1978, Atlas of protein sequence and structure
  • [4] Pfam 3.1: 1313 multiple alignments and profile HMMs match the majority of proteins
    Bateman, A
    Birney, E
    Durbin, R
    Eddy, SR
    Finn, RD
    Sonnhammer, ELL
    [J]. NUCLEIC ACIDS RESEARCH, 1999, 27 (01) : 260 - 262
  • [5] Archaea and the prokaryote-to-eukaryote transition
    Brown, JR
    Doolittle, WF
    [J]. MICROBIOLOGY AND MOLECULAR BIOLOGY REVIEWS, 1997, 61 (04) : 456 - +
  • [6] Determining divergence times of the major kingdoms of living organisms with a protein clock
    Doolittle, RF
    Feng, DF
    Tsang, S
    Cho, G
    Little, E
    [J]. SCIENCE, 1996, 271 (5248) : 470 - 477
  • [7] Phylogenetic classification and the universal tree
    Doolittle, WF
    [J]. SCIENCE, 1999, 284 (5423) : 2124 - 2128
  • [8] Felsenstein J, 1996, METHOD ENZYMOL, V266, P418
  • [9] Converting amino acid alignment scores into measures of evolutionary time: A simulation study of various relationships
    Feng, DF
    Doolittle, RF
    [J]. JOURNAL OF MOLECULAR EVOLUTION, 1997, 44 (04) : 361 - 370
  • [10] Determining divergence times with a protein clock: Update and reevaluation
    Feng, DF
    Cho, G
    Doolittle, RF
    [J]. PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 1997, 94 (24) : 13028 - 13033