Global phylogeny determined by the combination of protein domains in proteomes

Citations: 67
Authors
Wang, Minglei [1]
Caetano-Anolles, Gustavo [1]
Affiliations
[1] Univ Illinois, Dept Crop Sci, Urbana, IL 61801 USA
Keywords
protein domains; evolution; phylogenomics; combinations
DOI
10.1093/molbev/msl117
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology]
Subject classification codes
071010; 081704
Abstract
The majority of proteins consist of multiple domains that are either repeated or combined in defined order. In this study, we survey the combination of protein domains defined at fold and fold superfamily levels in 185 genomes belonging to organisms that have been fully sequenced and introduce a method that reconstructs rooted phylogenomic trees from the content and arrangement of domains in proteins at a genomic level. We find that the majority of domain combinations were unique to Archaea, Bacteria, or Eukarya, suggesting most combinations originated after life had diversified. Domain repeat and domain repeat within multidomain proteins increased notably in eukaryotes, mainly at the expense of single-domain and domain-pair proteins. This increase was mostly confined to Metazoa. We also find an unbalanced sharing of domain combinations which suggests that Eukarya is more closely related to Bacteria than to Archaea, an observation that challenges the widely assumed eukaryote-archaebacterial sisterhood relationship. The occurrence and abundance of the molecular repertoire (interactome) of domain combinations was used to generate phylogenomic trees. These global interactome-based phylogenies described organismal histories satisfactorily, revealing the tripartite nature of life, and supporting controversial evolutionary patterns, such as the Coelomata hypothesis, the grouping of plants and animals, and the Gram-positive origin of bacteria. Results suggest strongly that the process of domain combination is not random but curved by evolution, rejecting the null hypothesis of domain modules combining in the absence of natural selection or an optimality criterion.
Pages: 2444-2454
Number of pages: 11