Modeling the evolution of protein domain architectures using maximum parsimony

被引:79
作者
Fong, Jessica H. [1 ]
Geer, Lewis Y. [1 ]
Panchenko, Anna R. [1 ]
Bryant, Stephen H. [1 ]
机构
[1] Natl Lib Med, Natl Ctr Biotechnol Informat, NIH, Bethesda, MD 20894 USA
基金
美国国家卫生研究院;
关键词
protein domain architecture; multi-domain proteins; domain recombination; domain evolution; protein family;
D O I
10.1016/j.jmb.2006.11.017
中图分类号
Q5 [生物化学]; Q7 [分子生物学];
学科分类号
071010 ; 081704 ;
摘要
Domains are basic evolutionary units of proteins and most proteins have more than one domain. Advances in domain modeling and collection are making it possible to annotate a large fraction of known protein sequences by a linear ordering of their domains, yielding their architecture. Protein domain architectures link evolutionarily related proteins and underscore their shared functions. Here, we attempt to better understand this association by identifying the evolutionary pathways by which extant architectures may have evolved. We propose a model of evolution in which architectures arise through rearrangements of inferred precursor architectures and acquisition of new domains. These pathways are ranked using a parsimony principle, whereby scenarios requiring the fewest number of independent recombination events, namely fission and fusion operations, are assumed to be more likely. Using a data set of domain architectures present in 159 proteomes that represent all three major branches of the tree of life allows us to estimate the history of over 85% of all architectures in the sequence database. We find that the distribution of rearrangement classes is robust with respect to alternative parsimony rules for inferring the presence of precursor architectures in ancestral species. Analyzing the most parsimonious pathways, we find 87% of architectures to gain complexity over time through simple changes, among which fusion events account for 5.6 times as many architectures as fission. Our results may be used to compute domain architecture similarities, for example, based on the number of historical recombination events separating them. Domain architecture "neighbors" identified in this way may lead to new insights about the evolution of protein function. Published by Elsevier Ltd.
引用
收藏
页码:307 / 315
页数:9
相关论文
共 37 条
[1]  
[Anonymous], CURR OPIN STRUCT BIO
[2]  
Apic G, 2001, Bioinformatics, V17 Suppl 1, pS83
[3]  
Apic Gordana, 2003, Journal of Structural and Functional Genomics, V4, P67, DOI 10.1023/A:1026113408773
[4]   The geometry of domain combination in proteins [J].
Bashton, M ;
Chothia, C .
JOURNAL OF MOLECULAR BIOLOGY, 2002, 315 (04) :927-939
[5]  
Bateman A, 2002, NUCLEIC ACIDS RES, V30, P276, DOI [10.1093/nar/gkr1065, 10.1093/nar/gkp985, 10.1093/nar/gkh121]
[6]   The evolution of domain arrangements in proteins and interaction networks [J].
Bornberg-Bauer, E ;
Beaussart, F ;
Kummerfeld, S ;
Teichmann, S ;
Weiner, J .
CELLULAR AND MOLECULAR LIFE SCIENCES, 2005, 62 (04) :435-445
[7]   Evolution of the protein repertoire [J].
Chothia, C ;
Gough, J ;
Vogel, C ;
Teichmann, SA .
SCIENCE, 2003, 300 (5626) :1701-1703
[8]   Profile hidden Markov models [J].
Eddy, SR .
BIOINFORMATICS, 1998, 14 (09) :755-763
[9]   Protein interaction maps for complete genomes based on gene fusion events [J].
Enright, AJ ;
Iliopoulos, I ;
Kyrpides, NC ;
Ouzounis, CA .
NATURE, 1999, 402 (6757) :86-90
[10]  
Enright AJ, 2001, GENOME BIOL, V2