Context-dependent codon partition models provide significant increases in model fit in atpB and rbcL protein-coding genes

Cited by: 4
Authors
Baele, Guy [1 ,2 ,3 ]
Van de Peer, Yves [1 ,2 ]
Vansteelandt, Stijn [4 ]
Affiliations
[1] VIB, Dept Plant Syst Biol, B-9052 Ghent, Belgium
[2] Univ Ghent, Dept Mol Genet, B-9052 Ghent, Belgium
[3] Katholieke Univ Leuven, Rega Inst, Dept Microbiol & Immunol, B-3000 Louvain, Belgium
[4] Univ Ghent, Dept Appl Math & Comp Sci, B-9000 Ghent, Belgium
Source
BMC EVOLUTIONARY BIOLOGY | 2011 / Vol. 11
Funding
European Research Council;
Keywords
NEIGHBORING BASE COMPOSITION; NUCLEOTIDE SUBSTITUTION; POSTERIOR DISTRIBUTIONS; PHYLOGENY; PATTERNS; IMPACT; BIAS;
DOI
10.1186/1471-2148-11-145
Chinese Library Classification number
Q [Biological Sciences];
Discipline classification code
07 ; 0710 ; 09 ;
Abstract
Background: Accurate modelling of substitution processes in protein-coding sequences is often hampered by the computational burden of full codon models. Codon partition models have recently been proposed as a viable alternative that mimics the substitution behaviour of codon models at a low computational cost. Such codon partition models, however, impose independent evolution of the different codon positions, which is overly restrictive from a biological point of view. Because empirical research has indicated context-dependent substitution patterns at four-fold degenerate sites, we take those indications into account in this paper.
Results: We present so-called context-dependent codon partition models to assess previous empirical claims that the evolution of four-fold degenerate sites depends strongly on the composition of their two flanking bases. To this end, we have estimated and compared various existing independent models, codon models, codon partition models and context-dependent codon partition models for the atpB and rbcL genes of the chloroplast genome, which are frequently used in plant systematics. The context-dependent codon partition models employ a full dependency scheme for four-fold degenerate sites while maintaining the independence assumption for the first and second codon positions.
Conclusions: We show that, in both the atpB and rbcL alignments of a collection of land plants, these context-dependent codon partition models significantly improve model fit over existing codon partition models. Using Bayes factors based on thermodynamic integration, we show that in both datasets the same context-dependent codon partition model yields the largest increase in model fit compared to an independent evolutionary model. Context-dependent codon partition models hence perform closer to codon models than existing codon partition models do. Codon models remain the best-performing models, but at a drastically increased computational cost, so context-dependent codon partition models remain computationally attractive alternatives. Finally, we observe that the substitution patterns in the two datasets are drastically different, leading to the conclusion that a combined analysis of these two genes using a single model may not be advisable from a context-dependent point of view.
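As a rough illustration of the sites the abstract refers to, the following Python sketch lists the four-fold degenerate third codon positions in a coding sequence together with their two flanking bases (the second base of the same codon and the first base of the next codon). It is not the authors' implementation. The function names `fourfold_contexts` and `context_counts` are hypothetical, and the sketch assumes the standard genetic code, an in-frame sequence, and that sites with gaps or ambiguous bases are simply skipped.

```python
from collections import Counter

# First two codon bases of the eight four-fold degenerate families in the
# standard genetic code (Leu, Val, Ser, Pro, Thr, Ala, Arg, Gly).
FOURFOLD_PREFIXES = {"CT", "GT", "TC", "CC", "AC", "GC", "CG", "GG"}
BASES = "ACGT"


def fourfold_contexts(cds):
    """Yield (left, site, right) for each four-fold degenerate third position.

    left is the second base of the same codon and right is the first base of
    the next codon. The final codon is skipped because it has no right
    neighbour. Assumes (hypothetically) that cds starts in frame.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore trailing partial codon
    for start in range(0, usable - 3, 3):
        codon = cds[start:start + 3]
        right = cds[start + 3]
        if codon[:2] not in FOURFOLD_PREFIXES:
            continue
        # Skip gaps and ambiguity codes at the site or its right neighbour.
        if codon[2] not in BASES or right not in BASES:
            continue
        yield codon[1], codon[2], right


def context_counts(cds):
    """Count four-fold degenerate sites per (left, right) flanking pair."""
    return Counter((left, right) for left, _, right in fourfold_contexts(cds))


example = "ATGGCTCCAGGTTAA"  # codons: ATG GCT CCA GGT TAA
contexts = list(fourfold_contexts(example))
counts = context_counts(example)
```

For the toy sequence above, `contexts` holds one triple for each of the codons GCT, CCA and GGT. A context-dependent model would let the substitution behaviour at each such site depend on its (left, right) pair, whereas a plain codon partition model treats all third positions alike.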
Pages: 17