Modeling compositional heterogeneity

Cited by: 341
Authors
Foster, PG [1]
Affiliations
[1] Nat Hist Museum, Dept Zool, London SW7 5BD, England
Keywords
Compositional heterogeneity; Markov chain Monte Carlo; maximum likelihood; model assessment; model selection; phylogenetics
DOI
10.1080/10635150490445779
Chinese Library Classification (CLC) number
Q [Biological sciences]
Subject classification codes
07; 0710; 09
Abstract
Compositional heterogeneity among lineages can compromise phylogenetic analyses, because models in common use assume compositionally homogeneous data. Models that can accommodate compositional heterogeneity with few extra parameters are described here, and used in two examples where the true tree is known with confidence. It is shown using likelihood ratio tests that adequate modeling of compositional heterogeneity can be achieved with few composition parameters, and that the data may not need to be modeled with separate composition parameters for each branch in the tree. Tree searching and placement of composition vectors on the tree are done in a Bayesian framework using Markov chain Monte Carlo (MCMC) methods. Assessment of fit of the model to the data is made in both maximum likelihood (ML) and Bayesian frameworks. In an ML framework, overall model fit is assessed using the Goldman-Cox test, and the fit of the composition implied by a (possibly heterogeneous) model to the composition of the data is assessed using a novel tree- and model-based composition fit test. In a Bayesian framework, overall model fit and composition fit are assessed using posterior predictive simulation. It is shown that when composition is not accommodated, the model does not fit, and incorrect trees are found; but when composition is accommodated, the model fits, and the known correct phylogenies are obtained.
Pages: 485-495
Page count: 11