The effect of taxon sampling on estimating rate heterogeneity parameters of maximum-likelihood models

Cited by: 135
Authors
Sullivan, J [1]
Swofford, DL
Naylor, GJP
Affiliations
[1] Univ Idaho, Dept Sci Biol, Moscow, ID 83844 USA
[2] Iowa State Univ, Dept Zool, Ames, IA 50011 USA
[3] Smithsonian Inst, Lab Mol Systemat, Washington, DC 20560 USA
Keywords
rate heterogeneity; parameter estimation; maximum likelihood; molecular phylogeny; parametric bootstrap; simulation; stationarity
DOI
10.1093/oxfordjournals.molbev.a026045
CLC classification number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
As maximum-likelihood approaches to the study of molecular systematics and evolution become both more flexible and more accessible, the importance of understanding the statistical properties of parameter estimation becomes critical. Using variation in NADH-2 sequences for 40 species of requiem sharks, we illustrate that estimates of rate heterogeneity parameters are highly sensitive to taxon sampling when the data are best explained by a mixed-distribution model of among-site rate variation (invariable sites plus gamma distribution [I+Gamma]). Using computer simulation, we attempt to differentiate two possible causes of this sensitivity. While the possibility of nonstationarity cannot be definitively rejected, our results suggest that sampling error alone provides an adequate explanation for the pattern of uncertainty observed in estimates from the real data. Furthermore, we illustrate that two parameters estimated under the I+Gamma model (the proportion of sites not free to change and the gamma distribution shape parameter) are highly correlated and that the likelihood surface across the rate heterogeneity parameter space can be poorly behaved when only a small number of sequences (taxa) are considered.
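To make the I+Gamma model named in the abstract concrete, here is a minimal sketch of how per-site rates could be drawn under it. The function name `sample_site_rates` and the parameter names `p_inv` and `alpha` are illustrative, not taken from the paper, and the rescaling that keeps the overall mean rate at 1 is a common convention assumed here rather than the authors' stated procedure.

```python
import random


def sample_site_rates(n_sites, p_inv, alpha, seed=0):
    """Draw relative rates for n_sites under an invariable-sites-plus-gamma model.

    p_inv : proportion of sites not free to change (rate exactly 0), must be < 1
    alpha : gamma shape parameter for the sites that are free to change
    Assumption: variable-site rates are rescaled so the mean over all sites is 1.
    """
    if not 0.0 <= p_inv < 1.0:
        raise ValueError("p_inv must be in [0, 1)")
    if alpha <= 0.0:
        raise ValueError("alpha must be positive")
    rng = random.Random(seed)
    rates = []
    for _ in range(n_sites):
        if rng.random() < p_inv:
            # Invariable site: it never changes.
            rates.append(0.0)
        else:
            # Gamma with shape alpha and scale 1/alpha has mean 1;
            # dividing by (1 - p_inv) restores an overall mean of 1.
            rates.append(rng.gammavariate(alpha, 1.0 / alpha) / (1.0 - p_inv))
    return rates


if __name__ == "__main__":
    rates = sample_site_rates(10000, p_inv=0.3, alpha=0.5)
    slow = sum(1 for r in rates if r < 0.05) / len(rates)
    print(f"fraction of sites with rate < 0.05: {slow:.3f}")
```

Both a larger `p_inv` and a smaller `alpha` increase the share of sites with rates at or near zero, which gives some intuition for why the abstract reports the two estimates as highly correlated.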
Pages: 1347-1356
Page count: 10