The importance of data partitioning and the utility of Bayes factors in Bayesian phylogenetics

Cited by: 276
Authors
Brown, Jeremy M. [1 ]
Lemmon, Alan R. [1 ]
Affiliations
[1] Univ Texas, Sect Integrat Biol, Austin, TX 78712 USA
Funding
U.S. National Science Foundation
DOI
10.1080/10635150701546249
Chinese Library Classification
Q [Biological Sciences]
Discipline classification codes
07; 0710; 09
Abstract
As larger, more complex data sets are being used to infer phylogenies, accuracy of these phylogenies increasingly requires models of evolution that accommodate heterogeneity in the processes of molecular evolution. We investigated the effect of improper data partitioning on phylogenetic accuracy, as well as the type I error rate and sensitivity of Bayes factors, a commonly used method for choosing among different partitioning strategies in Bayesian analyses. We also used Bayes factors to test empirical data for the need to divide data in a manner that has no expected biological meaning. Posterior probability estimates are misleading when an incorrect partitioning strategy is assumed. The error was greatest when the assumed model was underpartitioned. These results suggest that model partitioning is important for large data sets. Bayes factors performed well, giving a 5% type I error rate, which is remarkably consistent with standard frequentist hypothesis tests. The sensitivity of Bayes factors was found to be quite high when the across-class model heterogeneity reflected that of empirical data. These results suggest that Bayes factors represent a robust method of choosing among partitioning strategies. Lastly, results of tests for the inclusion of unexpected divisions in empirical data mirrored the simulation results, although the outcome of such tests is highly dependent on accounting for rate variation among classes. We conclude by discussing other approaches for partitioning data, as well as other applications of Bayes factors.
Pages: 643-655
Page count: 13
Related papers
26 in total
[1]   NEW LOOK AT STATISTICAL-MODEL IDENTIFICATION [J].
AKAIKE, H .
IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 1974, AC19 (06) :716-723
[2]  
[Anonymous], 2000, MATH BOOK
[3]   Partitioned Bayesian analyses, partition choice, and the phylogenetic relationships of scincid lizards [J].
Brandley, MC ;
Schmitz, A ;
Reeder, TW .
SYSTEMATIC BIOLOGY, 2005, 54 (03) :373-390
[4]   Bayesian mixed models and the phylogeny of pitvipers (Viperidae: Serpentes) [J].
Castoe, TA ;
Parkinson, CL .
MOLECULAR PHYLOGENETICS AND EVOLUTION, 2006, 39 (01) :91-110
[5]   Data partitions and complex models in Bayesian analysis: The phylogeny of Gymnophthalmid lizards [J].
Castoe, TA ;
Doan, TM ;
Parkinson, CL .
SYSTEMATIC BIOLOGY, 2004, 53 (03) :448-469
[6]   CASES IN WHICH PARSIMONY OR COMPATIBILITY METHODS WILL BE POSITIVELY MISLEADING [J].
FELSENSTEIN, J .
SYSTEMATIC ZOOLOGY, 1978, 27 (04) :401-410
[7]   Frequentist properties of Bayesian posterior probabilities of phylogenetic trees under simple and complex substitution models [J].
Huelsenbeck, JP ;
Rannala, B .
SYSTEMATIC BIOLOGY, 2004, 53 (06) :904-913
[8]   SUCCESS OF PHYLOGENETIC METHODS IN THE 4-TAXON CASE [J].
HUELSENBECK, JP ;
HILLIS, DM .
SYSTEMATIC BIOLOGY, 1993, 42 (03) :247-264
[9]   Some tests of significance, treated by the theory of probability [J].
Jeffreys, H .
PROCEEDINGS OF THE CAMBRIDGE PHILOSOPHICAL SOCIETY, 1935, 31 :203-222
[10]  
Jeffreys H., 1998, The Theory of Probability