Computing Bayes factors using thermodynamic integration

Cited by: 494
Authors
Lartillot, N
Philippe, H
Affiliations
[1] Univ Montpellier 2, CNRS, Lab Informat Robot & Microelect Montpellier, UMR 5506, F-34392 Montpellier 5, France
[2] Univ Montreal, Dept Biochim, Canadian Inst Adv Res, Montreal, PQ H3C 3J7, Canada
Keywords
Bayes factor; harmonic mean; mixture model; path sampling; phylogeny; thermodynamic integration
DOI
10.1080/10635150500433722
Chinese Library Classification (CLC) number
Q [Biological Sciences]
Subject classification codes
07; 0710; 09
Abstract
In the Bayesian paradigm, a common method for comparing two models is to compute the Bayes factor, defined as the ratio of their respective marginal likelihoods. In recent phylogenetic works, the numerical evaluation of marginal likelihoods has often been performed using the harmonic mean estimation procedure. In the present article, we propose to employ another method, based on an analogy with statistical physics, called thermodynamic integration. We describe the method, propose an implementation, and show on two analytical examples that this numerical method yields reliable estimates. In contrast, the harmonic mean estimator leads to a strong overestimation of the marginal likelihood, which is all the more pronounced as the model is higher dimensional. As a result, the harmonic mean estimator systematically favors more parameter-rich models, an artefact that might explain some recent puzzling observations, based on harmonic mean estimates, suggesting that Bayes factors tend to overscore complex models. Finally, we apply our method to the comparison of several alternative models of amino-acid replacement. We confirm our previous observations, indicating that modeling pattern heterogeneity across sites tends to yield better models than standard empirical matrices.
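The contrast drawn in the abstract can be illustrated with a minimal sketch. The toy model here is an assumption and is not taken from the article: i.i.d. data y_i ~ Normal(theta, 1) with prior theta ~ Normal(0, 1), chosen because the marginal likelihood has a closed form and the tempered ("power") posterior p(theta) p(y | theta)^beta is Gaussian, so it can be sampled exactly without MCMC. Thermodynamic integration then estimates log p(y) as the integral over beta from 0 to 1 of the expected log-likelihood under the power posterior. All function names, the temperature grid beta_k = (k/K)^5, and the sample sizes are hypothetical choices made for illustration.

```python
import math
import random


def summarize(ys):
    """Sufficient statistics (n, sum, sum of squares) of the data."""
    return len(ys), sum(ys), sum(y * y for y in ys)


def log_likelihood(theta, stats):
    """log p(y | theta) for i.i.d. y_i ~ Normal(theta, 1)."""
    n, s, ss = stats
    sq_err = ss - 2.0 * theta * s + n * theta * theta
    return -0.5 * n * math.log(2.0 * math.pi) - 0.5 * sq_err


def exact_log_marginal(stats):
    """Closed-form log p(y) under the assumed prior theta ~ Normal(0, 1)."""
    n, s, ss = stats
    return (-0.5 * n * math.log(2.0 * math.pi)
            - 0.5 * math.log(1.0 + n)
            - 0.5 * (ss - s * s / (1.0 + n)))


def sample_power_posterior(beta, stats, rng):
    """Draw theta from p(theta) * p(y | theta)**beta (Gaussian in this toy model)."""
    n, s, _ = stats
    precision = 1.0 + beta * n
    return rng.gauss(beta * s / precision, 1.0 / math.sqrt(precision))


def thermodynamic_integration(stats, n_temps=51, n_draws=1000, seed=0):
    """Estimate log p(y) by integrating E_beta[log p(y | theta)] over beta in [0, 1]."""
    rng = random.Random(seed)
    # Assumed grid: points concentrated near beta = 0, where the integrand changes fastest.
    betas = [(k / (n_temps - 1)) ** 5 for k in range(n_temps)]
    means = []
    for beta in betas:
        total = sum(log_likelihood(sample_power_posterior(beta, stats, rng), stats)
                    for _ in range(n_draws))
        means.append(total / n_draws)
    # Trapezoidal rule over the temperature grid.
    return sum(0.5 * (means[k] + means[k + 1]) * (betas[k + 1] - betas[k])
               for k in range(n_temps - 1))


def harmonic_mean_log_marginal(stats, n_draws=1000, seed=0):
    """Harmonic mean estimate: -log of the posterior mean of 1 / likelihood."""
    rng = random.Random(seed)
    neg = [-log_likelihood(sample_power_posterior(1.0, stats, rng), stats)
           for _ in range(n_draws)]
    top = max(neg)  # log-sum-exp shift for numerical stability
    return -(top + math.log(sum(math.exp(v - top) for v in neg) / n_draws))


if __name__ == "__main__":
    data_rng = random.Random(1)
    stats = summarize([data_rng.gauss(1.0, 1.0) for _ in range(20)])
    print("exact log p(y):           ", exact_log_marginal(stats))
    print("thermodynamic integration:", thermodynamic_integration(stats))
    print("harmonic mean:            ", harmonic_mean_log_marginal(stats))
```

Because the exact value is available, the two estimators can be compared directly against it. How far the harmonic mean estimate departs from the exact value in this one-parameter setting depends on the seed and the number of draws; the abstract's claim concerns the stronger bias seen as model dimension grows, which this sketch does not attempt to reproduce.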
Pages: 195-207
Page count: 13