The Performance of the Date-Randomization Test in Phylogenetic Analyses of Time-Structured Virus Data

Cited by: 135
Authors
Duchene, Sebastian [1]
Duchene, David [2]
Holmes, Edward C. [1,3]
Ho, Simon Y. W. [1]
Affiliations
[1] Univ Sydney, Sch Biol Sci, Sydney, NSW 2006, Australia
[2] Australian Natl Univ, Res Sch Biol, Canberra, ACT, Australia
[3] Univ Sydney, Sydney Med Sch, Charles Perkins Ctr, Marie Bashir Inst Infect Dis & Biosecur, Sydney, NSW 2006, Australia
Funding
UK Medical Research Council; Australian Research Council
Keywords
molecular clock; date-randomization test; tip calibrations; virus evolution; Bayesian phylogenetics; time-structured sequence data; ESTIMATING EVOLUTIONARY RATES; ANCIENT DNA; MOLECULAR EVOLUTION; SUBSTITUTION RATES; INFERENCE; CLOCK; SEQUENCES; HISTORY; DYNAMICS; REVEALS;
DOI
10.1093/molbev/msv056
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification codes
071010; 081704
Abstract
Rates and timescales of viral evolution can be estimated using phylogenetic analyses of time-structured molecular sequences. This involves the use of molecular-clock methods, calibrated by the sampling times of the viral sequences. However, the spread of these sampling times is not always sufficient to allow the substitution rate to be estimated accurately. We conducted Bayesian phylogenetic analyses of simulated virus data to evaluate the performance of the date-randomization test, which is sometimes used to investigate whether time-structured data sets have temporal signal. An estimate of the substitution rate passes this test if its mean does not fall within the 95% credible intervals of rate estimates obtained using replicate data sets in which the sampling times have been randomized. We find that the test sometimes fails to detect rate estimates from data with no temporal signal. This error can be minimized by using a more conservative criterion, whereby the 95% credible interval of the estimate with correct sampling times should not overlap with those obtained with randomized sampling times. We also investigated the behavior of the test when the sampling times are not uniformly distributed throughout the tree, which sometimes occurs in empirical data sets. The test performs poorly in these circumstances, such that a modification to the randomization scheme is needed. Finally, we illustrate the behavior of the test in analyses of nucleotide sequences of cereal yellow dwarf virus. Our results validate the use of the date-randomization test and allow us to propose guidelines for interpretation of its results.
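The two pass criteria described in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the names `RateEstimate`, `randomize_dates`, `passes_standard_criterion`, and `passes_conservative_criterion` are hypothetical, and the rate estimates are assumed to come from separate Bayesian phylogenetic runs that the sketch does not perform. The shuffle shown is a plain permutation of sampling times across sequences; the modified randomization scheme the abstract mentions for non-uniformly distributed sampling times is not covered.

```python
import random
from dataclasses import dataclass


@dataclass
class RateEstimate:
    """Posterior summary of a substitution rate (hypothetical container)."""
    mean: float
    lower: float  # lower bound of the 95% credible interval
    upper: float  # upper bound of the 95% credible interval


def randomize_dates(dates, rng):
    """Return a copy of `dates` with sampling times permuted across sequences."""
    shuffled = list(dates.values())
    rng.shuffle(shuffled)
    return dict(zip(dates.keys(), shuffled))


def passes_standard_criterion(real, randomized):
    """Pass if the mean of the real estimate lies outside every randomized interval."""
    return all(not (r.lower <= real.mean <= r.upper) for r in randomized)


def passes_conservative_criterion(real, randomized):
    """Pass if the real interval overlaps none of the randomized intervals."""
    return all(real.upper < r.lower or r.upper < real.lower for r in randomized)


# Made-up numbers for illustration only (not results from the paper).
real = RateEstimate(mean=1.0e-3, lower=6.0e-4, upper=1.5e-3)
randomized = [
    RateEstimate(mean=2.0e-4, lower=1.0e-5, upper=7.0e-4),
    RateEstimate(mean=1.0e-4, lower=1.0e-5, upper=4.0e-4),
]
standard_ok = passes_standard_criterion(real, randomized)
conservative_ok = passes_conservative_criterion(real, randomized)
```

With these illustrative values the real mean lies outside both randomized intervals, so the standard criterion is met, while the real interval overlaps the first randomized interval, so the conservative criterion is not. This mirrors the case in which the stricter rule rejects a data set that the original rule would accept.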
Pages: 1895-1906
Number of pages: 12
Related papers
52 in total
[1]   BEAST 2: A Software Platform for Bayesian Evolutionary Analysis [J].
Bouckaert, Remco ;
Heled, Joseph ;
Kuehnert, Denise ;
Vaughan, Tim ;
Wu, Chieh-Hsi ;
Xie, Dong ;
Suchard, Marc A. ;
Rambaut, Andrew ;
Drummond, Alexei J. .
PLOS COMPUTATIONAL BIOLOGY, 2014, 10 (04)
[2]   Time Dependency of Molecular Rates in Ancient DNA Data Sets, A Sampling Artifact? [J].
Debruyne, Regis ;
Poinar, Hendrik N. .
SYSTEMATIC BIOLOGY, 2009, 58 (03) :348-359
[3]   Inference of viral evolutionary rates from molecular sequences [J].
Drummond, A ;
Pybus, OG ;
Rambaut, A .
ADVANCES IN PARASITOLOGY, VOL 54: THE EVOLUTION OF PARASITISM-A PHYLOGENETIC PERSPECTIVE, 2003, 54 :331-358
[4]   The inference of stepwise changes in substitution rates using serial sequence samples [J].
Drummond, A ;
Forsberg, R ;
Rodrigo, AG .
MOLECULAR BIOLOGY AND EVOLUTION, 2001, 18 (07) :1365-1371
[5]   Reconstructing genealogies of serial samples under the assumption of a molecular clock using serial-sample UPGMA [J].
Drummond, A ;
Rodrigo, AG .
MOLECULAR BIOLOGY AND EVOLUTION, 2000, 17 (12) :1807-1815
[6]   Measurably evolving populations [J].
Drummond, AJ ;
Pybus, OG ;
Rambaut, A ;
Forsberg, R ;
Rodrigo, AG .
TRENDS IN ECOLOGY & EVOLUTION, 2003, 18 (09) :481-488
[7]  
Drummond AJ, 2005, MOL BIOL EVOL, V22, P1185, DOI [10.1093/molbev/msi103, 10.1093/molbev/mss075]
[8]  
Drummond AJ, 2002, GENETICS, V161, P1307
[9]   Relaxed phylogenetics and dating with confidence [J].
Drummond, Alexei J. ;
Ho, Simon Y. W. ;
Phillips, Matthew J. ;
Rambaut, Andrew .
PLOS BIOLOGY, 2006, 4 (05) :699-710
[10]   Tree imbalance causes a bias in phylogenetic estimation of evolutionary timescales using heterochronous sequences [J].
Duchene, David ;
Duchene, Sebastian ;
Ho, Simon Y. W. .
MOLECULAR ECOLOGY RESOURCES, 2015, 15 (04) :785-794