Evaluating the performance of a successive-approximations approach to parameter optimization in maximum-likelihood phylogeny estimation

Cited by: 101
Authors
Sullivan, J [1]
Abdo, Z
Joyce, P
Swofford, DL
Affiliations
[1] Univ Idaho, Dept Biol Sci, Moscow, ID 83843 USA
[2] Univ Idaho, Program Bioinformat & Computat Biol, Moscow, ID 83843 USA
[3] Univ Idaho, Dept Math, Moscow, ID 83843 USA
[4] Florida State Univ, Sch Computat Sci, Tallahassee, FL 32306 USA
[5] Florida State Univ, Dept Biol Sci, Tallahassee, FL 32306 USA
Keywords
maximum likelihood; models; phylogeny; successive approximations; parameter estimation
DOI
10.1093/molbev/msi129
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
Almost all studies that estimate phylogenies from DNA sequence data under the maximum-likelihood (ML) criterion employ an approximate approach. Most commonly, model parameters are estimated on some initial phylogenetic estimate derived using a rapid method (neighbor-joining or parsimony). Parameters are then held constant during a tree search, and ideally, the procedure is repeated until convergence is achieved. However, the effectiveness of this approximation has not been formally assessed, in part because doing so requires computationally intensive, full-optimization analyses. Here, we report both indirect and direct evaluations of the effectiveness of successive approximations. We obtained an indirect evaluation by comparing the results of replicate runs on real data that use random trees to provide initial parameter estimates. For six real data sets taken from the literature, all replicate iterative searches converged to the same joint estimates of topology and model parameters, suggesting that the approximation is not starting-point dependent, as long as the heuristic searches of tree space are rigorous. We conducted a more direct assessment using simulations in which we compared the accuracy of phylogenies estimated using full optimization of all model parameters on each tree evaluated to the accuracy of trees estimated via successive approximations. There is no significant difference between the accuracy of the approximation searches and that of the full-optimization searches. Our results demonstrate that successive approximation is reliable and provide reassurance that this much faster approach is safe to use for ML estimation of topology.
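The iterative procedure described in the abstract (fit parameters on a starting tree, hold them fixed during a tree search, repeat until nothing changes) can be illustrated with a minimal toy sketch. Everything in it is an assumption made for illustration: the three labelled "trees", the single parameter `theta`, the made-up `toy_score` function standing in for a log-likelihood, and the grid search standing in for numerical optimization. It is not the authors' implementation and does not reflect any phylogenetics package.

```python
# Toy stand-ins (hypothetical): three candidate "topologies" and one model parameter.
TREES = ["A", "B", "C"]
BEST_THETA = {"A": 0.20, "B": 0.50, "C": 0.55}  # parameter optimum on each tree
PEAK = {"A": 0.0, "B": 1.0, "C": 1.2}           # score reached at that optimum
GRID = [i / 100 for i in range(101)]            # candidate parameter values


def toy_score(tree, theta):
    """Made-up stand-in for a log-likelihood; higher is better."""
    return PEAK[tree] - (theta - BEST_THETA[tree]) ** 2


def fit_parameter(tree):
    """Optimize the parameter with the tree held fixed (grid search)."""
    return max(GRID, key=lambda theta: toy_score(tree, theta))


def search_trees(theta):
    """Search the candidate trees with the parameter held fixed."""
    return max(TREES, key=lambda tree: toy_score(tree, theta))


def successive_approximation(start_tree, max_rounds=20):
    """Alternate parameter fitting and tree search until both stop changing."""
    tree = start_tree
    theta = fit_parameter(tree)
    for _ in range(max_rounds):
        new_tree = search_trees(theta)
        new_theta = fit_parameter(new_tree)
        if (new_tree, new_theta) == (tree, theta):
            break  # converged: same tree and same parameter estimate
        tree, theta = new_tree, new_theta
    return tree, theta


def full_optimization():
    """Re-optimize the parameter on every tree evaluated, then pick the best."""
    fitted = [(tree, fit_parameter(tree)) for tree in TREES]
    return max(fitted, key=lambda pair: toy_score(*pair))


if __name__ == "__main__":
    for start in TREES:
        print(start, successive_approximation(start), full_optimization())
```

In this toy setting every starting tree leads to the same joint estimate that full optimization finds. That agreement is a property of the made-up score function and says nothing about real data; the sketch only shows the shape of the two search strategies being compared.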
Pages: 1386-1392
Page count: 7