Multiple sequence alignment accuracy and phylogenetic inference

Cited by: 174
Authors
Ogden, TH [1]
Rosenberg, MS
Affiliations
[1] Arizona State Univ, Biodesign Inst, Ctr Evolut Funct Genom, Tempe, AZ 85287 USA
[2] Arizona State Univ, Sch Life Sci, Tempe, AZ 85287 USA
Keywords
Bayesian; maximum likelihood; maximum parsimony; multiple sequence alignment; neighbor joining; phylogenetics; simulation; tree reconstruction
DOI
10.1080/10635150500541730
Chinese Library Classification number
Q [Biological Sciences]
Subject classification codes
07; 0710; 09
Abstract
Phylogenies are often thought to depend more on the specifics of the sequence alignment than on the method of reconstruction. Sequences containing insertion and deletion events were simulated to determine the role that alignment accuracy plays in phylogenetic inference. Data sets were simulated for pectinate, balanced, and random tree shapes under different conditions (ultrametric equal branch length, ultrametric random branch length, nonultrametric random branch length). Comparisons between hypothesized alignments and true alignments enabled determination of two measures of alignment accuracy: that of the total data set and that of individual branches. In general, our results indicate that as alignment error increases, topological accuracy decreases. This trend was much more pronounced for data sets derived from more pectinate topologies. In contrast, for balanced, ultrametric, equal branch length tree shapes, alignment inaccuracy had little average effect on tree reconstruction. These conclusions are based on average trends across many analyses under different conditions, and any one specific analysis, independent of the alignment accuracy, may recover very accurate or very inaccurate topologies. Maximum likelihood and Bayesian methods generally outperformed neighbor joining and maximum parsimony in terms of tree reconstruction accuracy. Results also indicated that as the lengths of a branch and of its neighboring branches increase, alignment accuracy decreases, and that the length of the neighboring branches is the major factor in topological accuracy. Thus, multiple sequence alignment can be an important factor in downstream effects on topological reconstruction.
Pages: 314-328
Page count: 15