Genome BLAST distance phylogenies inferred from whole plastid and whole mitochondrion genome sequences

Cited by: 61
Authors
Auch, Alexander F.
Henz, Stefan R.
Holland, Barbara R.
Goeker, Markus
Affiliations
[1] Univ Tubingen, Ctr Bioinformat, ZBIT, D-72074 Tubingen, Germany
[2] Max Planck Inst Dev Biol, Tubingen, Germany
[3] Massey Univ, Allan Wilson Ctr Mol Ecol & Evolut, Palmerston North, New Zealand
[4] Univ Tubingen, Organism Bot Mycol, D-72074 Tubingen, Germany
DOI
10.1186/1471-2105-7-350
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Background: Phylogenetic methods which do not rely on multiple sequence alignments are important tools in inferring trees directly from completely sequenced genomes. Here, we extend the recently described Genome BLAST Distance Phylogeny (GBDP) strategy to compute phylogenetic trees from all completely sequenced plastid genomes currently available and from a selection of mitochondrial genomes representing the major eukaryotic lineages. BLASTN, TBLASTX, or combinations of both are used to locate high-scoring segment pairs (HSPs) between two sequences, from which pairwise similarities and distances are computed in different ways, resulting in a total of 96 GBDP variants. The suitability of these distance formulae for phylogeny reconstruction is directly estimated by computing a recently described measure of "treelikeness", the so-called delta value, from the respective distance matrices. Additionally, we compare the trees inferred from these matrices using UPGMA, NJ, BIONJ, FastME, or STC, respectively, with the NCBI taxonomy tree of the taxa under study.
Results: Our results indicate that, at this taxonomic level, plastid genomes are much more valuable for inferring phylogenies than are mitochondrial genomes, and that distances based on breakpoints are of little use. Distances based on the proportion of "matched" HSP length to average genome length were best for tree estimation. Additionally, we found that using TBLASTX instead of BLASTN and, particularly, combining TBLASTX and BLASTN leads to a small but significant increase in accuracy. Other factors do not significantly affect the phylogenetic outcome. The BIONJ algorithm results in phylogenies most in accordance with the current NCBI taxonomy, with NJ and FastME performing insignificantly worse, and STC performing as well if applied to high quality distance matrices. Delta values are found to be a reliable predictor of phylogenetic accuracy.
Conclusion: Using the most treelike distance matrices, as judged by their delta values, distance methods are able to recover all major plant lineages, and are more in accordance with Apicomplexa organelles being derived from "green" plastids than from plastids of the "red" type. GBDP-like methods can be used to reliably infer phylogenies from different kinds of genomic data. A framework is established to further develop and improve such methods. Delta values are a topology-independent tool of general use for the development and assessment of distance methods for phylogenetic inference.
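The abstract refers to delta values as a measure of "treelikeness" but does not spell out the formula. The following is a minimal sketch, assuming the quartet-based definition commonly used for delta plots: for each set of four taxa, the three pairwise distance sums are sorted, and delta is the gap between the two largest sums divided by the gap between the largest and the smallest. A perfectly tree-like (additive) matrix yields 0 for every quartet. The function names `quartet_delta` and `mean_delta` are hypothetical and are not taken from the paper or from any GBDP software.

```python
from itertools import combinations


def quartet_delta(d, x, y, u, v):
    """Delta value of one quartet under the assumed delta-plot definition.

    d is a symmetric distance matrix given as a list of lists.
    """
    # The three ways of splitting four taxa into two pairs.
    sums = sorted([
        d[x][y] + d[u][v],
        d[x][u] + d[y][v],
        d[x][v] + d[y][u],
    ])
    smallest, middle, largest = sums
    if largest == smallest:
        # All three sums are equal; treat the quartet as perfectly tree-like.
        return 0.0
    return (largest - middle) / (largest - smallest)


def mean_delta(d):
    """Average quartet delta over all quartets of taxa in d (needs >= 4 taxa)."""
    values = [quartet_delta(d, *q) for q in combinations(range(len(d)), 4)]
    return sum(values) / len(values)


# Additive distances on the tree ((A,B),(C,D)): pendant edges 1, internal edge 2.
tree_like = [
    [0, 2, 4, 4],
    [2, 0, 4, 4],
    [4, 4, 0, 2],
    [4, 4, 2, 0],
]

# A matrix with conflicting signal: the two largest sums differ.
conflicting = [
    [0, 2, 3, 4],
    [2, 0, 4, 3],
    [3, 4, 0, 2],
    [4, 3, 2, 0],
]

print(mean_delta(tree_like))
print(mean_delta(conflicting))
```

Lower mean values indicate a more tree-like matrix, which is how the abstract uses delta to rank the GBDP distance variants. Enumerating all quartets costs O(n^4) time, so this sketch is only practical for modest numbers of taxa.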
Pages: 16