Inferring and Validating Horizontal Gene Transfer Events Using Bipartition Dissimilarity

Cited by: 60
Authors
Boc, Alix [1]
Philippe, Herve [2]
Makarenkov, Vladimir [1]
Affiliations
[1] Univ Quebec, Dept Informat, Montreal, PQ H3C 3P8, Canada
[2] Univ Montreal, Dept Biochim, Fac Med, Montreal, PQ H3C 3J7, Canada
Funding
Natural Sciences and Engineering Research Council of Canada
Keywords
Bipartition dissimilarity; bootstrap analysis; horizontal gene transfer; least squares; phylogenetic tree; quartet distance; Robinson and Foulds topological distance; HISTORICAL ASSOCIATIONS; PHYLOGENETIC NETWORKS; MAXIMUM-LIKELIHOOD; SEQUENCE EVOLUTION; TREES; GENOMES; RATES; SUBSTITUTION; ALGORITHMS; SIMULATION;
DOI
10.1093/sysbio/syp103
Chinese Library Classification number
Q [Biological Sciences];
Discipline classification codes
07 ; 0710 ; 09 ;
Abstract
Horizontal gene transfer (HGT) is one of the main mechanisms driving the evolution of microorganisms. Its accurate identification is one of the major challenges posed by reticulate evolution. In this article, we describe a new polynomial-time algorithm for inferring HGT events and compare 3 existing and 1 new tree comparison indices in the context of HGT identification. The proposed algorithm can rely on different optimization criteria, including least squares (LS), Robinson and Foulds (RF) distance, quartet distance (QD), and bipartition dissimilarity (BD), when searching for an optimal scenario of subtree prune and regraft (SPR) moves needed to transform the given species tree into the given gene tree. As the simulation results suggest, the algorithmic strategy based on BD, introduced in this article, generally provides better results than those based on LS, RF, and QD. The BD-based algorithm also proved to be more accurate and faster than the well-known polynomial-time heuristic RIATA-HGT. Moreover, the HGT recovery results yielded by BD were generally equivalent to those provided by the exponential-time algorithm LatTrans, but a clear gain in running time was obtained using the new algorithm. Finally, a statistical framework for assessing the reliability of obtained HGTs by bootstrap analysis is also presented.
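As a rough illustration of one of the tree comparison indices named in the abstract, the Robinson and Foulds (RF) distance can be sketched as the number of nontrivial bipartitions (splits of the leaf set induced by internal edges) found in exactly one of the two trees. The sketch below is not the authors' implementation: the nested-tuple tree encoding, the function names, and the six-leaf example trees are all assumptions made for illustration only.

```python
# Minimal sketch of the Robinson and Foulds (RF) distance.
# Assumptions (not from the article): trees are nested tuples whose leaves
# are strings, and both trees carry the same leaf labels.

def leaf_set(tree):
    """Return the set of leaf labels under a node."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(child) for child in tree))

def bipartitions(tree):
    """Return the nontrivial bipartitions of a tree, one side per split.

    Each split is stored as the side that does NOT contain a fixed anchor
    leaf, so the same split always gets the same representation.
    """
    all_leaves = leaf_set(tree)
    anchor = min(all_leaves)
    splits = set()

    def visit(node):
        if isinstance(node, str):
            return frozenset([node])
        below = frozenset().union(*(visit(child) for child in node))
        side = below if anchor not in below else all_leaves - below
        # Keep only splits with at least two leaves on each side.
        if 1 < len(side) < len(all_leaves) - 1:
            splits.add(side)
        return below

    visit(tree)
    return splits

def rf_distance(tree_a, tree_b):
    """Count the bipartitions present in exactly one of the two trees."""
    return len(bipartitions(tree_a) ^ bipartitions(tree_b))

# Hypothetical example: the gene tree differs from the species tree
# by the placement of leaves B and C.
species_tree = ((("A", "B"), "C"), ("D", ("E", "F")))
gene_tree = ((("A", "C"), "B"), ("D", ("E", "F")))
print(rf_distance(species_tree, gene_tree))
```

In an SPR-based search of the kind the abstract describes, a criterion such as this one would be re-evaluated after each candidate move, and a move that lowers the distance to the gene tree would be preferred. How the article's algorithm actually selects moves is not specified in this record.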
Pages: 195-211
Page count: 17