Class of Multiple Sequence Alignment Algorithm Affects Genomic Analysis

Cited by: 47
Authors
Blackburne, Benjamin P. [1]
Whelan, Simon [1]
Affiliations
[1] Univ Manchester, Fac Life Sci, Manchester, Lancs, England
Funding
Biotechnology and Biological Sciences Research Council (BBSRC), UK
Keywords
sequence analysis; multiple sequence alignment; phylogenetics; adaptive evolution; comparative genomics; JOINT BAYESIAN-ESTIMATION; ADAPTIVE EVOLUTION; CLUSTAL-W; SELECTION; DISTANCE; ERRORS; GENES; SITES; MODEL;
DOI
10.1093/molbev/mss256
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification codes
071010; 081704
Abstract
Multiple sequence alignment (MSA) is the heart of comparative sequence analysis. Recent studies demonstrate that MSA algorithms can produce different outcomes when analyzing genomes, including phylogenetic tree inference and the detection of adaptive evolution. These studies also suggest that the difference between MSA algorithms is of a similar order to the uncertainty within an algorithm and suggest integrating across this uncertainty. In this study, we examine further the problem of disagreements between MSA algorithms and how they affect downstream analyses. We also investigate whether integrating across alignment uncertainty affects downstream analyses. We address these questions by analyzing 200 chordate gene families, with properties reflecting those used in large-scale genomic analyses. We find that newly developed distance metrics reveal two significantly different classes of MSA methods (MSAMs). The similarity-based class includes progressive aligners and consistency aligners, representing many methodological innovations for sequence alignment, whereas the evolution-based class includes phylogenetically aware alignment and statistical alignment. We proceed to show that the class of an MSAM has a substantial impact on downstream analyses. For phylogenetic inference, tree estimates and their branch lengths appear highly dependent on the class of aligner used. The number of families, and the sites within those families, inferred to have undergone adaptive evolution depend on the class of aligner used. Similarity-based aligners tend to identify more adaptive evolution. We also develop and test methods for incorporating MSA uncertainty when detecting adaptive evolution but find that although accounting for MSA uncertainty does affect downstream analyses, it appears less important than the class of aligner chosen. 
Our results demonstrate the critical role that MSA methodology has on downstream analysis, highlighting that the class of aligner chosen in an analysis has a demonstrable effect on its outcome.
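The abstract refers to distance metrics between alignments without defining them. As a rough illustration only, the sketch below computes a simple pair-based distance between two alignments of the same sequences: the fraction of aligned residue pairs that appear in one alignment but not the other. This is an assumed stand-in for illustration, not necessarily any of the metrics used in the study, and the function names `residue_pairs` and `pair_distance` are hypothetical.

```python
from itertools import combinations


def residue_pairs(alignment):
    """Return the set of residue pairs placed in the same column.

    `alignment` maps sequence name -> gapped string ('-' marks a gap).
    Each residue is identified by (name, index in the ungapped sequence).
    """
    names = sorted(alignment)
    length = len(alignment[names[0]])
    if any(len(alignment[n]) != length for n in names):
        raise ValueError("all aligned sequences must have the same length")

    next_index = {n: 0 for n in names}  # ungapped position per sequence
    pairs = set()
    for col in range(length):
        present = []
        for n in names:
            if alignment[n][col] != "-":
                present.append((n, next_index[n]))
                next_index[n] += 1
        # Every pair of residues sharing this column is treated as aligned.
        pairs.update(combinations(present, 2))
    return pairs


def pair_distance(aln_a, aln_b):
    """Illustrative distance: |A xor B| / |A or B| over aligned residue pairs.

    Assumption: both alignments contain the same underlying sequences.
    Returns 0.0 for identical pairings and 1.0 when no pair is shared.
    """
    pa, pb = residue_pairs(aln_a), residue_pairs(aln_b)
    union = pa | pb
    if not union:
        return 0.0
    return len(pa ^ pb) / len(union)


# Two toy alignments of the same two sequences, differing in gap placement.
aln1 = {"s1": "AC-G", "s2": "ACTG"}
aln2 = {"s1": "A-CG", "s2": "ACTG"}
print(pair_distance(aln1, aln2))
```

In the toy example the two alignments agree on two residue pairs and disagree on one placement each, so two of the four distinct pairs differ. Computing such a distance for every pair of aligners would give a matrix that could then be clustered to look for groups of methods, under the same assumptions.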
Pages: 642-653
Page count: 12