Bayesian gene/species tree reconciliation and orthology analysis using MCMC

Cited by: 105
Authors
Arvestad, Lars [1, 2]
Berglund, Ann-Charlotte [3, 4]
Lagergren, Jens [3, 4]
Sennblad, Bengt [1, 2]
Affiliations
[1] Karolinska Inst, SBC, SE-17177 Stockholm, Sweden
[2] Karolinska Inst, Ctr Genom & Bioinformat, SE-17177 Stockholm, Sweden
[3] KTH, SBC, SE-10044 Stockholm, Sweden
[4] KTH, Dept Numer Anal & Comp Sci, SE-10044 Stockholm, Sweden
DOI
10.1093/bioinformatics/btg1000
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Subject classification code
071010; 081704;
Abstract
Motivation: Comparative genomics in general and orthology analysis in particular are becoming increasingly important parts of gene function prediction. Previously, orthology analysis and reconciliation have been performed only with respect to the parsimony model. This discards many plausible solutions and sometimes precludes finding the correct one. In many other areas of bioinformatics, probabilistic models have proven to be both more realistic and more powerful than parsimony models. For instance, they allow solution reliability to be assessed and alternative solutions to be considered in a uniform way. There is also an added benefit in making model assumptions explicit, which makes model comparisons possible. For orthology analysis, uncertainty has recently been addressed using parsimonious reconciliation combined with bootstrap techniques. However, until now no probabilistic methods have been available.
Results: We introduce a probabilistic gene evolution model based on a birth-death process in which a gene tree evolves 'inside' a species tree. Based on this model, we develop a tool for practical orthology analysis, based on Fitch's original definition, and more generally for reconciling pairs of gene and species trees. Our gene evolution model is biologically sound (Nei et al., 1997) and intuitively attractive. We develop a Bayesian analysis based on MCMC which facilitates approximation of an a posteriori distribution for reconciliations. That is, we can find the most probable reconciliations and estimate the probability of any reconciliation, given the observed gene tree. This also gives a way to estimate the probability that a pair of genes are orthologs. The main algorithmic contribution presented here is an algorithm for computing the likelihood of a given reconciliation.
To the best of our knowledge, this is the first successful introduction of this type of probabilistic method, which flourishes in phylogeny analysis, into reconciliation and orthology analysis. The MCMC algorithm has been implemented and, although it is not yet in its final form, tests show that it performs very well on synthetic as well as biological data. Using standard correspondences, our results carry over to allele trees as well as biogeography.
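The abstract's idea of estimating the probability that two genes are orthologs from an MCMC sample of reconciliations can be illustrated with a minimal sketch. This is not the authors' implementation. The three candidate reconciliations, their unnormalized weights, the gene names, and the function names `metropolis_samples` and `orthology_probability` are all hypothetical. In the paper, the weight of a reconciliation would come from its likelihood under the birth-death model rather than from a fixed table.

```python
import random

# Hypothetical toy data: three candidate reconciliations of one gene tree.
# Each has a made-up unnormalized posterior weight and the set of gene
# pairs that it would label as orthologs.
RECONCILIATIONS = [
    {"weight": 6.0, "orthologs": {("a1", "b1"), ("a2", "b2")}},
    {"weight": 3.0, "orthologs": {("a1", "b1")}},
    {"weight": 1.0, "orthologs": set()},
]


def metropolis_samples(states, n_steps, seed=0):
    """Run a Metropolis chain over state indices with a uniform proposal."""
    rng = random.Random(seed)
    current = rng.randrange(len(states))
    samples = []
    for _ in range(n_steps):
        # The uniform proposal is symmetric, so the acceptance ratio is
        # just the ratio of unnormalized weights.
        proposal = rng.randrange(len(states))
        ratio = states[proposal]["weight"] / states[current]["weight"]
        if rng.random() < ratio:
            current = proposal
        samples.append(current)
    return samples


def orthology_probability(states, samples, pair):
    """Fraction of sampled reconciliations in which `pair` is orthologous."""
    hits = sum(1 for i in samples if pair in states[i]["orthologs"])
    return hits / len(samples)


samples = metropolis_samples(RECONCILIATIONS, n_steps=50_000)
p_a1_b1 = orthology_probability(RECONCILIATIONS, samples, ("a1", "b1"))
p_a2_b2 = orthology_probability(RECONCILIATIONS, samples, ("a2", "b2"))
print(p_a1_b1, p_a2_b2)
```

Under these made-up weights the normalized posterior is 0.6, 0.3 and 0.1, so the two estimates should approach 0.9 and 0.6 as the chain grows. A real analysis would explore a far larger space of reconciliations and would need burn-in and convergence checks.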
Pages: i7-i15
Number of pages: 9
Related papers
19 in total
  • [1] Mitogenomic analyses of eutherian relationships
    Arnason, U
    Janke, A
    [J]. CYTOGENETIC AND GENOME RESEARCH, 2002, 96 (1-4) : 20 - 32
  • [2] Bateman A, 2004, NUCLEIC ACIDS RES, V32, pD138, DOI 10.1093/nar/gkh121
  • [3] DISTINGUISHING HOMOLOGOUS FROM ANALOGOUS PROTEINS
    FITCH, WM
    [J]. SYSTEMATIC ZOOLOGY, 1970, 19 (02) : 99 - &
  • [4] Gilks W., 1995, Markov Chain Monte Carlo in Practice, DOI 10.1201/b14835
  • [5] FITTING THE GENE LINEAGE INTO ITS SPECIES LINEAGE, A PARSIMONY STRATEGY ILLUSTRATED BY CLADOGRAMS CONSTRUCTED FROM GLOBIN SEQUENCES
    GOODMAN, M
    CZELUSNIAK, J
    MOORE, GW
    ROMEROHERRERA, AE
    MATSUDA, G
    [J]. SYSTEMATIC ZOOLOGY, 1979, 28 (02) : 132 - 163
  • [6] Reconstruction of ancient molecular phylogeny
    Guigo, R
    Muchnik, I
    Smith, TF
    [J]. MOLECULAR PHYLOGENETICS AND EVOLUTION, 1996, 6 (02) : 189 - 213
  • [7] HALLETT M, 2000, 4 ANN RECOMB 00 TOK, P146
  • [8] Hallett MT, 2000, LECT NOTES COMPUT SC, V1974, P465
  • [9] Evolution - Bayesian inference of phylogeny and its impact on evolutionary biology
    Huelsenbeck, JP
    Ronquist, F
    Nielsen, R
    Bollback, JP
    [J]. SCIENCE, 2001, 294 (5550) : 2310 - 2314
  • [10] MRBAYES: Bayesian inference of phylogenetic trees
    Huelsenbeck, JP
    Ronquist, F
    [J]. BIOINFORMATICS, 2001, 17 (08) : 754 - 755