Genome-scale coestimation of species and gene trees

Times cited: 159
Authors
Boussau, Bastien [1, 2]
Szoellosi, Gergely J. [1]
Duret, Laurent [1]
Gouy, Manolo [1]
Tannier, Eric [1, 3]
Daubin, Vincent [1]
Affiliations
[1] Univ Lyon 1, CNRS, Lab Biometrie & Biol Evolut, UMR 5558, F-69622 Villeurbanne, France
[2] Univ Calif Berkeley, Dept Integrat Biol, Berkeley, CA 94720 USA
[3] INRIA Rhone Alpes, F-38322 Montbonnot St Martin, France
Keywords
INTERORDINAL RELATIONSHIPS; DUPLICATION; RECONCILIATION; PHYLOGENY; EVOLUTION; RECONSTRUCTION; ALGORITHMS;
DOI
10.1101/gr.141978.112
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Subject classification code
071010; 081704
Abstract
Comparisons of gene trees and species trees are key to understanding major processes of genome evolution such as gene duplication and loss. Because current methods to reconstruct phylogenies fail to model the two-way dependency between gene trees and the species tree, they often misrepresent gene and species histories. We present a new probabilistic model to jointly infer rooted species and gene trees for dozens of genomes and thousands of gene families. We use simulations to show that this method accurately infers the species tree and gene trees, is robust to misspecification of the models of sequence and gene family evolution, and provides a precise historic record of gene duplications and losses throughout genome evolution. We simultaneously reconstruct the history of mammalian species and their genes based on 36 completely sequenced genomes, and use the reconstructed gene trees to infer the gene content and organization of ancestral mammalian genomes. We show that our method yields a more accurate picture of ancestral genomes than the trees available in the authoritative database Ensembl.
Pages: 323-330
Page count: 8