Testing congruence in phylogenomic analysis

Cited by: 144
Authors
Leigh, Jessica W. [1]
Susko, Edward [2]
Baumgartner, Manuela [3]
Roger, Andrew J. [1]
Affiliations
[1] Dalhousie Univ, Dept Biochem & Mol Biol, Halifax, NS B3H 1X5, Canada
[2] Dalhousie Univ, Dept Math & Stat & Genome Atlantic, Halifax, NS B3H 1X5, Canada
[3] Univ Munich, Dept Biol Bot 1, D-80638 Munich, Germany
Funding
Canadian Institutes of Health Research; Natural Sciences and Engineering Research Council of Canada
Keywords
concatenated analysis; endosymbiotic gene transfer; hierarchical clustering; lateral gene transfer; likelihood ratio testing; maximum likelihood; phylogenetic congruence; phylogenomics; separate analysis; superkingdom; supermatrix analysis
DOI
10.1080/10635150801910436
Chinese Library Classification (CLC) number
Q [Biological Sciences]
Discipline classification codes
07; 0710; 09
Abstract
Phylogenomic analyses of large sets of genes or proteins have the potential to revolutionize our understanding of the tree of life. However, problems arise because estimated phylogenies from individual loci often differ because of different histories, systematic bias, or stochastic error. We have developed CONCATERPILLAR, a hierarchical clustering method based on likelihood-ratio testing that identifies congruent loci for phylogenomic analysis. CONCATERPILLAR also includes a test for shared relative evolutionary rates between genes indicating whether they should be analyzed separately or by concatenation. In simulation studies, the performance of this method is excellent when a multiple comparison correction is applied. We analyzed a phylogenomic data set of 60 translational protein sequences from the major supergroups of eukaryotes and identified three congruent subsets of proteins. Analysis of the largest set indicates improved congruence relative to the full data set and produced a phylogeny with stronger support for five eukaryote supergroups including the Opisthokonts, the Plantae, the stramenopiles + Apicomplexa (chromalveolates), the Amoebozoa, and the Excavata. In contrast, the phylogeny of the second largest set indicates a close relationship between stramenopiles and red algae, to the exclusion of alveolates, suggesting gene transfer from the red algal secondary symbiont to the ancestral stramenopile host nucleus during the origin of their chloroplast. Investigating phylogenomic data sets for conflicting signals has the potential to both improve phylogenetic accuracy and inform our understanding of genome evolution.
Pages: 104-115
Number of pages: 12
Related papers
50 in total
[1]   NEW LOOK AT STATISTICAL-MODEL IDENTIFICATION [J].
AKAIKE, H .
IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 1974, AC19 (06) :716-723
[2]  
Ané C, 2007, MOL BIOL EVOL, V24, P412
[3]   A kingdom-level phylogeny of eukaryotes based on combined protein data [J].
Baldauf, SL ;
Roger, AJ ;
Wenk-Siefert, I ;
Doolittle, WF .
SCIENCE, 2000, 290 (5493) :972-977
[4]   Do orthologous gene phylogenies really support tree-thinking? [J].
Bapteste, E ;
Susko, E ;
Leigh, J ;
MacLeod, D ;
Charlebois, RL ;
Doolittle, WF .
BMC EVOLUTIONARY BIOLOGY, 2005, 5 (1)
[5]   The analysis of 100 genes supports the grouping of three highly divergent amoebae: Dictyostelium, Entamoeba, and Mastigamoeba [J].
Bapteste, E ;
Brinkmann, H ;
Lee, JA ;
Moore, DV ;
Sensen, CW ;
Gordon, P ;
Duruflé, L ;
Gaasterland, T ;
Lopez, P ;
Müller, M ;
Philippe, H .
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2002, 99 (03) :1414-1419
[6]   The utility of the incongruence length difference test [J].
Barker, FK ;
Lutzoni, FM .
SYSTEMATIC BIOLOGY, 2002, 51 (04) :625-637
[7]   Highways of gene sharing in prokaryotes [J].
Beiko, RG ;
Harlow, TJ ;
Ragan, MA .
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2005, 102 (40) :14332-14337
[8]   Calculating the evolutionary rates of different genes: A fast, accurate estimator with applications to maximum likelihood phylogenetic analysis [J].
Bevan, RB ;
Lang, BF ;
Bryant, D .
SYSTEMATIC BIOLOGY, 2005, 54 (06) :900-915
[9]   An emerging phylogenetic core of Archaea: phylogenies of transcription and translation machineries converge following addition of new genome sequences [J].
Brochier, C ;
Forterre, P ;
Gribaldo, S .
BMC EVOLUTIONARY BIOLOGY, 2005, 5 (1)
[10]   Eubacterial phylogeny based on translational apparatus proteins [J].
Brochier, C ;
Bapteste, E ;
Moreira, D ;
Philippe, H .
TRENDS IN GENETICS, 2002, 18 (01) :1-5