Addressing Inter-Gene Heterogeneity in Maximum Likelihood Phylogenomic Analysis: Yeasts Revisited

Cited by: 23
Authors
Hess, Jaqueline [1, 2]
Goldman, Nick [2]
Affiliations
[1] Harvard Univ, Dept Organism & Evolutionary Biol, Cambridge, MA 02138 USA
[2] EMBL European Bioinformat Inst, Hinxton, England
Source
PLOS ONE | 2011 / Vol. 6 / Issue 8
Funding
Wellcome Trust (UK); Biotechnology and Biological Sciences Research Council (UK)
Keywords
INFORMATION CRITERION; MODEL SELECTION; CUG CODON; PROTEIN; EVOLUTION; SCALE; TREE; PHYLOGENETICS; INFERENCE; CANDIDA;
DOI
10.1371/journal.pone.0022783
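Chinese Library Classification
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences]
Discipline classification codes
07; 0710; 09
Abstract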
Phylogenomic approaches to the resolution of inter-species relationships have become well established in recent years. Often these involve concatenation of many orthologous genes found in the respective genomes followed by analysis using standard phylogenetic models. Genome-scale data promise increased resolution by minimising sampling error, yet are associated with well-known but often inappropriately addressed caveats arising through data heterogeneity and model violation. These can lead to the reconstruction of highly-supported but incorrect topologies. With the aim of obtaining a species tree for 18 species within the ascomycetous yeasts, we have investigated the use of appropriate evolutionary models to address inter-gene heterogeneities and the scalability and validity of supermatrix analysis as the phylogenetic problem becomes more difficult and the number of genes analysed approaches truly phylogenomic dimensions. We have extended a widely-known early phylogenomic study of yeasts by adding additional species to increase diversity and augmenting the number of genes under analysis. We have investigated sophisticated maximum likelihood analyses, considering not only a concatenated version of the data but also partitioned models where each gene constitutes a partition and parameters are free to vary between the different partitions (thereby accounting for variation in the evolutionary processes at different loci). We find considerable increases in likelihood using these complex models, arguing for the need for appropriate models when analyzing phylogenomic data. Using these methods, we were able to reconstruct a well-supported tree for 18 ascomycetous yeasts spanning about 250 million years of evolution.
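The comparison between a concatenated analysis and a per-gene partitioned analysis described in the abstract can be illustrated with an information criterion, which the keywords list as a topic of the paper. The following is a minimal sketch only, not the authors' pipeline: it assumes the partitioned log-likelihood is the sum of independently maximised per-gene log-likelihoods with every parameter free per gene, and it uses AIC as the criterion. The names `ModelFit`, `aic`, and `partitioned_fit` are hypothetical, and all numbers are invented for illustration; none come from the paper.

```python
from dataclasses import dataclass


@dataclass
class ModelFit:
    name: str
    log_likelihood: float  # maximised log-likelihood of the fitted model
    n_params: int          # number of free parameters that were estimated


def aic(fit: ModelFit) -> float:
    """Akaike information criterion: 2k - 2 lnL (lower is better)."""
    return 2 * fit.n_params - 2 * fit.log_likelihood


def partitioned_fit(per_gene_lnl: list[float], params_per_gene: int) -> ModelFit:
    """Combine per-gene fits into one partitioned model.

    Assumption: genes are independent and share no parameters, so the
    log-likelihoods add and the parameter count scales with the gene count.
    """
    return ModelFit(
        name="partitioned",
        log_likelihood=sum(per_gene_lnl),
        n_params=params_per_gene * len(per_gene_lnl),
    )


# Hypothetical values for three genes (illustration only, not from the paper).
concatenated = ModelFit("concatenated", log_likelihood=-10500.0, n_params=35)
partitioned = partitioned_fit([-3400.0, -3450.0, -3500.0], params_per_gene=35)

# The model with the lowest AIC is preferred.
best = min([concatenated, partitioned], key=aic)
print(best.name, aic(concatenated), aic(partitioned))
```

In this toy case the partitioned model pays a penalty for its extra parameters but is still preferred, because its gain in log-likelihood outweighs that penalty. With real data the outcome depends on the actual likelihoods and on which parameters are linked across partitions.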
Pages: 12