Groves of Phylogenetic Trees

Cited by: 7
Authors
Ane, Cecile [1 ,2 ]
Eulenstein, Oliver [3 ]
Piaggio-Talice, Raul [3 ]
Sanderson, Michael J. [4 ]
Affiliations
[1] Univ Wisconsin, Dept Stat, Madison, WI 53706 USA
[2] Univ Wisconsin, Dept Bot, Madison, WI 53706 USA
[3] Iowa State Univ, Dept Comp Sci, Ames, IA 50011 USA
[4] Univ Arizona, Dept Ecol & Evolutionary Biol, Tucson, AZ 85721 USA
Funding
National Science Foundation (USA);
Keywords
supertree; supermatrix; triplets; clustering; evolution; SUPERTREES; LIFE;
DOI
10.1007/s00026-009-0017-x
Chinese Library Classification number
O29 [Applied Mathematics];
Discipline classification code
070104 ;
Abstract
A major challenge in biological sciences is the reconstruction of the Tree of Life. To this effect, large genomic databases like GenBank and SwissProt are being mined for clusters from which phylogenies can be inferred. Systematists and comparative biologists commonly combine such phylogenies into informative supertrees that reveal information which was not explicitly displayed in any of the original phylogenies. However, whether a supertree is informative depends on particular overlap properties among the clusters from which it originates. In this work we formally introduce the concept of groves - sets of clusters with the potential to construct informative supertrees. Thus maximal potential candidate clusters for informative supertree construction can be identified in large databases through groves, prior to inferring trees for each cluster. Groves also have the potential to lead to informative supermatrix construction. We developed methods that (i) efficiently identify particular types of groves and (ii) find lower and upper bounds on the minimal number of groves needed to cover all the trees or data sets in a database. Finally, we apply our methods to the green plant sequences from GenBank.
Pages: 139-167
Page count: 29