Efficient reconstruction of phylogenetic networks with constrained recombination

Cited by: 56
Authors
Gusfield, D [1]
Eddhu, S [1]
Langley, C [1]
Affiliation
[1] Univ Calif Davis, Dept Comp Sci, Davis, CA 95616 USA
Source
PROCEEDINGS OF THE 2003 IEEE BIOINFORMATICS CONFERENCE | 2003
DOI
10.1109/CSB.2003.1227337
Chinese Library Classification (CLC) number
Q [Biological Sciences];
Discipline classification code
07 ; 0710 ; 09 ;
Abstract
A phylogenetic network is a generalization of a phylogenetic tree, allowing structural properties that are not treelike. With the growth of genomic data, much of which does not fit ideal tree models, there is a greater need to understand the algorithmics and combinatorics of phylogenetic networks [10, 11]. However, to date, very little has been published on this, with the notable exception of the paper by Wang et al. [12]. Other related papers include [4, 5, 7]. We consider the problem introduced in [12] of determining whether a set of sequences can be derived on a phylogenetic network in which the recombination cycles are node disjoint. In this paper, we call such a phylogenetic network a "galled tree". By analysing the combinatorial constraints on cycle-disjoint phylogenetic networks more deeply, we obtain an efficient algorithm that is guaranteed to be both a necessary and sufficient test for the existence of a galled tree for the data. If there is a galled tree, the algorithm constructs one and obtains an implicit representation of all the galled trees for the data, from which each of them can be generated in linear time. We also note two additional results related to galled trees: first, any set of sequences that can be derived on a galled tree can be derived on a true tree (without recombination cycles), where at most one back mutation is allowed per site; second, the site compatibility problem (which is NP-hard in general) can be solved in linear time for any set of sequences that can be derived on a galled tree. The combinatorial constraints we develop apply (for the most part) to node-disjoint cycles in any phylogenetic network (not just galled trees), and can be used, for example, to prove that a given site cannot be on a node-disjoint cycle in any phylogenetic network. Perhaps more important than the specific results about galled trees, we introduce an approach that can be used to study recombination in phylogenetic networks that go beyond galled trees.
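The abstract refers to site compatibility and to data that cannot be derived on a true tree. As a minimal illustration, and not the algorithm of the paper, the sketch below applies the standard four-gamete test to binary sequences: two sites are taken to conflict when all four combinations 00, 01, 10 and 11 occur across the sequences at those two positions. The function names `sites_conflict` and `conflict_pairs`, and the representation of sequences as equal-length strings over "0" and "1", are assumptions made for this example.

```python
from itertools import combinations


def sites_conflict(seqs, i, j):
    """Four-gamete test for sites i and j (hypothetical helper).

    Returns True when all four combinations of states appear at
    positions i and j, assuming each sequence is a string of '0'/'1'.
    """
    gametes = {(s[i], s[j]) for s in seqs}
    return len(gametes) == 4


def conflict_pairs(seqs):
    """List every pair of sites (i, j), i < j, that fails the test."""
    m = len(seqs[0])  # number of sites; sequences assumed equal length
    return [
        (i, j)
        for i, j in combinations(range(m), 2)
        if sites_conflict(seqs, i, j)
    ]


if __name__ == "__main__":
    # Sites 0 and 1 show 00, 10, 01 and 11, so they conflict.
    print(conflict_pairs(["000", "100", "010", "110"]))
```

An empty result means every pair of sites passes the test. A non-empty result lists the pairs that cannot both be explained by a single mutation each on a tree without recombination, which is the situation where networks such as galled trees become relevant.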
Pages: 363-374
Number of pages: 12
Related papers
12 in total
[1]   COMPUTATIONAL-COMPLEXITY OF INFERRING PHYLOGENIES BY COMPATIBILITY [J].
DAY, WHE ;
SANKOFF, D .
SYSTEMATIC ZOOLOGY, 1986, 35 (02) :224-229
[2]   EFFICIENT ALGORITHMS FOR INFERRING EVOLUTIONARY TREES [J].
GUSFIELD, D .
NETWORKS, 1991, 21 (01) :19-28
[3]  
Gusfield D, 1997, ALGORITHMS STRINGS T
[4]   RECONSTRUCTING EVOLUTION OF SEQUENCES SUBJECT TO RECOMBINATION USING PARSIMONY [J].
HEIN, J .
MATHEMATICAL BIOSCIENCES, 1990, 98 (02) :185-200
[5]  
HEIN J, 1993, J MOL EVOL, V36, P396, DOI 10.1007/BF00182187
[6]  
HUDSON RR, 1985, GENETICS, V111, P147
[7]   Reconstructing a history of recombinations from a set of sequences [J].
Kececioglu, J ;
Gusfield, D .
DISCRETE APPLIED MATHEMATICS, 1998, 88 (1-3) :239-260
[8]   Heterogeneous geographic patterns of nucleotide sequence diversity between two alcohol dehydrogenase genes in wild barley (Hordeum vulgare subspecies spontaneum) [J].
Lin, JZ ;
Brown, AHD ;
Clegg, MT .
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2001, 98 (02) :531-536
[9]  
Myers SR, 2003, GENETICS, V163, P375
[10]   Intraspecific gene genealogies: trees grafting into networks [J].
Posada, D ;
Crandall, KA .
TRENDS IN ECOLOGY & EVOLUTION, 2001, 16 (01) :37-45