Optimal, efficient reconstruction of root-unknown phylogenetic networks with constrained and structured recombination

被引:43
作者
Gusfield, D [1 ]
机构
[1] Univ Calif Davis, Dept Comp Sci, Davis, CA 95616 USA
基金
美国国家科学基金会;
关键词
molecular evolution; phylogenetic networks; perfect phylogeny; ancestral recombination graph; recombination; gene-conversion; SNP; recurrent mutation;
D O I
10.1016/j.jcss.2004.12.009
中图分类号
TP3 [计算技术、计算机技术];
学科分类号
0812 ;
摘要
Phylogenetic networks are models of sequence evolution that go beyond trees, allowing biological operations that are not consistent with tree-like evolution. One of the most important of these biological operations is (single-crossover) recombination between two sequences. An established problem (Math. Biosci. 98 (1990) 185; J. Mol. Evol. 36 (1993) 396; Proceedings of the 2003 Workshop on Algorithms in Bioinformatics, Berlin, Germany, Lecture Notes in Computer Science, Springer, Berlin, 2003; J. Math. Biol. 98 (2003) 160; J. Comput. Biol. 8 (2001) 69; Genetics 163 (2003) 375; Genetics 111 (1985) 147) is to find a phylogenetic network that derives an input set of sequences, minimizing the number of recombinations used. No efficient, general algorithm is known for that problem. An efficient algorithm does exist for the problem when the network is constrained to be a "galled-tree", and the ancestral sequence for the galled-tree is specified in advance (Proceedings of Second CSB Bioinformatics Conference, Los Alamitos, CA, 2003, IEEE Press, New York; J. Bioinform. Comput. Biol. 2(1) (2004) 173; INFORMS J. Comput. 16 (2004) 459). However, the more biologically realistic case is that no ancestral sequence is known in advance, and the only previous algorithmic solution for that case takes exponential time. In this paper we give an efficient solution to the galled-tree problem when no ancestral sequence is known in advance, and show that the solution produced has very strong global optimality properties. We also indicate how these results generalize to other complex biological phenomena such as gene-conversion, lateral gene transfer, hybrid speciation, and back and recurrent mutation. (c) 2005 Elsevier Inc. All rights reserved.
引用
收藏
页码:381 / 398
页数:18
相关论文
共 29 条
[1]  
[Anonymous], 1997, ALGORITHMS STRINGS T, DOI DOI 10.1017/CBO9780511574931
[2]   The number of recombination events in a sample history: Conflict graph and lower bounds [J].
Bafna, V ;
Bansal, V .
IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS, 2004, 1 (02) :78-90
[3]   A note on efficient computation of haplotypes via perfect phylogeny [J].
Bafna, V ;
Gusfield, D ;
Hannenhalli, S ;
Yooseph, S .
JOURNAL OF COMPUTATIONAL BIOLOGY, 2004, 11 (05) :858-866
[4]  
BERRY A, 1999, EVOL GENET, P102
[5]   It's raining SNPs, hallelujah? [J].
Chakravarti, A .
NATURE GENETICS, 1998, 19 (03) :216-217
[6]  
EDDHU S, COMMUNICATION
[7]  
Felsenstein Joseph, 2004, Inferring_phylogenies, V2
[8]   The fine structure of galls in phylogenetic networks [J].
Gusfield, D ;
Eddhu, S ;
Langley, C .
INFORMS JOURNAL ON COMPUTING, 2004, 16 (04) :459-469
[9]   EFFICIENT ALGORITHMS FOR INFERRING EVOLUTIONARY TREES [J].
GUSFIELD, D .
NETWORKS, 1991, 21 (01) :19-28
[10]  
GUSFIELD D, 2003, P 2 CBS BIO C LOS AN