An efficient algorithm for statistical multiple alignment on arbitrary phylogenetic trees

被引:25
作者
Lunter, GA [1 ]
Miklós, I [1 ]
Song, YS [1 ]
Hein, J [1 ]
机构
[1] Univ Oxford, Dept Stat, Oxford OX1 3TG, England
关键词
statistical alignment; multiple alignment; phylogeny; maximum likelihood;
D O I
10.1089/106652703322756122
中图分类号
Q5 [生物化学];
学科分类号
071010 ; 081704 ;
摘要
We present an efficient algorithm for statistical multiple alignment based on the TKF91 model of Thorne, Kishino, and Felsenstein (1991) on an arbitrary k-leaved phylogenetic tree. The existing algorithms use a hidden Markov model approach, which requires at least O(root5(k)) states and leads to a time complexity of O(5(k) L-k), where L is the geometric mean sequence length. Using a combinatorial technique reminiscent of inclusion/exclusion, we are able to sum away the states, thus improving the time complexity to O(2(k) L-k) and considerably reducing memory requirements. This makes statistical multiple alignment under the TKF91 model a definite practical possibility in the case of a phylogenetic tree with a modest number of leaves.
引用
收藏
页码:869 / 889
页数:21
相关论文
共 23 条
[1]   Sorting permutations by block-interchanges [J].
Christie, DA .
INFORMATION PROCESSING LETTERS, 1996, 60 (04) :165-169
[2]  
Durbin R., 1998, BIOL SEQUENCE ANAL
[3]  
Eddy S, 2001, HMMER PROFILE HIDDEN
[4]   EVOLUTIONARY TREES FROM DNA-SEQUENCES - A MAXIMUM-LIKELIHOOD APPROACH [J].
FELSENSTEIN, J .
JOURNAL OF MOLECULAR EVOLUTION, 1981, 17 (06) :368-376
[5]  
Felsenstein J., 2001, PHYLIP PHYLOGENY INF
[6]  
Goldman N, 1998, BIOESSAYS, V20, P287, DOI 10.1002/(SICI)1521-1878(199804)20:4<287::AID-BIES4>3.0.CO
[7]  
2-N
[8]   Statistical alignment: Computational properties, homology testing and goodness-of-fit [J].
Hein, J ;
Wiuf, C ;
Knudsen, B ;
Moller, MB ;
Wibling, G .
JOURNAL OF MOLECULAR BIOLOGY, 2000, 302 (01) :265-279
[9]  
Hein J, 2001, Pac Symp Biocomput, P179
[10]  
HEIN J, 2002, 42K U AARH DEP THEOR