Provably fast and accurate recovery of evolutionary trees through harmonic greedy triplets

Cited by: 13
Authors
Csurös, M [1]
Kao, MY [1]
Affiliations
[1] Yale Univ, Dept Comp Sci, New Haven, CT 06520 USA
Keywords
evolutionary trees; the Jukes-Cantor model of evolution; computational learning; harmonic greedy triplets
DOI
10.1137/S009753970037905X
Chinese Library Classification number
TP301 [Theory and methods]
Discipline classification code
081202 [Computer software and theory]
Abstract
We give a greedy learning algorithm for reconstructing an evolutionary tree based on a certain harmonic average on triplets of terminal taxa. After the pairwise distances between terminal taxa are estimated from sequence data, the algorithm runs in O(n^2) time using O(n) work space, where n is the number of terminal taxa. These time and space complexities are optimal in the sense that the size of an input distance matrix is n^2 and the size of an output tree is n. Moreover, in the Jukes-Cantor model of evolution, the algorithm recovers the correct tree topology with high probability using sample sequences of length polynomial in (1) n, (2) the logarithm of the error probability, and (3) the inverses of two small parameters.
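The abstract's first step, estimating pairwise distances between terminal taxa from sequence data, can be illustrated with a minimal sketch. It assumes aligned, equal-length DNA sequences and uses the standard four-state Jukes-Cantor correction d = -(3/4) ln(1 - (4/3) p), where p is the observed fraction of mismatched sites. This record does not state the exact estimator used in the paper, so the formula choice and the names `jukes_cantor_distance` and `distance_matrix` are illustrative assumptions. The harmonic greedy triplets procedure itself is not shown.

```python
import math


def jukes_cantor_distance(seq_a, seq_b):
    """Standard four-state Jukes-Cantor distance between two aligned sequences.

    Assumption: this is the textbook JC69 correction, not necessarily the
    estimator used in the paper.
    """
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    mismatches = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    p = mismatches / len(seq_a)
    # The correction is undefined once p reaches the saturation value 3/4.
    if p >= 0.75:
        return math.inf
    return -0.75 * math.log(1.0 - (4.0 / 3.0) * p)


def distance_matrix(sequences):
    """Build the symmetric n-by-n matrix of pairwise distances."""
    n = len(sequences)
    d = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d[i][j] = d[j][i] = jukes_cantor_distance(sequences[i], sequences[j])
    return d


if __name__ == "__main__":
    taxa = ["ACGTACGT", "ACGTACGA", "ACGAACGA"]
    for row in distance_matrix(taxa):
        print(["%.4f" % x for x in row])
```

Filling the matrix takes on the order of n^2 distance computations, which matches the input size of n^2 that the abstract cites when it calls the algorithm's running time optimal.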
Pages: 306-322
Number of pages: 17