Finding a maximum likelihood tree is hard

被引:31
作者
Chor, Benny [1 ]
Tuller, Tamir [1 ]
机构
[1] Tel Aviv Univ, IL-69978 Tel Aviv, Israel
关键词
theory; algorithms; maximum likelihood; tree reconstruction; maximum parsimony; intractability; approximate vertex cover;
D O I
10.1145/1183907.1183909
中图分类号
TP3 [计算技术、计算机技术];
学科分类号
0812 ;
摘要
Maximum likelihood (ML) is an increasingly popular optimality criterion for selecting evolutionary trees [Felsenstein 1981]. Finding optimal ML trees appears to be a very hard computational task, but for tractable cases, ML is the method of choice. In particular, algorithms and heuristics for ML take longer to run than algorithms and heuristics for the second major character based criterion, maximum parsimony (MP). However, while MP has been known to be NP-complete for over 20 years [Foulds and Graham, 1982; Day et al. 1986], such a hardness result for ML has so far eluded researchers in the field. An important work by Tuffley and Steel [1997] proves quantitative relations between the parsimony values of given sequences and the corresponding log likelihood values. However, a direct application of their work would only give an exponential time reduction from MP to ML. Another step in this direction has recently been made by Addario-Berry et al. [2004], who proved that ancestral maximum likelihood (AML) is NP-complete. AML "lies in between" the two problems, having some properties of MP and some properties of ML. Still, the AML proof is not directly applicable to the ML problem. We resolve the question, showing that "regular" ML on phylogenetic trees is indeed intractable. Our reduction follows the vertex cover reductions for MP [Day et al. 1986] and AML [Addario-Berry et al. 2004], but its starting point is an approximation version of vertex cover, known as GAP VC. The crux of our work is not the reduction, but its correctness proof. The proof goes through a series of tree modifications, while controlling the likelihood losses at each step, using the bounds of Tuffley and Steel [1997]. The proof can be viewed as correlating the value of any ML solution to an arbitrarily close approximation to vertex cover.
引用
收藏
页码:722 / 744
页数:23
相关论文
共 26 条
[1]  
Addario-Berry Louigi, 2004, J Bioinform Comput Biol, V2, P257, DOI 10.1142/S0219720004000557
[2]  
[Anonymous], 1971, STAT DECISION THEORY
[3]  
BERMAN P, 1999, LECT NOTES COMPUTER
[4]   Multiple maxima of likelihood in phylogenetic trees: An analytic approach [J].
Chor, B ;
Hendy, MD ;
Holland, BR ;
Penny, D .
MOLECULAR BIOLOGY AND EVOLUTION, 2000, 17 (10) :1529-1541
[5]   Maximum likelihood of evolutionary trees: hardness and approximation [J].
Chor, B ;
Tuller, T .
BIOINFORMATICS, 2005, 21 :I97-I106
[7]   THE COMPUTATIONAL-COMPLEXITY OF INFERRING ROOTED PHYLOGENIES BY PARSIMONY [J].
DAY, WHE ;
JOHNSON, DS ;
SANKOFF, D .
MATHEMATICAL BIOSCIENCES, 1986, 81 (01) :33-42
[8]   COMPUTATIONAL-COMPLEXITY OF INFERRING PHYLOGENIES BY COMPATIBILITY [J].
DAY, WHE ;
SANKOFF, D .
SYSTEMATIC ZOOLOGY, 1986, 35 (02) :224-229
[9]   On the hardness of approximating minimum vertex cover [J].
Dinur, I ;
Safra, S .
ANNALS OF MATHEMATICS, 2005, 162 (01) :439-485
[10]  
Edwards A., 1972, LIKELIHOOD