EVALUATION OF SEVERAL METHODS FOR ESTIMATING PHYLOGENETIC TREES WHEN SUBSTITUTION RATES DIFFER OVER NUCLEOTIDE SITES

Cited by: 39
Authors
YANG, ZH [1]
Affiliations
[1] NAT HIST MUSEUM, DEPT ZOOL, LONDON SW7 5BD, ENGLAND
Keywords
PHYLOGENY; MAXIMUM LIKELIHOOD; LEAST SQUARES; CONSISTENCY; SAMPLING ERROR; RATE VARIATION AT SITES; GAMMA DISTRIBUTION; COMPUTER SIMULATION
DOI
10.1007/BF00160518
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
Several maximum likelihood and distance matrix methods for estimating phylogenetic trees from homologous DNA sequences were compared when substitution rates at sites were assumed to follow a gamma distribution. Computer simulations were performed to estimate the probabilities that various tree estimation methods recover the true tree topology. The case of four species was considered, and a few combinations of parameters were examined. Attention was paid to discriminating among different sources of error in tree reconstruction, i.e., the inconsistency of the tree estimation method, the sampling error in the estimated tree due to limited sequence length, and the sampling error in the estimated probability due to the limited number of simulations. Compared to the least squares method based on pairwise distance estimates, the joint likelihood analysis is found to be more robust when rate variation over sites is present but ignored, so that an assumption of the method is violated. With limited data, the likelihood method has a much higher probability of recovering the true tree and is therefore more efficient than the least squares method. The concept of statistical consistency of a tree estimation method and its implications were explored, and it is suggested that, while the efficiency (or sampling error) of a tree estimation method is a very important property, statistical consistency of the method over a wide range of, if not all, parameter values is a prerequisite.
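Two quantities named in the abstract can be illustrated with a short sketch: gamma-distributed substitution rates at sites, and the sampling error in an estimated recovery probability caused by a limited number of simulations. This is not the author's simulation code. The function names, the shape parameter value, the mean-1 scaling of the gamma distribution, and the replicate counts are all assumptions chosen for illustration, and the standard error uses the ordinary binomial formula.

```python
import math
import random


def gamma_site_rates(n_sites, alpha, rng):
    """Draw one relative rate per site from a gamma distribution.

    Assumption: the scale is set to 1/alpha so the mean rate is 1;
    the abstract only states that rates follow a gamma distribution.
    """
    return [rng.gammavariate(alpha, 1.0 / alpha) for _ in range(n_sites)]


def recovery_probability(n_correct, n_replicates):
    """Estimate P(true tree recovered) and its binomial standard error."""
    p_hat = n_correct / n_replicates
    se = math.sqrt(p_hat * (1.0 - p_hat) / n_replicates)
    return p_hat, se


rng = random.Random(1)

# Hypothetical shape parameter; small alpha means strong rate variation.
rates = gamma_site_rates(10000, 0.5, rng)
mean_rate = sum(rates) / len(rates)

# Hypothetical outcome: 80 of 100 replicates recover the true topology.
p_hat, se = recovery_probability(80, 100)
```

With these hypothetical counts the standard error is sqrt(0.8 * 0.2 / 100) = 0.04. Differences of a few percentage points between methods would therefore not be distinguishable at 100 replicates, which is why the sampling error from the number of simulations has to be separated from the sampling error due to sequence length.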
Pages: 689-697
Number of pages: 9