Topology testing of phylogenies using least squares methods

Times cited: 8
Authors
Czarna, Aleksandra
Sanjuan, Rafael
Gonzalez-Candelas, Fernando
Wrobel, Borys
Affiliations
[1] Polish Acad Sci, Inst Oceanol, Dept Marine Genet & Biotechnol, PL-81712 Sopot, Poland
[2] Univ Politecn Valencia, CSIC, Inst Biol Mol & Celular Plantas, E-46071 Valencia, Spain
[3] Univ Valencia, Inst Cavanilles Biodivers & Biol Evolut, Valencia, Spain
Keywords
DOI
10.1186/1471-2148-6-105
Chinese Library Classification number
Q [Biological sciences];
Subject classification code
07 ; 0710 ; 09 ;
Abstract
Background: The least squares (LS) method for constructing confidence sets of trees is closely related to LS tree-building methods, in which the criterion for selecting the best topology is the goodness of fit of the distances measured on the tree (patristic distances) to the observed distances between taxa. The generalized LS (GLS) method for topology testing is often hampered by the computational difficulty of calculating the covariance matrix and its inverse, which in practice requires approximations. The weighted LS (WLS) method allows a more efficient, albeit approximate, calculation of the test statistic by ignoring the covariances between the distances.
Results: The goal of this paper is to assess the applicability of the LS approach for constructing confidence sets of trees. We show that the approximations inherent to the WLS method did not negatively affect the accuracy and reliability of the test in the analysis of either biological sequences or DNA-DNA hybridization data (for which character-based testing methods cannot be used). On the other hand, we report several problems with the GLS method, at least for the available implementation. For many data sets of biological sequences, the GLS statistic could not be calculated. For some data sets for which it could, the GLS method included all possible trees in the confidence set despite a strong phylogenetic signal in the data. Finally, in contrast to WLS, GLS showed undercoverage for simulated sequences (frequent non-inclusion of the true tree in the confidence set).
Conclusion: The WLS method provides a computationally efficient approximation to GLS that is especially useful in exploratory analyses of confidence sets of trees, when assessing the phylogenetic signal in the data, and when other methods are not available.
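The difference between the two fit criteria described in the abstract can be illustrated with a minimal Python sketch. It is not the implementation evaluated in the paper: the function names, the toy distances, and the variance values are all assumptions made for illustration, and the sketch computes only the residual quadratic form, not the full test procedure. WLS divides each squared residual by its own variance; GLS uses the full covariance matrix V and evaluates r^T V^{-1} r.

```python
def wls_statistic(observed, patristic, variances):
    """Weighted LS: squared residuals, each divided by its own variance.

    Covariances between distances are ignored.
    """
    return sum((d - p) ** 2 / v
               for d, p, v in zip(observed, patristic, variances))


def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        # The 1e-12 threshold is an arbitrary choice for this sketch.
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular or ill-conditioned")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x


def gls_statistic(observed, patristic, covariance):
    """Generalized LS: r^T V^{-1} r with the full covariance matrix V."""
    residuals = [d - p for d, p in zip(observed, patristic)]
    # Solve V z = r instead of forming the inverse explicitly.
    z = solve(covariance, residuals)
    return sum(r * zi for r, zi in zip(residuals, z))


# Toy data (hypothetical): three pairwise distances among three taxa.
observed = [0.30, 0.45, 0.35]          # observed distances
patristic = [0.28, 0.47, 0.36]         # distances measured on a candidate tree
variances = [0.0004, 0.0009, 0.0005]   # assumed variances of the observed distances

wls = wls_statistic(observed, patristic, variances)

# With a diagonal covariance matrix, GLS reduces to WLS.
diagonal = [[variances[i] if i == j else 0.0 for j in range(3)]
            for i in range(3)]
gls = gls_statistic(observed, patristic, diagonal)
```

In this sketch, a covariance matrix with nonzero off-diagonal entries would make the two values differ, and a singular matrix makes `solve` raise an error, which is one simple way the GLS form can fail to be computable while the WLS form remains defined.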
Pages: 13
Related papers
42 in total
[1]  
Adachi J, 1996, J MOL EVOL, V42, P459
[2]   NEW LOOK AT STATISTICAL-MODEL IDENTIFICATION [J].
AKAIKE, H .
IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 1974, AC19 (06) :716-723
[3]  
[Anonymous], MOL SYSTEMATICS
[4]   Some difficulties of interpretation encountered in the application of the chi-square test [J].
Berkson, J .
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 1938, 33 (203) :526-536
[5]   Molecular epidemiology of a Hepatitis C virus outbreak in a hemodialysis unit [J].
Bracho, MA ;
Gosalbes, MJ ;
Blasco, D ;
Moya, A ;
González-Candelas, F .
JOURNAL OF CLINICAL MICROBIOLOGY, 2005, 43 (06) :2750-2755
[6]  
BULMER M, 1991, MOL BIOL EVOL, V8, P868
[7]   Selection of conserved blocks from multiple alignments for their use in phylogenetic analysis [J].
Castresana, J .
MOLECULAR BIOLOGY AND EVOLUTION, 2000, 17 (04) :540-552
[8]   PHYLOGENETIC ANALYSIS - MODELS AND ESTIMATION PROCEDURES [J].
CAVALLISFORZA, LL ;
EDWARDS, AWF .
EVOLUTION, 1967, 21 (03) :550-+
[9]  
Dayhoff M. O., 1978, ATLAS PROTEIN SEQUEN, P345
[10]   Theoretical foundation of the balanced minimum evolution method of phylogenetic inference and its relationship to weighted least-squares tree fitting [J].
Desper, R ;
Gascuel, O .
MOLECULAR BIOLOGY AND EVOLUTION, 2004, 21 (03) :587-598