Large-scale phylogenies and measuring the performance of phylogenetic estimators

Cited by: 71
Authors
Kim, J [1]
Affiliations
[1] Yale Univ, Dept Biol, New Haven, CT 06511 USA
Keywords
accuracy; consistency; efficiency; large-scale phylogeny; performance
DOI
10.1080/106351598261021
Chinese Library Classification (CLC) number
Q [Biological Sciences]
Discipline classification code
07; 0710; 09
Abstract
Performance measures of phylogenetic estimation methods such as accuracy, consistency, and power are an attempt at summarizing an ensemble of a given estimator's behavior. These summaries characterize an ensemble behavior with a single number, leading to a variety of definitions. In particular, the relationships between different performance measures such as accuracy and consistency or accuracy and error depend on the exact definition of these measures. In addition it is relatively common to use large-sample behavior to infer similar behavior for small samples. In fact, large-sample results such as the claimed asymptotic efficiency of the maximum-likelihood estimator are often uninformative for small samples. Conversely, small-sample behavior using simulations is sometimes used to imply large-sample behavior such as consistency. However, such extrapolation is often difficult. How the performance of a phylogenetic estimator scales with the addition of taxa must be qualified with respect to whether the whole tree is being estimated or a fixed subset of taxa is being estimated. It must also be qualified with respect to how tree models are sampled. Over the ensemble of all possible trees of a given size, the performance of the estimators for the whole tree estimate suffers when the tree size becomes larger. However, under certain models of cladogenesis, the estimate can improve with the addition of taxa. In fact, at all numbers of taxa there are subsets of tree models that are easier to estimate than others. This suggests that with judicious addition or subtraction of taxa we can move from tree models that are more difficult to estimate at one number of taxa to those that are easier to estimate at another number of taxa.
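As a rough illustration of why whole-tree estimation gets harder "over the ensemble of all possible trees of a given size", the sketch below counts the distinct unrooted, fully bifurcating, leaf-labeled topologies for n taxa, which is the standard double-factorial count (2n-5)!!. The function names and the uniform-guess baseline are assumptions introduced here for illustration only; they are not the performance measures defined in the paper.

```python
def num_unrooted_binary_trees(n_taxa: int) -> int:
    """Number of unrooted, fully bifurcating, leaf-labeled topologies.

    Uses the standard count (2n-5)!! = 1 * 3 * 5 * ... * (2n-5), valid for n >= 3.
    """
    if n_taxa < 3:
        raise ValueError("need at least 3 taxa for an unrooted binary tree")
    count = 1
    # Multiply the odd factors 3, 5, ..., 2n-5 (empty product when n == 3).
    for k in range(3, 2 * n_taxa - 4, 2):
        count *= k
    return count


def uniform_guess_accuracy(n_taxa: int) -> float:
    """Hypothetical baseline: chance of picking the true topology uniformly at random."""
    return 1.0 / num_unrooted_binary_trees(n_taxa)


if __name__ == "__main__":
    for n in (4, 5, 10, 20):
        print(n, num_unrooted_binary_trees(n), uniform_guess_accuracy(n))
```

The count grows faster than exponentially in the number of taxa, so any all-or-nothing measure of whole-tree accuracy averaged over every topology is fighting a rapidly expanding search space. This says nothing about restricted subsets of tree models or about accuracy on a fixed subset of taxa, which the abstract treats separately.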
Pages: 43-60
Page count: 18