A genetic algorithm for maximum-likelihood phylogeny inference using nucleotide sequence data

Cited by: 116
Authors
Lewis, PO [1]
Affiliation
[1] Univ New Mexico, Dept Biol, Albuquerque, NM 87131 USA
Keywords
genetic algorithm; phylogeny inference; phylogeny reconstruction; maximum likelihood; nucleotide sequence data; rbcL;
DOI
10.1093/oxfordjournals.molbev.a025924
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010 ; 081704 ;
Abstract
Phylogeny reconstruction is a difficult computational problem, because the number of possible solutions increases with the number of included taxa. For example, for only 14 taxa, there are more than seven trillion possible unrooted phylogenetic trees. For this reason, phylogenetic inference methods commonly use clustering algorithms (e.g., the neighbor-joining method) or heuristic search strategies to minimize the amount of time spent evaluating nonoptimal trees. Even heuristic searches can be painfully slow, especially when computationally intensive optimality criteria such as maximum likelihood are used. I describe here a different approach to heuristic searching (using a genetic algorithm) that can tremendously reduce the time required for maximum-likelihood phylogenetic inference, especially for data sets involving large numbers of taxa. Genetic algorithms are simulations of natural selection in which individuals are encoded solutions to the problem of interest. Here, labeled phylogenetic trees are the individuals, and differential reproduction is effected by allowing the number of offspring produced by each individual to be proportional to that individual's rank likelihood score. Natural selection increases the average likelihood in the evolving population of phylogenetic trees, and the genetic algorithm is allowed to proceed until the likelihood of the best individual ceases to improve over time. An example is presented involving rbcL sequence data for 55 taxa of green plants. The genetic algorithm described here required only 6% of the computational effort required by a conventional heuristic search using tree bisection/reconnection (TBR) branch swapping to obtain the same maximum-likelihood topology.
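The two quantitative ideas in the abstract, the size of the tree space and rank-proportional reproduction, can be illustrated with a minimal Python sketch. It is not the author's implementation. The function names (`num_unrooted_trees`, `num_rooted_trees`, `rank_weights`, `select_parents`) are hypothetical, and the selection step assumes linear ranks (worst = 1, best = N), ties broken by position, and sampling with replacement; the abstract specifies none of these details. The counts use the standard formulas for fully bifurcating labeled trees, (2n-5)!! unrooted and (2n-3)!! rooted. Under those formulas, 14 taxa give 316,234,143,225 unrooted trees, while a count above seven trillion (7,905,853,580,625) is reached for 14 taxa rooted, or equivalently 15 taxa unrooted.

```python
import random
from math import prod


def num_unrooted_trees(n_taxa: int) -> int:
    """Count unrooted, fully bifurcating labeled trees: (2n-5)!! for n >= 3."""
    if n_taxa < 3:
        raise ValueError("need at least 3 taxa")
    # Product of the odd numbers 1, 3, ..., 2n-5.
    return prod(range(1, 2 * n_taxa - 4, 2))


def num_rooted_trees(n_taxa: int) -> int:
    """Count rooted, fully bifurcating labeled trees: (2n-3)!! for n >= 2.

    Rooting adds one extra 'leaf', so this equals the unrooted count for n+1 taxa.
    """
    if n_taxa < 2:
        raise ValueError("need at least 2 taxa")
    return num_unrooted_trees(n_taxa + 1)


def rank_weights(log_likelihoods):
    """Assign each individual its rank: worst log-likelihood -> 1, best -> N.

    Assumption: linear ranks; ties are broken by position in the list.
    """
    order = sorted(range(len(log_likelihoods)), key=lambda i: log_likelihoods[i])
    weights = [0] * len(log_likelihoods)
    for rank, idx in enumerate(order, start=1):
        weights[idx] = rank
    return weights


def select_parents(population, log_likelihoods, rng):
    """Draw the next generation's parents with probability proportional to rank.

    The expected number of offspring of each individual is then proportional
    to its rank, not to the raw likelihood value.
    """
    weights = rank_weights(log_likelihoods)
    return rng.choices(population, weights=weights, k=len(population))


if __name__ == "__main__":
    print(num_unrooted_trees(14))  # 316234143225
    print(num_rooted_trees(14))    # 7905853580625
    rng = random.Random(0)
    trees = ["treeA", "treeB", "treeC"]   # placeholders for encoded trees
    scores = [-10.0, -5.0, -20.0]         # hypothetical log-likelihoods
    print(rank_weights(scores))           # [2, 3, 1]
    print(select_parents(trees, scores, rng))
```

Selecting on rank instead of raw likelihood keeps selection pressure steady even when log-likelihoods across the population differ by only small amounts, which is one common motivation for rank-based schemes.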
Pages: 277-283
Page count: 7