INDELible: A Flexible Simulator of Biological Sequence Evolution

Cited by: 293
Authors
Fletcher, William [1]
Yang, Ziheng
Affiliations
[1] UCL, Dept Genet Evolut & Environm, London, England
Funding
UK Biotechnology and Biological Sciences Research Council (BBSRC)
Keywords
indels; insertion; deletion; simulation; codon models; nonstationary process; AMINO-ACID SUBSTITUTION; MAXIMUM-LIKELIHOOD-ESTIMATION; DETECTING POSITIVE SELECTION; CODON-BASED MODEL; DNA-SEQUENCES; NUCLEOTIDE SUBSTITUTION; MITOCHONDRIAL-DNA; PROTEIN EVOLUTION; HUMAN GENOME; PHYLOGENETIC ESTIMATION;
DOI
10.1093/molbev/msp098
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification codes
071010; 081704
Abstract
Many methods exist for reconstructing phylogenies from molecular sequence data, but few phylogenies are known and can be used to check their efficacy. Simulation remains the most important approach to testing the accuracy and robustness of phylogenetic inference methods. However, current simulation programs are limited, especially concerning realistic models for simulating insertions and deletions. We implement a portable and flexible application, named INDELible, for generating nucleotide, amino acid, and codon sequence data by simulating insertions and deletions (indels) as well as substitutions. Indels are simulated under several models of indel-length distribution. The program implements a rich repertoire of substitution models, including the general unrestricted model and nonstationary nonhomogeneous models of nucleotide substitution; mixture and partition models that account for heterogeneity among sites; and codon models that allow the nonsynonymous/synonymous substitution rate ratio to vary among sites and branches. With its many unique features, INDELible should be useful for evaluating the performance of many inference methods, including those for multiple sequence alignment, phylogenetic tree inference, and ancestral sequence or genome reconstruction.
Pages: 1879-1888
Number of pages: 10