INDELible: A Flexible Simulator of Biological Sequence Evolution

Cited by: 293
Authors
Fletcher, William [1]
Yang, Ziheng
Affiliations
[1] UCL, Dept Genet Evolut & Environm, London, England
Funding
UK Biotechnology and Biological Sciences Research Council (BBSRC)
Keywords
indels; insertion; deletion; simulation; codon models; nonstationary process; AMINO-ACID SUBSTITUTION; MAXIMUM-LIKELIHOOD-ESTIMATION; DETECTING POSITIVE SELECTION; CODON-BASED MODEL; DNA-SEQUENCES; NUCLEOTIDE SUBSTITUTION; MITOCHONDRIAL-DNA; PROTEIN EVOLUTION; HUMAN GENOME; PHYLOGENETIC ESTIMATION;
DOI
10.1093/molbev/msp098
Chinese Library Classification (CLC) codes
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification codes
071010; 081704
Abstract
Many methods exist for reconstructing phylogenies from molecular sequence data, but few phylogenies are known and can be used to check the efficacy of these methods. Simulation remains the most important approach to testing the accuracy and robustness of phylogenetic inference methods. However, current simulation programs are limited, especially concerning realistic models for simulating insertions and deletions. We implement a portable and flexible application, named INDELible, for generating nucleotide, amino acid, and codon sequence data by simulating insertions and deletions (indels) as well as substitutions. Indels are simulated under several models of indel-length distribution. The program implements a rich repertoire of substitution models, including the general unrestricted model and nonstationary nonhomogeneous models of nucleotide substitution, mixture and partition models that account for heterogeneity among sites, and codon models that allow the nonsynonymous/synonymous substitution rate ratio to vary among sites and branches. With its many unique features, INDELible should be useful for evaluating the performance of many inference methods, including those for multiple sequence alignment, phylogenetic tree inference, and ancestral sequence or genome reconstruction.
Pages: 1879-1888
Page count: 10