INDELible: A Flexible Simulator of Biological Sequence Evolution

Cited by: 293
Authors
Fletcher, William [1]
Yang, Ziheng
Affiliations
[1] UCL, Dept Genet Evolut & Environm, London, England
Funding
Biotechnology and Biological Sciences Research Council (UK)
Keywords
indels; insertion; deletion; simulation; codon models; nonstationary process; AMINO-ACID SUBSTITUTION; MAXIMUM-LIKELIHOOD-ESTIMATION; DETECTING POSITIVE SELECTION; CODON-BASED MODEL; DNA-SEQUENCES; NUCLEOTIDE SUBSTITUTION; MITOCHONDRIAL-DNA; PROTEIN EVOLUTION; HUMAN GENOME; PHYLOGENETIC ESTIMATION;
DOI
10.1093/molbev/msp098
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification codes
071010; 081704
Abstract
Many methods exist for reconstructing phylogenies from molecular sequence data, but few phylogenies are known and can be used to check their efficacy. Simulation remains the most important approach to testing the accuracy and robustness of phylogenetic inference methods. However, current simulation programs are limited, especially concerning realistic models for simulating insertions and deletions. We implement a portable and flexible application, named INDELible, for generating nucleotide, amino acid and codon sequence data by simulating insertions and deletions (indels) as well as substitutions. Indels are simulated under several models of indel-length distribution. The program implements a rich repertoire of substitution models, including the general unrestricted model and nonstationary nonhomogeneous models of nucleotide substitution; mixture and partition models that account for heterogeneity among sites; and codon models that allow the nonsynonymous/synonymous substitution rate ratio to vary among sites and branches. With its many unique features, INDELible should be useful for evaluating the performance of many inference methods, including those for multiple sequence alignment, phylogenetic tree inference, and ancestral sequence or genome reconstruction.
Pages: 1879-1888
Page count: 10