QSRA - a quality-value guided de novo short read assembler

Cited by: 29
Authors
Bryant, Douglas W., Jr. [1]
Wong, Weng-Keen [1]
Mockler, Todd C. [2, 3]
Affiliations
[1] Oregon State Univ, Dept Elect Engn & Comp Sci, Corvallis, OR 97331 USA
[2] Oregon State Univ, Dept Bot & Plant Pathol, Corvallis, OR 97331 USA
[3] Oregon State Univ, Ctr Genome Res & Biocomp, Corvallis, OR 97331 USA
Source
BMC BIOINFORMATICS | 2009 / Vol. 10
Keywords
Reference Genome; Contig Length; Longest Contig; Prefix Tree; Winning Base
DOI
10.1186/1471-2105-10-69
Chinese Library Classification
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Background: New rapid high-throughput sequencing technologies have sparked the creation of a new class of assembler. Since all high-throughput sequencing platforms incorporate errors in their output, short-read assemblers must be designed to account for this error while utilizing all available data. Results: We have designed and implemented an assembler, the Quality-value guided Short Read Assembler (QSRA), created to take advantage of quality-value scores as a further method of dealing with error. Compared to previously published algorithms, our assembler shows significant improvements not only in speed but also in output quality. Conclusion: QSRA generally produced the highest genomic coverage, while being faster than VCAKE. QSRA is extremely competitive in its longest contig and N50/N80 contig lengths, producing results of similar quality to those of EDENA and VELVET. QSRA provides a step closer to the goal of de novo assembly of complex genomes, improving upon the original VCAKE algorithm by not only drastically reducing runtimes but also increasing the viability of the assembly algorithm through further error handling capabilities.
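Note on the N50/N80 metrics named in the abstract: the record does not say how they were computed, so the following is only a minimal sketch under a common convention, in which Nx is the length of the shortest contig in the smallest set of longest contigs that together cover at least x% of the total assembled bases. The function name `nx_length` and the sample contig lengths are hypothetical, chosen for illustration; some studies normalize by reference genome size instead of assembly size, which would change the denominator.

```python
def nx_length(contig_lengths, percent):
    """Return the Nx contig length for a list of contig lengths.

    Assumed convention: sort contigs from longest to shortest and
    accumulate lengths until at least `percent` % of the total
    assembled bases is covered; the contig that crosses the
    threshold gives Nx.
    """
    if not contig_lengths:
        raise ValueError("contig_lengths must not be empty")
    if not 0 < percent <= 100:
        raise ValueError("percent must be in (0, 100]")

    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        # Integer comparison avoids floating-point rounding at the threshold.
        if running * 100 >= percent * total:
            return length


# Hypothetical contig lengths (total = 300 bases), for illustration only.
lengths = [80, 70, 50, 40, 30, 20, 10]
n50 = nx_length(lengths, 50)  # 80 + 70 = 150 reaches 50% of 300
n80 = nx_length(lengths, 80)  # 80 + 70 + 50 + 40 = 240 reaches 80% of 300
print(n50, n80)
```

Under this convention N80 can never exceed N50, since covering a larger share of the assembly requires including shorter contigs.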
Pages: 6