Toward almost closed genomes with GapFiller

Cited by: 863
Authors
Boetzer, Marten [1]
Pirovano, Walter [1]
Affiliation
[1] BaseClear BV, NL-2333 CC Leiden, Netherlands
Source
GENOME BIOLOGY | 2012 / Vol. 13 / Issue 06
Keywords
PAIRED READS; SEQUENCE; ASSEMBLIES; ALGORITHMS; ALIGNMENT
DOI
10.1186/gb-2012-13-6-r56
Chinese Library Classification
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology]
Subject classification codes
071005; 0836; 090102; 100705
Abstract
De novo assembly is a commonly used application of next-generation sequencing experiments. The ultimate goal is to puzzle millions of reads into one complete genome, although draft assemblies usually result in a number of gapped scaffold sequences. In this paper we propose an automated strategy, called GapFiller, to reliably close gaps within scaffolds using paired reads. The method shows good results on both bacterial and eukaryotic datasets, allowing only few errors. As a consequence, the amount of additional wetlab work needed to close a genome is drastically reduced. The software is available at http://www.baseclear.com/bioinformatics-tools/.
Pages: 9