Gap5-editing the billion fragment sequence assembly

Cited by: 151
Authors
Bonfield, James K. [1]
Whitwham, Andrew [1]
Affiliations
[1] Wellcome Trust Sanger Inst, Cambridge CB10 1SA, England
Funding
UK Medical Research Council; Wellcome Trust (UK);
Keywords
GENOME; TOOL;
DOI
10.1093/bioinformatics/btq268
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Motivation: Existing sequence assembly editors struggle with the volumes of data now readily available from the latest generation of DNA sequencing instruments. Results: We describe the Gap5 software along with the data structures and algorithms used that allow it to be scalable. We demonstrate this with an assembly of 1.1 billion sequence fragments and compare the performance with several other programs. We analyse the memory, CPU, I/O usage and file sizes used by Gap5.
Pages: 1699-1703
Number of pages: 5