High-throughput DNA sequencing errors are reduced by orders of magnitude using circle sequencing

Times cited: 194
Authors
Lou, Dianne I. [1]
Hussmann, Jeffrey A. [2]
McBee, Ross M. [1]
Acevedo, Ashley [4]
Andino, Raul [4]
Press, William H. [2, 3]
Sawyer, Sara L. [1]
Affiliations
[1] Univ Texas Austin, Dept Mol Biosci, Austin, TX 78712 USA
[2] Univ Texas Austin, Inst Computat Engn & Sci, Austin, TX 78712 USA
[3] Univ Texas Austin, Dept Integrat Biol, Austin, TX 78712 USA
[4] Univ Calif San Francisco, Dept Microbiol & Immunol, San Francisco, CA 94122 USA
Funding
National Institutes of Health (USA)
Keywords
next-generation sequencing; barcoding; rare variants; rare mutations; parallel
DOI
10.1073/pnas.1319590110
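Chinese Library Classification (CLC) number
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences]
Discipline classification code
07; 0710; 09
Abstract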
A major limitation of high-throughput DNA sequencing is the high rate of erroneous base calls produced. For instance, Illumina sequencing machines produce errors at a rate of ~0.1-1 × 10^-2 per base sequenced. These technologies typically produce billions of base calls per experiment, translating to millions of errors. We have developed a unique library preparation strategy, "circle sequencing," which allows for robust downstream computational correction of these errors. In this strategy, DNA templates are circularized, copied multiple times in tandem with a rolling circle polymerase, and then sequenced on any high-throughput sequencing machine. Each read produced is computationally processed to obtain a consensus sequence of all linked copies of the original molecule. Physically linking the copies ensures that each copy is independently derived from the original molecule and allows for efficient formation of consensus sequences. The circle-sequencing protocol precedes standard library preparations and is therefore suitable for a broad range of sequencing applications. We tested our method using the Illumina MiSeq platform and obtained errors in our processed sequencing reads at a rate as low as 7.6 × 10^-6 per base sequenced, dramatically improving the error rate of Illumina sequencing and putting error on par with low-throughput, but highly accurate, Sanger sequencing. Circle sequencing also had substantially higher efficiency and lower cost than existing barcode-based schemes for correcting sequencing errors.
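The consensus step described in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: it assumes the repeat length is already known, that errors are substitutions only (no insertions or deletions), and that a simple per-position majority vote is adequate. The function name `consensus_from_tandem_read` and its parameters `unit_len` and `min_copies` are hypothetical, introduced only for illustration.

```python
from collections import Counter


def consensus_from_tandem_read(read: str, unit_len: int, min_copies: int = 3) -> str:
    """Majority-vote consensus over tandem copies within a single read.

    Assumptions (not taken from the paper): the repeat length `unit_len`
    is known in advance and the copies differ only by substitutions.
    """
    # Cut the read into consecutive full-length copies; a trailing
    # partial copy is ignored.
    copies = [
        read[i:i + unit_len]
        for i in range(0, len(read) - unit_len + 1, unit_len)
    ]
    if len(copies) < min_copies:
        raise ValueError(
            f"need at least {min_copies} full copies, found {len(copies)}"
        )

    consensus = []
    for column in zip(*copies):
        base, count = Counter(column).most_common(1)[0]
        # Call a base only when a strict majority of copies agree;
        # otherwise emit 'N' to mark the position as ambiguous.
        consensus.append(base if 2 * count > len(copies) else "N")
    return "".join(consensus)


if __name__ == "__main__":
    # Three linked copies of the template ACGTTGCA; the second copy
    # carries a single substitution error (T -> A at position 4).
    read = "ACGTTGCA" + "ACGATGCA" + "ACGTTGCA"
    print(consensus_from_tandem_read(read, unit_len=8))  # prints ACGTTGCA
```

Because the copies sit in the same read, no barcode grouping is needed before voting. A practical implementation would also have to infer the repeat length from the read itself and would likely weight votes by base quality, both of which this sketch leaves out.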
Pages: 19872-19877
Number of pages: 6