Sequence and structural variation in a human genome uncovered by short-read, massively parallel ligation sequencing using two-base encoding

Cited by: 343
Authors
McKernan, Kevin Judd [1]
Peckham, Heather E. [1]
Costa, Gina L. [1]
McLaughlin, Stephen F. [1]
Fu, Yutao [1]
Tsung, Eric F. [1]
Clouser, Christopher R. [1]
Duncan, Cisyla [1]
Ichikawa, Jeffrey K. [1]
Lee, Clarence C. [1]
Zhang, Zheng [2]
Ranade, Swati S. [2]
Dimalanta, Eileen T. [1]
Hyland, Fiona C. [2]
Sokolsky, Tanya D. [1]
Zhang, Lei [1]
Sheridan, Andrew [1]
Fu, Haoning [2]
Hendrickson, Cynthia L. [3]
Li, Bin [2]
Kotler, Lev [1]
Stuart, Jeremy R. [1]
Malek, Joel A. [4]
Manning, Jonathan M. [1]
Antipova, Alena A. [1]
Perez, Damon S. [1]
Moore, Michael P. [1]
Hayashibara, Kathleen C. [2]
Lyons, Michael R. [1]
Beaudoin, Robert E. [1]
Coleman, Brittany E. [1]
Laptewicz, Michael W. [1]
Sannicandro, Adam E. [1]
Rhodes, Michael D. [2]
Gottimukkala, Rajesh K. [2]
Yang, Shan [2]
Bafna, Vineet [5]
Bashir, Ali [5]
MacBride, Andrew [6]
Alkan, Can [7]
Kidd, Jeffrey M. [7]
Eichler, Evan E. [7]
Reese, Martin G. [6]
De la Vega, Francisco M. [2]
Blanchard, Alan P. [1]
Affiliations
[1] Life Technol, Beverly, MA 01915 USA
[2] Life Technol, Foster City, CA 94404 USA
[3] New England Biolabs Inc, Ipswich, MA 01938 USA
[4] Weill Cornell Med Coll Qatar, Doha, Qatar
[5] Univ Calif San Diego, La Jolla, CA 92093 USA
[6] Omicia Inc, Emeryville, CA 94608 USA
[7] Univ Washington, Dept Genome Sci, Sch Med, Seattle, WA 98195 USA
Keywords
SINGLE DNA-MOLECULES; DELETION POLYMORPHISM; HAPLOTYPE MAP; HUMAN-DISEASE; COPY-NUMBER; GENES; SNPS; IDENTIFICATION; EXPRESSION; DISCOVERY;
DOI
10.1101/gr.091868.109
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010 ; 081704 ;
Abstract
We describe the genome sequencing of an anonymous individual of African origin using a novel ligation-based sequencing assay that enables a unique form of error correction that improves the raw accuracy of the aligned reads to >99.9%, allowing us to accurately call SNPs with as few as two reads per allele. We collected several billion mate-paired reads yielding ~18× haploid coverage of aligned sequence and close to 300× clone coverage. Over 98% of the reference genome is covered with at least one uniquely placed read, and 99.65% is spanned by at least one uniquely placed mate-paired clone. We identify over 3.8 million SNPs, 19% of which are novel. Mate-paired data are used to physically resolve haplotype phases of nearly two-thirds of the genotypes obtained and produce phased segments of up to 215 kb. We detect 226,529 intra-read indels, 5590 indels between mate-paired reads, 91 inversions, and four gene fusions. We use a novel approach for detecting indels between mate-paired reads that are smaller than the standard deviation of the insert size of the library and discover deletions in common with those detected with our intra-read approach. Dozens of mutations previously described in OMIM and hundreds of nonsynonymous single-nucleotide and structural variants in genes previously implicated in disease are identified in this individual. There is more genetic variation in the human genome still to be uncovered, and we provide guidance for future surveys in populations and cancer biopsies.
Pages: 1527-1541
Page count: 15
Related papers
53 in total
[1]   A haplotype map of the human genome [J].
Altshuler, D ;
Brooks, LD ;
Chakravarti, A ;
Collins, FS ;
Daly, MJ ;
Donnelly, P ;
Gibbs, RA ;
Belmont, JW ;
Boudreau, A ;
Leal, SM ;
Hardenbol, P ;
Pasternak, S ;
Wheeler, DA ;
Willis, TD ;
Yu, FL ;
Yang, HM ;
Zeng, CQ ;
Gao, Y ;
Hu, HR ;
Hu, WT ;
Li, CH ;
Lin, W ;
Liu, SQ ;
Pan, H ;
Tang, XL ;
Wang, J ;
Wang, W ;
Yu, J ;
Zhang, B ;
Zhang, QR ;
Zhao, HB ;
Zhao, H ;
Zhou, J ;
Gabriel, SB ;
Barry, R ;
Blumenstiel, B ;
Camargo, A ;
Defelice, M ;
Faggart, M ;
Goyette, M ;
Gupta, S ;
Moore, J ;
Nguyen, H ;
Onofrio, RC ;
Parkin, M ;
Roy, J ;
Stahl, E ;
Winchester, E ;
Ziaugra, L ;
Shen, Y .
NATURE, 2005, 437 (7063) :1299-1320
[2]   Evaluation of paired-end sequencing strategies for detection of genome rearrangements in cancer [J].
Bashir, Ali ;
Volik, Stanislav ;
Collins, Colin ;
Bafna, Vineet ;
Raphael, Benjamin J. .
PLOS COMPUTATIONAL BIOLOGY, 2008, 4 (04)
[3]   Accurate whole human genome sequencing using reversible terminator chemistry [J].
Bentley, David R. ;
Balasubramanian, Shankar ;
Swerdlow, Harold P. ;
Smith, Geoffrey P. ;
Milton, John ;
Brown, Clive G. ;
Hall, Kevin P. ;
Evers, Dirk J. ;
Barnes, Colin L. ;
Bignell, Helen R. ;
Boutell, Jonathan M. ;
Bryant, Jason ;
Carter, Richard J. ;
Cheetham, R. Keira ;
Cox, Anthony J. ;
Ellis, Darren J. ;
Flatbush, Michael R. ;
Gormley, Niall A. ;
Humphray, Sean J. ;
Irving, Leslie J. ;
Karbelashvili, Mirian S. ;
Kirk, Scott M. ;
Li, Heng ;
Liu, Xiaohai ;
Maisinger, Klaus S. ;
Murray, Lisa J. ;
Obradovic, Bojan ;
Ost, Tobias ;
Parkinson, Michael L. ;
Pratt, Mark R. ;
Rasolonjatovo, Isabelle M. J. ;
Reed, Mark T. ;
Rigatti, Roberto ;
Rodighiero, Chiara ;
Ross, Mark T. ;
Sabot, Andrea ;
Sankar, Subramanian V. ;
Scally, Aylwyn ;
Schroth, Gary P. ;
Smith, Mark E. ;
Smith, Vincent P. ;
Spiridou, Anastassia ;
Torrance, Peta E. ;
Tzonev, Svilen S. ;
Vermaas, Eric H. ;
Walter, Klaudia ;
Wu, Xiaolin ;
Zhang, Lu ;
Alam, Mohammed D. ;
Anastasi, Carole .
NATURE, 2008, 456 (7218) :53-59
[4]   Identification and analysis of functional elements in 1% of the human genome by the ENCODE pilot project [J].
Birney, Ewan ;
Stamatoyannopoulos, John A. ;
Dutta, Anindya ;
Guigo, Roderic ;
Gingeras, Thomas R. ;
Margulies, Elliott H. ;
Weng, Zhiping ;
Snyder, Michael ;
Dermitzakis, Emmanouil T. ;
Stamatoyannopoulos, John A. ;
Thurman, Robert E. ;
Kuehn, Michael S. ;
Taylor, Christopher M. ;
Neph, Shane ;
Koch, Christoph M. ;
Asthana, Saurabh ;
Malhotra, Ankit ;
Adzhubei, Ivan ;
Greenbaum, Jason A. ;
Andrews, Robert M. ;
Flicek, Paul ;
Boyle, Patrick J. ;
Cao, Hua ;
Carter, Nigel P. ;
Clelland, Gayle K. ;
Davis, Sean ;
Day, Nathan ;
Dhami, Pawandeep ;
Dillon, Shane C. ;
Dorschner, Michael O. ;
Fiegler, Heike ;
Giresi, Paul G. ;
Goldy, Jeff ;
Hawrylycz, Michael ;
Haydock, Andrew ;
Humbert, Richard ;
James, Keith D. ;
Johnson, Brett E. ;
Johnson, Ericka M. ;
Frum, Tristan T. ;
Rosenzweig, Elizabeth R. ;
Karnani, Neerja ;
Lee, Kirsten ;
Lefebvre, Gregory C. ;
Navas, Patrick A. ;
Neri, Fidencio ;
Parker, Stephen C. J. ;
Sabo, Peter J. ;
Sandstrom, Richard ;
Shafer, Anthony .
NATURE, 2007, 447 (7146) :799-816
[5]   Sequence information can be obtained from single DNA molecules [J].
Braslavsky, I ;
Hebert, B ;
Kartalov, E ;
Quake, SR .
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2003, 100 (07) :3960-3964
[6]   Gene expression analysis by massively parallel signature sequencing (MPSS) on microbead arrays [J].
Brenner, S ;
Johnson, M ;
Bridgham, J ;
Golda, G ;
Lloyd, DH ;
Johnson, D ;
Luo, SJ ;
McCurdy, S ;
Foy, M ;
Ewan, M ;
Roth, R ;
George, D ;
Eletr, S ;
Albrecht, G ;
Vermaas, E ;
Williams, SR ;
Moon, K ;
Burcham, T ;
Pallas, M ;
DuBridge, RB ;
Kirchner, J ;
Fearon, K ;
Mao, J ;
Corcoran, K .
NATURE BIOTECHNOLOGY, 2000, 18 (06) :630-634
[7]   Identification of somatically acquired rearrangements in cancer using genome-wide massively parallel paired-end sequencing [J].
Campbell, Peter J. ;
Stephens, Philip J. ;
Pleasance, Erin D. ;
O'Meara, Sarah ;
Li, Heng ;
Santarius, Thomas ;
Stebbings, Lucy A. ;
Leroy, Catherine ;
Edkins, Sarah ;
Hardy, Claire ;
Teague, Jon W. ;
Menzies, Andrew ;
Goodhead, Ian ;
Turner, Daniel J. ;
Clee, Christopher M. ;
Quail, Michael A. ;
Cox, Antony ;
Brown, Clive ;
Durbin, Richard ;
Hurles, Matthew E. ;
Edwards, Paul A. W. ;
Bignell, Graham R. ;
Stratton, Michael R. ;
Futreal, P. Andrew .
NATURE GENETICS, 2008, 40 (06) :722-729
[8]   Putative alternative trans-splicing of leukocyte adhesion-GPCR pre-mRNAs generates functional chimeric receptors [J].
Chiu, Pei-Ling ;
Ng, Boon Han ;
Chang, Gin-Wen ;
Gordon, Siamon ;
Lin, Hsi-Hsien .
FEBS LETTERS, 2008, 582 (05) :792-798
[9]   Stem cell transcriptome profiling via massive-scale mRNA sequencing [J].
Cloonan, Nicole ;
Forrest, Alistair R. R. ;
Kolle, Gabriel ;
Gardiner, Brooke B. A. ;
Faulkner, Geoffrey J. ;
Brown, Mellissa K. ;
Taylor, Darrin F. ;
Steptoe, Anita L. ;
Wani, Shivangi ;
Bethel, Graeme ;
Robertson, Alan J. ;
Perkins, Andrew C. ;
Bruce, Stephen J. ;
Lee, Clarence C. ;
Ranade, Swati S. ;
Peckham, Heather E. ;
Manning, Jonathan M. ;
McKernan, Kevin J. ;
Grimmond, Sean M. .
NATURE METHODS, 2008, 5 (07) :613-619
[10]   Transforming single DNA molecules into fluorescent magnetic particles for detection and enumeration of genetic variations [J].
Dressman, D ;
Yan, H ;
Traverso, G ;
Kinzler, KW ;
Vogelstein, B .
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2003, 100 (15) :8817-8822