Accurate and comprehensive sequencing of personal genomes

Cited by: 137
Authors
Ajay, Subramanian S. [1 ]
Parker, Stephen C. J. [1 ]
Abaan, Hatice Ozel [1 ]
Fajardo, Karin V. Fuentes [2 ]
Margulies, Elliott H. [1 ]
Affiliations
[1] NHGRI, Genome Informat Sect, Genome Technol Branch, NIH, Bethesda, MD 20892 USA
[2] NHGRI, Undiagnosed Dis Program, Off Clin Director, NIH, Bethesda, MD 20892 USA
Funding
National Institutes of Health (USA);
Keywords
SHORT-READ; DISCOVERY; PATIENT;
DOI
10.1101/gr.123638.111
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010 ; 081704 ;
Abstract
As whole-genome sequencing becomes commoditized and we begin to sequence and analyze personal genomes for clinical and diagnostic purposes, it is necessary to understand what constitutes a complete sequencing experiment for determining genotypes and detecting single-nucleotide variants. Here, we show that the current recommendation of ~30x coverage is not adequate to produce genotype calls across a large fraction of the genome with acceptably low error rates. Our results are based on analyses of a clinical sample sequenced on two related Illumina platforms, GAIIx and HiSeq 2000, to a very high depth (126x). We used these data to establish genotype-calling filters that dramatically increase accuracy. We also empirically determined how the callable portion of the genome varies as a function of the amount of sequence data used. These results help provide a "sequencing guide" for future whole-genome sequencing decisions and metrics by which coverage statistics should be reported.
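As a rough illustration of why mean coverage alone can be a misleading summary, the sketch below computes the fraction of positions expected to reach a minimum read depth when per-base depth follows an idealized Poisson distribution. This is not the authors' method: the abstract describes an empirical determination of the callable genome, and real data deviate from Poisson because of mappability and amplification bias. The function name `poisson_fraction_at_least` and the depth threshold of 20 reads are hypothetical choices made only for this example.

```python
import math


def poisson_fraction_at_least(mean_depth: float, min_depth: int) -> float:
    """Return P(X >= min_depth) for X ~ Poisson(mean_depth).

    Idealized model only: assumes reads land uniformly at random,
    which real sequencing data do not satisfy.
    """
    term = math.exp(-mean_depth)  # P(X = 0)
    below = 0.0                   # accumulates P(X < min_depth)
    for k in range(min_depth):
        below += term
        term *= mean_depth / (k + 1)  # P(X = k + 1) from P(X = k)
    return 1.0 - below


if __name__ == "__main__":
    # Hypothetical threshold: require at least 20 reads at a position.
    for depth in (30, 60, 126):
        frac = poisson_fraction_at_least(depth, 20)
        print(f"mean depth {depth}x: fraction with >= 20 reads = {frac:.4f}")
```

Under this model the fraction of positions meeting a fixed threshold rises with mean depth, which is the qualitative relationship the abstract says was measured empirically.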
Pages: 1498-1505
Page count: 8