Genotype and SNP calling from next-generation sequencing data

Cited by: 952
Authors
Nielsen, Rasmus [1, 2, 3]
Paul, Joshua S. [4]
Albrechtsen, Anders [2]
Song, Yun S. [3, 4]
Affiliations
[1] Univ Calif Berkeley, Dept Integrat Biol, Berkeley, CA 94720 USA
[2] Univ Copenhagen, Ctr Bioinformat, DK-2100 Copenhagen O, Denmark
[3] Univ Calif Berkeley, Dept Stat, Berkeley, CA 94720 USA
[4] Univ Calif Berkeley, Dept Elect Engn & Comp Sci, Berkeley, CA 94720 USA
Funding
US National Science Foundation;
Keywords
GENOME-WIDE ASSOCIATION; SHORT READ ALIGNMENT; HIGH-THROUGHPUT; REVEALS; IMPUTATION; INFERENCE; ULTRAFAST; GRAPHS; ERROR; PHASE;
DOI
10.1038/nrg2986
Chinese Library Classification number
Q3 [Genetics];
Subject classification codes
071007 ; 090102 ;
Abstract
Meaningful analysis of next-generation sequencing (NGS) data, which are produced extensively by genetics and genomics studies, relies crucially on the accurate calling of SNPs and genotypes. Recently developed statistical methods both improve and quantify the considerable uncertainty associated with genotype calling, and will especially benefit the growing number of studies using low- to medium-coverage data. We review these methods and provide a guide for their use in NGS studies.
Pages: 443 - 451
Page count: 9
Related papers
53 in total
  • [1] A map of human genome variation from population-scale sequencing
    Altshuler, David
    Durbin, Richard M.
    Abecasis, Goncalo R.
    Bentley, David R.
    Chakravarti, Aravinda
    Clark, Andrew G.
    Collins, Francis S.
    De la Vega, Francisco M.
    Donnelly, Peter
    Egholm, Michael
    Flicek, Paul
    Gabriel, Stacey B.
    Gibbs, Richard A.
    Knoppers, Bartha M.
    Lander, Eric S.
    Lehrach, Hans
    Mardis, Elaine R.
    McVean, Gil A.
    Nickerson, Debbie A.
    Peltonen, Leena
    Schafer, Alan J.
    Sherry, Stephen T.
    Wang, Jun
    Wilson, Richard K.
    Gibbs, Richard A.
    Deiros, David
    Metzker, Mike
    Muzny, Donna
    Reid, Jeff
    Wheeler, David
    Wang, Jun
    Li, Jingxiang
    Jian, Min
    Li, Guoqing
    Li, Ruiqiang
    Liang, Huiqing
    Tian, Geng
    Wang, Bo
    Wang, Jian
    Wang, Wei
    Yang, Huanming
    Zhang, Xiuqing
    Zheng, Huisong
    Lander, Eric S.
    Altshuler, David L.
    Ambrogio, Lauren
    Bloom, Toby
    Cibulskis, Kristian
    Fennell, Tim J.
    Gabriel, Stacey B.
    [J]. NATURE, 2010, 467 (7319) : 1061 - 1073
  • [2] Quality scores and SNP detection in sequencing-by-synthesis systems
    Brockman, William
    Alvarez, Pablo
    Young, Sarah
    Garber, Manuel
    Giannoukos, Georgia
    Lee, William L.
    Russ, Carsten
    Lander, Eric S.
    Nusbaum, Chad
    Jaffe, David B.
    [J]. GENOME RESEARCH, 2008, 18 (05) : 763 - 770
  • [3] Rapid and accurate haplotype phasing and missing-data inference for whole-genome association studies by use of localized haplotype clustering
    Browning, Sharon R.
    Browning, Brian L.
    [J]. AMERICAN JOURNAL OF HUMAN GENETICS, 2007, 81 (05) : 1084 - 1097
  • [4] BURROWS M, 1994, HP LABS TECHNICAL RE, V124
  • [5] ALLPATHS: De novo assembly of whole-genome shotgun microreads
    Butler, Jonathan
    MacCallum, Iain
    Kleber, Michael
    Shlyakhter, Ilya A.
    Belmonte, Matthew K.
    Lander, Eric S.
    Nusbaum, Chad
    Jaffe, David B.
    [J]. GENOME RESEARCH, 2008, 18 (05) : 810 - 820
  • [6] De novo fragment assembly with short mate-paired reads: Does the read length matter?
    Chaisson, Mark J.
    Brinza, Dumitru
    Pevzner, Pavel A.
    [J]. GENOME RESEARCH, 2009, 19 (02) : 336 - 346
  • [7] Imputation methods to improve inference in SNP association studies
    Dai, James Y.
    Ruczinski, Ingo
    LeBlanc, Michael
    Kooperberg, Charles
    [J]. GENETIC EPIDEMIOLOGY, 2006, 30 (08) : 690 - 702
  • [8] A framework for variation discovery and genotyping using next-generation DNA sequencing data
    DePristo, Mark A.
    Banks, Eric
    Poplin, Ryan
    Garimella, Kiran V.
    Maguire, Jared R.
    Hartl, Christopher
    Philippakis, Anthony A.
    del Angel, Guillermo
    Rivas, Manuel A.
    Hanna, Matt
    McKenna, Aaron
    Fennell, Tim J.
    Kernytsky, Andrew M.
    Sivachenko, Andrey Y.
    Cibulskis, Kristian
    Gabriel, Stacey B.
    Altshuler, David
    Daly, Mark J.
    [J]. NATURE GENETICS, 2011, 43 (05) : 491 - +
  • [9] Base-calling of automated sequencer traces using phred. II. Error probabilities
    Ewing, B
    Green, P
    [J]. GENOME RESEARCH, 1998, 8 (03) : 186 - 194
  • [10] Flicek P, 2009, NAT METHODS, V6, pS6, DOI [10.1038/NMETH.1376, 10.1038/nmeth.1376]