A map of human genome variation from population-scale sequencing

Cited by: 5728
Authors
Altshuler, David
Durbin, Richard M.
Abecasis, Goncalo R. [4]
Bentley, David R. [5]
Chakravarti, Aravinda [6]
Clark, Andrew G. [7]
Collins, Francis S.
De la Vega, Francisco M. [8]
Donnelly, Peter [9]
Egholm, Michael [10]
Flicek, Paul [11]
Gabriel, Stacey B. [1]
Gibbs, Richard A. [12]
Knoppers, Bartha M. [13]
Lander, Eric S. [1]
Lehrach, Hans [14]
Mardis, Elaine R. [15]
McVean, Gil A. [9, 16]
Nickerson, Debbie A. [17]
Peltonen, Leena
Schafer, Alan J. [18]
Sherry, Stephen T. [19]
Wang, Jun [20]
Wilson, Richard K. [15]
Gibbs, Richard A. [12]
Deiros, David [12]
Metzker, Mike [12]
Muzny, Donna [12]
Reid, Jeff [12]
Wheeler, David
Wang, Jun [20]
Li, Jingxiang [20]
Jian, Min [20]
Li, Guoqing [20]
Li, Ruiqiang [20]
Liang, Huiqing [20]
Tian, Geng [20]
Wang, Bo [20]
Wang, Jian [20]
Wang, Wei [20]
Yang, Huanming [20]
Zhang, Xiuqing [20]
Zheng, Huisong [20]
Lander, Eric S. [1]
Altshuler, David L. [1, 3, 32, 33]
Ambrogio, Lauren [1]
Bloom, Toby [1]
Cibulskis, Kristian [1]
Fennell, Tim J. [1]
Gabriel, Stacey B. [1]
Affiliations
[1] Broad Inst MIT & Harvard, Cambridge, MA 02142 USA
[2] Wellcome Trust Sanger Inst, Cambridge CB10 1SA, England
[3] Harvard Univ, Sch Med, Dept Genet, Cambridge, MA 02115 USA
[4] Univ Michigan, Ctr Stat Genet & Biostat, Ann Arbor, MI 48109 USA
[5] Illumina Cambridge Ltd, Saffron Walden CB10 1XL, Essex, England
[6] Johns Hopkins Univ, Sch Med, McKusick Nathans Inst Genet Med, Baltimore, MD 21205 USA
[7] Cornell Univ, Ctr Comparative & Populat Genom, Ithaca, NY 14850 USA
[8] Life Technol, Foster City, CA 94404 USA
[9] Wellcome Trust Ctr Human Genet, Oxford OX3 7BN, England
[10] Pall Corp, Port Washington, NY 11050 USA
[11] European Bioinformat Inst, Cambridge CB10 1SD, England
[12] Baylor Coll Med, Human Genome Sequencing Ctr, Houston, TX 77030 USA
[13] McGill Univ, Ctr Genom & Policy, Montreal, PQ H3A 1A4, Canada
[14] Max Planck Inst Mol Genet, D-14195 Berlin, Germany
[15] Washington Univ, Sch Med, Genome Ctr, St Louis, MO 63108 USA
[16] Univ Oxford, Dept Stat, Oxford OX1 3TG, England
[17] Univ Washington, Sch Med, Dept Genome Sci, Seattle, WA 98195 USA
[18] Wellcome Trust Res Labs, London NW1 2BE, England
[19] US Natl Inst Hlth, Natl Ctr Biotechnol Informat, Bethesda, MD 20892 USA
[20] BGI Shenzhen, Shenzhen 518083, Peoples R China
[21] Life Technol, Beverly, MA 01915 USA
[22] Biotechnol Ctr TU Dresden, Deep Sequencing Grp, D-01307 Dresden, Germany
[23] Univ Kiel, Inst Clin Mol Biol, D-24105 Kiel, Germany
[24] Roche Appl Sci, Branford, CT 06405 USA
[25] Univ Helsinki, Dept Med Genet, Inst Mol Med FIMM, FIN-00290 Helsinki, Finland
[26] Helsinki Univ Hosp, Helsinki 00290, Finland
[27] Agilent Technol, Santa Clara, CA 95051 USA
[28] Boston Coll, Dept Biol, Chestnut Hill, MA 02467 USA
[29] US Natl Inst Hlth, Natl Inst Environm Hlth Sci, Res Triangle Pk, NC 27709 USA
[30] Univ Virginia, Sch Med, Dept Biochem & Mol Genet, Charlottesville, VA 22908 USA
[31] Illumina, San Diego, CA 92121 USA
[32] Brigham & Womens Hosp, Dept Pathol, Boston, MA 02115 USA
[33] Harvard Univ, Sch Med, Boston, MA 02115 USA
[34] Univ Washington, Dept Med, Div Gen Med, Seattle, WA 98195 USA
[35] Harvard Univ, Ctr Syst Biol, Dept Organism & Evolut Biol, Cambridge, MA 02138 USA
[36] Harvard Univ, Sch Publ Hlth, Dept Epidemiol, Boston, MA 02115 USA
[37] Cardiff Univ, Inst Med Genet, Cardiff CF14 4XN, Wales
[38] Univ Calif San Diego, Dept Psychiat, La Jolla, CA 92093 USA
[39] Mt Sinai Sch Med, Seaver Autism Ctr, New York, NY 10029 USA
[40] Mt Sinai Sch Med, Dept Psychiat, New York, NY 10029 USA
[41] Albert Einstein Coll Med, Dept Epidemiol & Populat Hlth, Bronx, NY 10461 USA
[42] Mt Sinai Sch Med, Dept Genet & Genom Sci, New York, NY 10029 USA
[43] Stanford Univ, Dept Genet, Stanford, CA 94305 USA
[44] Univ Arizona, Dept Mol & Cellular Biol, Tucson, AZ 85721 USA
[45] Genome Biol Res Unit, European Mol Biol Lab, D-69117 Heidelberg, Germany
[46] Leiden Univ Med Ctr, Mol Epidemiol Sect, NL-2333 ZA Leiden, Netherlands
[47] Louisiana State Univ, Dept Biol Sci, Baton Rouge, LA 70803 USA
[48] Translat Genom Res Inst, Phoenix, AZ 85004 USA
[49] Univ Calif Santa Cruz, Ctr Biomol Sci & Engn, Santa Cruz, CA 95064 USA
[50] Univ Chicago, Dept Human Genet, Chicago, IL 60637 USA
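Funding
Swiss National Science Foundation; Medical Research Council (UK); Engineering and Physical Sciences Research Council (UK); National Natural Science Foundation of China; Wellcome Trust (UK);
Keywords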
WIDE ASSOCIATION; RARE VARIANTS; GENE; NUCLEOTIDE; PRDM9; IMPUTATION; MOTIF;
DOI
10.1038/nature09534
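Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification code
07; 0710; 09;
Abstract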
The 1000 Genomes Project aims to provide a deep characterization of human genome sequence variation as a foundation for investigating the relationship between genotype and phenotype. Here we present results of the pilot phase of the project, designed to develop and compare different strategies for genome-wide sequencing with high-throughput platforms. We undertook three projects: low-coverage whole-genome sequencing of 179 individuals from four populations; high-coverage sequencing of two mother-father-child trios; and exon-targeted sequencing of 697 individuals from seven populations. We describe the location, allele frequency and local haplotype structure of approximately 15 million single nucleotide polymorphisms, 1 million short insertions and deletions, and 20,000 structural variants, most of which were previously undescribed. We show that, because we have catalogued the vast majority of common variation, over 95% of the currently accessible variants found in any individual are present in this data set. On average, each person is found to carry approximately 250 to 300 loss-of-function variants in annotated genes and 50 to 100 variants previously implicated in inherited disorders. We demonstrate how these results can be used to inform association and functional studies. From the two trios, we directly estimate the rate of de novo germline base substitution mutations to be approximately 10^-8 per base pair per generation. We explore the data with regard to signatures of natural selection, and identify a marked reduction of genetic variation in the neighbourhood of genes, due to selection at linked sites. These methods and public data will support the next phase of human genetic research.
Pages: 1061-1073
Number of pages: 13
Related papers
50 items in total
  • [31] Genotype imputation for genome-wide association studies
    Marchini, Jonathan
    Howie, Bryan
    [J]. NATURE REVIEWS GENETICS, 2010, 11 (07) : 499 - 511
  • [32] An initial map of insertion and deletion (INDEL) variation in the human genome
    Mills, Ryan E.
    Luttig, Christopher T.
    Larkins, Christine E.
    Beauchamp, Adam
    Tsui, Circe
    Pittard, W. Stephen
    Devine, Scott E.
    [J]. GENOME RESEARCH, 2006, 16 (09) : 1182 - 1190
  • [33] MUSUNURU K, 2003, N ENGL J ME IN PRESS
  • [34] A common sequence motif associated with recombination hot spots and genome instability in humans
    Myers, Simon
    Freeman, Colin
    Auton, Adam
    Donnelly, Peter
    McVean, Gil
    [J]. NATURE GENETICS, 2008, 40 (09) : 1124 - 1129
  • [35] Drive Against Hotspot Motifs in Primates Implicates the PRDM9 Gene in Meiotic Recombination
    Myers, Simon
    Bowden, Rory
    Tumian, Afidalina
    Bontrop, Ronald E.
    Freeman, Colin
    MacFie, Tammie S.
    McVean, Gil
    Donnelly, Peter
    [J]. SCIENCE, 2010, 327 (5967) : 876 - 879
  • [36] Nachman MW, 2000, GENETICS, V156, P297
  • [37] Rare Variants of IFIH1, a Gene Implicated in Antiviral Responses, Protect Against Type 1 Diabetes
    Nejentsev, Sergey
    Walker, Neil
    Riches, David
    Egholm, Michael
    Todd, John A.
    [J]. SCIENCE, 2009, 324 (5925) : 387 - 389
  • [38] *NHLBI PROGR GEN A, 2010, SEATTLESNPS
  • [39] Prdm9 Controls Activation of Mammalian Recombination Hotspots
    Parvanov, Emil D.
    Petkov, Petko M.
    Paigen, Kenneth
    [J]. SCIENCE, 2010, 327 (5967) : 835 - 835
  • [40] Analysis of Genetic Inheritance in a Family Quartet by Whole-Genome Sequencing
    Roach, Jared C.
    Glusman, Gustavo
    Smit, Arian F. A.
    Huff, Chad D.
    Hubley, Robert
    Shannon, Paul T.
    Rowen, Lee
    Pant, Krishna P.
    Goodman, Nathan
    Bamshad, Michael
    Shendure, Jay
    Drmanac, Radoje
    Jorde, Lynn B.
    Hood, Leroy
    Galas, David J.
    [J]. SCIENCE, 2010, 328 (5978) : 636 - 639