A map of human genome variation from population-scale sequencing

Cited by: 5728
Authors
Altshuler, David
Durbin, Richard M.
Abecasis, Goncalo R. [4]
Bentley, David R. [5]
Chakravarti, Aravinda [6]
Clark, Andrew G. [7]
Collins, Francis S.
De la Vega, Francisco M. [8]
Donnelly, Peter [9]
Egholm, Michael [10]
Flicek, Paul [11]
Gabriel, Stacey B. [1]
Gibbs, Richard A. [12]
Knoppers, Bartha M. [13]
Lander, Eric S. [1]
Lehrach, Hans [14]
Mardis, Elaine R. [15]
McVean, Gil A. [9, 16]
Nickerson, Debbie A. [17]
Peltonen, Leena
Schafer, Alan J. [18]
Sherry, Stephen T. [19]
Wang, Jun [20]
Wilson, Richard K. [15]
Gibbs, Richard A. [12]
Deiros, David [12]
Metzker, Mike [12]
Muzny, Donna [12]
Reid, Jeff [12]
Wheeler, David
Wang, Jun [20]
Li, Jingxiang [20]
Jian, Min [20]
Li, Guoqing [20]
Li, Ruiqiang [20]
Liang, Huiqing [20]
Tian, Geng [20]
Wang, Bo [20]
Wang, Jian [20]
Wang, Wei [20]
Yang, Huanming [20]
Zhang, Xiuqing [20]
Zheng, Huisong [20]
Lander, Eric S. [1]
Altshuler, David L. [1, 3, 32, 33]
Ambrogio, Lauren [1]
Bloom, Toby [1]
Cibulskis, Kristian [1]
Fennell, Tim J. [1]
Gabriel, Stacey B. [1]
Affiliations
[1] Broad Inst MIT & Harvard, Cambridge, MA 02142 USA
[2] Wellcome Trust Sanger Inst, Cambridge CB10 1SA, England
[3] Harvard Univ, Sch Med, Dept Genet, Cambridge, MA 02115 USA
[4] Univ Michigan, Ctr Stat Genet & Biostat, Ann Arbor, MI 48109 USA
[5] Illumina Cambridge Ltd, Saffron Walden CB10 1XL, Essex, England
[6] Johns Hopkins Univ, Sch Med, McKusick Nathans Inst Genet Med, Baltimore, MD 21205 USA
[7] Cornell Univ, Ctr Comparative & Populat Genom, Ithaca, NY 14850 USA
[8] Life Technol, Foster City, CA 94404 USA
[9] Wellcome Trust Ctr Human Genet, Oxford OX3 7BN, England
[10] Pall Corp, Port Washington, NY 11050 USA
[11] European Bioinformat Inst, Cambridge CB10 1SD, England
[12] Baylor Coll Med, Human Genome Sequencing Ctr, Houston, TX 77030 USA
[13] McGill Univ, Ctr Genom & Policy, Montreal, PQ H3A 1A4, Canada
[14] Max Planck Inst Mol Genet, D-14195 Berlin, Germany
[15] Washington Univ, Sch Med, Genome Ctr, St Louis, MO 63108 USA
[16] Univ Oxford, Dept Stat, Oxford OX1 3TG, England
[17] Univ Washington, Sch Med, Dept Genome Sci, Seattle, WA 98195 USA
[18] Wellcome Trust Res Labs, London NW1 2BE, England
[19] US Natl Inst Hlth, Natl Ctr Biotechnol Informat, Bethesda, MD 20892 USA
[20] BGI Shenzhen, Shenzhen 518083, Peoples R China
[21] Life Technol, Beverly, MA 01915 USA
[22] Biotechnol Ctr TU Dresden, Deep Sequencing Grp, D-01307 Dresden, Germany
[23] Univ Kiel, Inst Clin Mol Biol, D-24105 Kiel, Germany
[24] Roche Appl Sci, Branford, CT 06405 USA
[25] Univ Helsinki, Dept Med Genet, Inst Mol Med FIMM, FIN-00290 Helsinki, Finland
[26] Helsinki Univ Hosp, Helsinki 00290, Finland
[27] Agilent Technol, Santa Clara, CA 95051 USA
[28] Boston Coll, Dept Biol, Chestnut Hill, MA 02467 USA
[29] US Natl Inst Hlth, Natl Inst Environm Hlth Sci, Res Triangle Pk, NC 27709 USA
[30] Univ Virginia, Sch Med, Dept Biochem & Mol Genet, Charlottesville, VA 22908 USA
[31] Illumina, San Diego, CA 92121 USA
[32] Brigham & Womens Hosp, Dept Pathol, Boston, MA 02115 USA
[33] Harvard Univ, Sch Med, Boston, MA 02115 USA
[34] Univ Washington, Dept Med, Div Gen Med, Seattle, WA 98195 USA
[35] Harvard Univ, Ctr Syst Biol, Dept Organism & Evolut Biol, Cambridge, MA 02138 USA
[36] Harvard Univ, Sch Publ Hlth, Dept Epidemiol, Boston, MA 02115 USA
[37] Cardiff Univ, Inst Med Genet, Cardiff CF14 4XN, Wales
[38] Univ Calif San Diego, Dept Psychiat, La Jolla, CA 92093 USA
[39] Mt Sinai Sch Med, Seaver Autism Ctr, New York, NY 10029 USA
[40] Mt Sinai Sch Med, Dept Psychiat, New York, NY 10029 USA
[41] Albert Einstein Coll Med, Dept Epidemiol & Populat Hlth, Bronx, NY 10461 USA
[42] Mt Sinai Sch Med, Dept Genet & Genom Sci, New York, NY 10029 USA
[43] Stanford Univ, Dept Genet, Stanford, CA 94305 USA
[44] Univ Arizona, Dept Mol & Cellular Biol, Tucson, AZ 85721 USA
[45] Genome Biol Res Unit, European Mol Biol Lab, D-69117 Heidelberg, Germany
[46] Leiden Univ Med Ctr, Mol Epidemiol Sect, NL-2333 ZA Leiden, Netherlands
[47] Louisiana State Univ, Dept Biol Sci, Baton Rouge, LA 70803 USA
[48] Translat Genom Res Inst, Phoenix, AZ 85004 USA
[49] Univ Calif Santa Cruz, Ctr Biomol Sci & Engn, Santa Cruz, CA 95064 USA
[50] Univ Chicago, Dept Human Genet, Chicago, IL 60637 USA
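Funding
Swiss National Science Foundation; UK Medical Research Council; UK Engineering and Physical Sciences Research Council; National Natural Science Foundation of China; Wellcome Trust (UK);
Keywords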
WIDE ASSOCIATION; RARE VARIANTS; GENE; NUCLEOTIDE; PRDM9; IMPUTATION; MOTIF;
DOI
10.1038/nature09534
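Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Subject classification codes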
07 ; 0710 ; 09 ;
Abstract
The 1000 Genomes Project aims to provide a deep characterization of human genome sequence variation as a foundation for investigating the relationship between genotype and phenotype. Here we present results of the pilot phase of the project, designed to develop and compare different strategies for genome-wide sequencing with high-throughput platforms. We undertook three projects: low-coverage whole-genome sequencing of 179 individuals from four populations; high-coverage sequencing of two mother-father-child trios; and exon-targeted sequencing of 697 individuals from seven populations. We describe the location, allele frequency and local haplotype structure of approximately 15 million single nucleotide polymorphisms, 1 million short insertions and deletions, and 20,000 structural variants, most of which were previously undescribed. We show that, because we have catalogued the vast majority of common variation, over 95% of the currently accessible variants found in any individual are present in this data set. On average, each person is found to carry approximately 250 to 300 loss-of-function variants in annotated genes and 50 to 100 variants previously implicated in inherited disorders. We demonstrate how these results can be used to inform association and functional studies. From the two trios, we directly estimate the rate of de novo germline base substitution mutations to be approximately 10^-8 per base pair per generation. We explore the data with regard to signatures of natural selection, and identify a marked reduction of genetic variation in the neighbourhood of genes, due to selection at linked sites. These methods and public data will support the next phase of human genetic research.
Pages: 1061-1073
Page count: 13
Related papers
50 in total
  • [41] A map of human genome sequence variation containing 1.42 million single nucleotide polymorphisms
    Sachidanandam, R
    Weissman, D
    Schmidt, SC
    Kakol, JM
    Stein, LD
    Marth, G
    Sherry, S
    Mullikin, JC
    Mortimore, BJ
    Willey, DL
    Hunt, SE
    Cole, CG
    Coggill, PC
    Rice, CM
    Ning, ZM
    Rogers, J
    Bentley, DR
    Kwok, PY
    Mardis, ER
    Yeh, RT
    Schultz, B
    Cook, L
    Davenport, R
    Dante, M
    Fulton, L
    Hillier, L
    Waterston, RH
    McPherson, JD
    Gilman, B
    Schaffner, S
    Van Etten, WJ
    Reich, D
    Higgins, J
    Daly, MJ
    Blumenstiel, B
    Baldwin, J
    Stange-Thomann, NS
    Zody, MC
    Linton, L
    Lander, ES
    Altshuler, D
    [J]. NATURE, 2001, 409 (6822) : 928 - 933
  • [42] Variants within the immunoregulatory CBLB gene are associated with multiple sclerosis
    Sanna, Serena
    Pitzalis, Maristella
    Zoledziewska, Magdalena
    Zara, Ilenia
    Sidore, Carlo
    Murru, Raffaele
    Whalen, Michael B.
    Busonero, Fabio
    Maschio, Andrea
    Costa, Gianna
    Melis, Maria Cristina
    Deidda, Francesca
    Poddie, Fausto
    Morelli, Laura
    Farina, Gabriele
    Li, Yun
    Dei, Mariano
    Lai, Sandra
    Mulas, Antonella
    Cuccuru, Gianmauro
    Porcu, Eleonora
    Liang, Liming
    Zavattari, Patrizia
    Moi, Loredana
    Deriu, Elisa
    Urru, M. Francesca
    Bajorek, Michele
    Satta, Maria Anna
    Cocco, Eleonora
    Ferrigno, Paola
    Sotgiu, Stefano
    Pugliatti, Maura
    Traccis, Sebastiano
    Angius, Andrea
    Melis, Maurizio
    Rosati, Giulio
    Abecasis, Goncalo R.
    Uda, Manuela
    Marrosu, Maria Giovanna
    Schlessinger, David
    Cucca, Francesco
    [J]. NATURE GENETICS, 2010, 42 (06) : 495 - 497
  • [43] Smith JM, 2007, GENET RES, V89, P391, DOI [10.1017/S0016672308009579, 10.1017/S0016672300014634]
  • [44] Population genomics of human gene expression
    Stranger, Barbara E.
    Nica, Alexandra C.
    Forrest, Matthew S.
    Dimas, Antigone
    Bird, Christine P.
    Beazley, Claude
    Ingle, Catherine E.
    Dunning, Mark
    Flicek, Paul
    Koller, Daphne
    Montgomery, Stephen
    Tavare, Simon
    Deloukas, Panos
    Dermitzakis, Emmanouil T.
    [J]. NATURE GENETICS, 2007, 39 (10) : 1217 - 1224
  • [45] DISRUPTION OF A GATA MOTIF IN THE DUFFY GENE PROMOTER ABOLISHES ERYTHROID GENE-EXPRESSION IN DUFFY NEGATIVE INDIVIDUALS
    TOURNAMILLE, C
    COLIN, Y
    CARTRON, JP
    LEVANKIM, C
    [J]. NATURE GENETICS, 1995, 10 (02) : 224 - 228
  • [46] Voight BF, 2006, PLOS BIOL, V4, P446, DOI 10.1371/journal.pbio.0040072
  • [47] The diploid genome sequence of an Asian individual
    Wang, Jun
    Wang, Wei
    Li, Ruiqiang
    Li, Yingrui
    Tian, Geng
    Goodman, Laurie
    Fan, Wei
    Zhang, Junqing
    Li, Jun
    Zhang, Juanbin
    Guo, Yiran
    Feng, Binxiao
    Li, Heng
    Lu, Yao
    Fang, Xiaodong
    Liang, Huiqing
    Du, Zhenglin
    Li, Dong
    Zhao, Yiqing
    Hu, Yujie
    Yang, Zhenzhen
    Zheng, Hancheng
    Hellmann, Ines
    Inouye, Michael
    Pool, John
    Yi, Xin
    Zhao, Jing
    Duan, Jinjie
    Zhou, Yan
    Qin, Junjie
    Ma, Lijia
    Li, Guoqing
    Yang, Zhentao
    Zhang, Guojie
    Yang, Bin
    Yu, Chang
    Liang, Fang
    Li, Wenjie
    Li, Shaochuan
    Li, Dawei
    Ni, Peixiang
    Ruan, Jue
    Li, Qibin
    Zhu, Hongmei
    Liu, Dongyuan
    Lu, Zhike
    Li, Ning
    Guo, Guangwu
    Zhang, Jianguo
    Ye, Jia
    [J]. NATURE, 2008, 456 (7218) : 60 - U1
  • [48] The theory of discovering rare variants via DNA sequencing
    Wendl, Michael C.
    Wilson, Richard K.
    [J]. BMC GENOMICS, 2009, 10 : 485
  • [49] The complete genome of an individual by massively parallel DNA sequencing
    Wheeler, David A.
    Srinivasan, Maithreyan
    Egholm, Michael
    Shen, Yufeng
    Chen, Lei
    McGuire, Amy
    He, Wen
    Chen, Yi-Ju
    Makhijani, Vinod
    Roth, G. Thomas
    Gomes, Xavier
    Tartaro, Karrie
    Niazi, Faheem
    Turcotte, Cynthia L.
    Irzyk, Gerard P.
    Lupski, James R.
    Chinault, Craig
    Song, Xing-zhi
    Liu, Yue
    Yuan, Ye
    Nazareth, Lynne
    Qin, Xiang
    Muzny, Donna M.
    Margulies, Marcel
    Weinstock, George M.
    Gibbs, Richard A.
    Rothberg, Jonathan M.
    [J]. NATURE, 2008, 452 (7189) : 872 - U5
  • [50] Mobile elements create structural variation: Analysis of a complete human genome
    Xing, Jinchuan
    Zhang, Yuhua
    Han, Kyudong
    Salem, Abdel Halim
    Sen, Shurjo K.
    Huff, Chad D.
    Zhou, Qiong
    Kirkness, Ewen F.
    Levy, Samuel
    Batzer, Mark A.
    Jorde, Lynn B.
    [J]. GENOME RESEARCH, 2009, 19 (09) : 1516 - 1526