The diploid genome sequence of an individual human

Cited by: 1154
Authors
Levy, Samuel [1]
Sutton, Granger
Ng, Pauline C.
Feuk, Lars
Halpern, Aaron L.
Walenz, Brian P.
Axelrod, Nelson
Huang, Jiaqi
Kirkness, Ewen F.
Denisov, Gennady
Lin, Yuan
MacDonald, Jeffrey R.
Pang, Andy Wing Chun
Shago, Mary
Stockwell, Timothy B.
Tsiamouri, Alexia
Bafna, Vineet
Bansal, Vikas
Kravitz, Saul A.
Busam, Dana A.
Beeson, Karen Y.
McIntosh, Tina C.
Remington, Karin A.
Abril, Josep F.
Gill, John
Borman, Jon
Rogers, Yu-Hui
Frazier, Marvin E.
Scherer, Stephen W.
Strausberg, Robert L.
Venter, J. Craig
Affiliations
[1] J Craig Venter Inst, Rockville, MD USA
[2] Univ Toronto, Hosp Sick Children, Program Genet & Genom Biol, Toronto, ON M5G 1X8, Canada
[3] Univ Calif San Diego, Dept Comp Sci & Engn, La Jolla, CA 92093 USA
[4] Univ Barcelona, Fac Biol, Dept Genet, Barcelona, Catalonia, Spain
DOI
10.1371/journal.pbio.0050254
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
Presented here is a genome sequence of an individual human. It was produced from ~32 million random DNA fragments, sequenced by Sanger dideoxy technology and assembled into 4,528 scaffolds, comprising 2,810 million bases (Mb) of contiguous sequence with approximately 7.5-fold coverage for any given region. We developed a modified version of the Celera assembler to facilitate the identification and comparison of alternate alleles within this individual diploid genome. Comparison of this genome and the National Center for Biotechnology Information human reference assembly revealed more than 4.1 million DNA variants, encompassing 12.3 Mb. These variants (of which 1,288,319 were novel) included 3,213,401 single nucleotide polymorphisms (SNPs), 53,823 block substitutions (2-206 bp), 292,102 heterozygous insertion/deletion events (indels) (1-571 bp), 559,473 homozygous indels (1-82,711 bp), 90 inversions, as well as numerous segmental duplications and copy number variation regions. Non-SNP DNA variation accounts for 22% of all events identified in the donor, however they involve 74% of all variant bases. This suggests an important role for non-SNP genetic alterations in defining the diploid genome structure. Moreover, 44% of genes were heterozygous for one or more variants. Using a novel haplotype assembly strategy, we were able to span 1.5 Gb of genome sequence in segments >200 kb, providing further precision to the diploid nature of the genome. These data depict a definitive molecular portrait of a diploid human genome that provides a starting point for future genome comparisons and enables an era of individualized genomic information.
Pages: 2113-2144
Page count: 32
Related papers
106 in total
  • [1] A haplotype map of the human genome
    Altshuler, D
    Brooks, LD
    Chakravarti, A
    Collins, FS
    Daly, MJ
    Donnelly, P
    Gibbs, RA
    Belmont, JW
    Boudreau, A
    Leal, SM
    Hardenbol, P
    Pasternak, S
    Wheeler, DA
    Willis, TD
    Yu, FL
    Yang, HM
    Zeng, CQ
    Gao, Y
    Hu, HR
    Hu, WT
    Li, CH
    Lin, W
    Liu, SQ
    Pan, H
    Tang, XL
    Wang, J
    Wang, W
    Yu, J
    Zhang, B
    Zhang, QR
    Zhao, HB
    Zhao, H
    Zhou, J
    Gabriel, SB
    Barry, R
    Blumenstiel, B
    Camargo, A
    Defelice, M
    Faggart, M
    Goyette, M
    Gupta, S
    Moore, J
    Nguyen, H
    Onofrio, RC
    Parkin, M
    Roy, J
    Stahl, E
    Winchester, E
    Ziaugra, L
    Shen, Y
    [J]. NATURE, 2005, 437 (7063) : 1299 - 1320
  • [2] [Anonymous], 1989, Molecular Cloning
  • [3] Association between a functional variant of the KLOTHO gene and high-density lipoprotein cholesterol, blood pressure, stroke, and longevity
    Arking, DE
    Atzmon, G
    Arking, A
    Barzilai, N
    Dietz, HC
    [J]. CIRCULATION RESEARCH, 2005, 96 (04) : 412 - 418
  • [4] Polynomial and APX-hard cases of the individual haplotyping problem
    Bafna, V
    Istrail, S
    Lancia, G
    Rizzi, R
    [J]. THEORETICAL COMPUTER SCIENCE, 2005, 335 (01) : 109 - 125
  • [5] Recent segmental duplications in the human genome
    Bailey, JA
    Gu, ZP
    Clark, RA
    Reinert, K
    Samonte, RV
    Schwartz, S
    Adams, MD
    Myers, EW
    Li, PW
    Eichler, EE
    [J]. SCIENCE, 2002, 297 (5583) : 1003 - 1007
  • [6] Microdeletions and microinsertions causing human genetic disease: Common mechanisms of mutagenesis and the role of local DNA sequence complexity
    Ball, EV
    Stenson, PD
    Abeysinghe, SS
    Krawczak, M
    Cooper, DN
    Chuzhanova, NA
    [J]. HUMAN MUTATION, 2005, 26 (03) : 205 - 213
  • [7] Alu repeats and human genomic diversity
    Batzer, MA
    Deininger, PL
    [J]. NATURE REVIEWS GENETICS, 2002, 3 (05) : 370 - 379
  • [8] Molecular characterization of the human peroxisomal branched-chain acyl-CoA oxidase: cDNA cloning, chromosomal assignment, tissue distribution, and evidence for the absence of the protein in Zellweger syndrome
    Baumgart, E
    Vanhooren, JCT
    Fransen, M
    Marynen, P
    Puype, M
    Vandekerckhove, J
    Leunissen, JAM
    Fahimi, HD
    Mannaerts, GP
    VanVeldhoven, PP
    [J]. PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 1996, 93 (24) : 13748 - 13753
  • [9] Tandem repeats finder: a program to analyze DNA sequences
    Benson, G
    [J]. NUCLEIC ACIDS RESEARCH, 1999, 27 (02) : 573 - 580
  • [10] Comprehensive identification and characterization of diallelic insertion-deletion polymorphisms in 330 human candidate genes
    Bhangale, TR
    Rieder, MJ
    Livingston, RJ
    Nickerson, DA
    [J]. HUMAN MOLECULAR GENETICS, 2005, 14 (01) : 59 - 69