Multi-platform discovery of haplotype-resolved structural variation in human genomes

Cited by: 581
Authors
Chaisson, Mark J. P. [1, 2]
Sanders, Ashley D. [3]
Zhao, Xuefang [4, 5]
Malhotra, Ankit [6]
Porubsky, David [7, 8, 9]
Rausch, Tobias [3]
Gardner, Eugene J. [10]
Rodriguez, Oscar L. [11]
Guo, Li [12, 13, 14]
Collins, Ryan L. [5, 15]
Fan, Xian [16]
Wen, Jia [17]
Handsaker, Robert E. [18, 19, 20]
Fairley, Susan [21]
Kronenberg, Zev N. [1]
Kong, Xiangmeng [22, 23]
Hormozdiari, Fereydoun [24, 25]
Lee, Dillon [26, 27]
Wenger, Aaron M. [28]
Hastie, Alex R. [29]
Antaki, Danny [30]
Anantharaman, Thomas [29]
Audano, Peter A. [1]
Brand, Harrison [5]
Cantsilieris, Stuart [1]
Cao, Han [29]
Cerveira, Eliza [6]
Chen, Chong [16]
Chen, Xintong [10]
Chin, Chen-Shan [28]
Chong, Zechen [16]
Chuang, Nelson T. [10]
Lambert, Christine C. [28]
Church, Deanna M. [31]
Clarke, Laura [21]
Farrell, Andrew [26, 27]
Flores, Joey [32]
Galeev, Timur [22, 23]
Gorkin, David U. [33, 34]
Gujral, Madhusudan [30]
Guryev, Victor [7]
Heaton, William Haynes [31]
Korlach, Jonas [28]
Kumar, Sushant [22, 23]
Kwon, Jee Young [6, 35]
Lam, Ernest T. [29]
Lee, Jong Eun [36]
Lee, Joyce [29]
Lee, Wan-Ping [6]
Lee, Sau Peng [37]
Affiliations
[1] Univ Washington, Sch Med, Dept Genome Sci, Seattle, WA 98195 USA
[2] Univ Southern Calif, Quantitat & Computat Biol, Los Angeles, CA 90089 USA
[3] European Mol Biol Lab, Genome Biol Unit, D-69117 Heidelberg, Germany
[4] Univ Michigan, Dept Computat Med & Bioinformat, Ann Arbor, MI 48109 USA
[5] Harvard Med Sch, Dept Neurol, Massachusetts Gen Hosp, Ctr Genom Med, Boston, MA 02114 USA
[6] Jackson Lab Genom Med, Farmington, CT 06032 USA
[7] Univ Groningen, Univ Med Ctr Groningen, European Res Inst Biol Ageing, NL-9713 AV Groningen, Netherlands
[8] Saarland Univ, Ctr Bioinformat, D-66123 Saarbrucken, Germany
[9] Max Planck Inst Informat, D-66123 Saarbrucken, Germany
[10] Univ Maryland, Sch Med, Inst Genome Sci, Baltimore, MD 21201 USA
[11] Icahn Sch Med Mt Sinai, Dept Genet & Genom Sci, New York, NY 10029 USA
[12] Xi An Jiao Tong Univ, Sch Life Sci & Technol, Xian 710049, Shaanxi, Peoples R China
[13] Xi An Jiao Tong Univ, MOE Key Lab Intelligent Networks & Networks Secur, Sch Elect & Informat Engn, Xian 710049, Shaanxi, Peoples R China
[14] Xi An Jiao Tong Univ, Ye Lab Omics & Omics Informat, Xian 710049, Shaanxi, Peoples R China
[15] Harvard Med Sch, Program Bioinformat & Integrat Genom, Boston, MA 02115 USA
[16] Univ Texas MD Anderson Canc Ctr, Dept Bioinformat & Computat Biol, Houston, TX 77030 USA
[17] Univ N Carolina, Dept Bioinformat & Genom, Coll Comp & Informat, Charlotte, NC 28223 USA
[18] Harvard Med Sch, Dept Genet, Boston, MA 02115 USA
[19] Broad Inst MIT & Harvard, Stanley Ctr Psychiat Res, Cambridge, MA 02142 USA
[20] Broad Inst MIT & Harvard, Program Med & Populat Genet, Cambridge, MA 02142 USA
[21] European Mol Biol Lab, European Bioinformat Inst, Wellcome Genome Campus, Cambridge CB10 1SD, England
[22] Yale Univ, Sch Med, Computat Biol & Bioinformat Program, New Haven, CT 06520 USA
[23] Yale Univ, Dept Mol Biophys & Biochem, 266 Whitney Ave, New Haven, CT 06520 USA
[24] Univ Calif Davis, Biochem & Mol Med, Davis, CA 95616 USA
[25] Univ Calif Davis, UC Davis Genome Ctr, Davis, CA 95616 USA
[26] Univ Utah, Sch Med, USTAR Ctr Genet Discovery, Salt Lake City, UT 84112 USA
[27] Univ Utah, Dept Human Genet, Sch Med, Salt Lake City, UT 84112 USA
[28] Pacific Biosci, Menlo Pk, CA 94025 USA
[29] Bionano Genom, San Diego, CA 92121 USA
[30] Univ Calif San Diego, Beyster Ctr Genom Psychiat Dis, Dept Psychiat, La Jolla, CA 92093 USA
[31] 10X Genom, Pleasanton, CA 94566 USA
[32] Illumina Inc, Illumina Clin Serv Lab, 5200 Illumina Way, San Diego, CA 92122 USA
[33] Univ Calif San Diego, Dept Cellular & Mol Med, La Jolla, CA 92093 USA
[34] Ludwig Inst Canc Res, La Jolla, CA 92093 USA
[35] Ewha Womans Univ, Dept Grad Studies Life Sci, 52 Ewhayeodae Gil, Seoul 03760, South Korea
[36] DNA Link, Seoul, South Korea
[37] TreeCode Sdn Bhd, Bandar Bot, Klang 41200, Malaysia
[38] Univ Calif San Diego, Bioinformat & Syst Biol Grad Program, La Jolla, CA 92093 USA
[39] Drexel Univ, Sch Biomed Engn, Philadelphia, PA 19104 USA
[40] Univ Texas Hlth Sci Ctr Houston, Ctr Human Genet, Sch Publ Hlth, Houston, TX 77225 USA
[41] Washington Univ, Sch Med, Dept Med, McDonnell Genome Inst,Siteman Canc Ctr, St Louis, MO 63108 USA
[42] Univ Malaya, High Impact Res, Kuala Lumpur 50603, Malaysia
[43] Yale Univ, Dept Comp Sci, 266 Whitney Ave, New Haven, CT 06520 USA
[44] Yale Univ, Dept Stat & Data Sci, 266 Whitney Ave, New Haven, CT 06520 USA
[45] Univ Calif San Francisco, Inst Human Genet, San Francisco, CA 94143 USA
[46] BC Canc Agcy, Terry Fox Lab, Vancouver, BC V5Z 1L3, Canada
[47] Univ British Columbia, Dept Med Genet, Vancouver, BC V6T 1Z4, Canada
[48] Univ Calif San Diego, Dept Pediat, La Jolla, CA 92093 USA
[49] Xi An Jiao Tong Univ, Affiliated Hosp 1, Xian 710061, Shaanxi, Peoples R China
[50] Broad Inst MIT & Harvard, Ctr Mendelian Genom, Cambridge, MA 02142 USA
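Funding
National Key Research and Development Program of China; UK Medical Research Council; US National Science Foundation; US National Institutes of Health; Wellcome Trust (UK);
Keywords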
COPY-NUMBER VARIATION; SEGMENTAL DUPLICATIONS; GENETIC-VARIATION; SPERM CELLS; MICRODELETION; ACCURATE; MAPS; RETROTRANSPOSITION; RECOMBINATION; DIVERSITY;
DOI
10.1038/s41467-018-08148-z
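Chinese Library Classification
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification code
070301 [Inorganic chemistry]; 070403 [Astrophysics]; 070507 [Natural resources and territorial spatial planning]; 090105 [Crop production systems and ecological engineering];
Abstract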
The incomplete identification of structural variants (SVs) from whole-genome sequencing data limits studies of human genetic diversity and disease association. Here, we apply a suite of long-read, short-read, strand-specific sequencing technologies, optical mapping, and variant discovery algorithms to comprehensively analyze three trios to define the full spectrum of human genetic variation in a haplotype-resolved manner. We identify 818,054 indel variants (<50 bp) and 27,622 SVs (≥50 bp) per genome. We also discover 156 inversions per genome, and 58 of the inversions intersect with the critical regions of recurrent microdeletion and microduplication syndromes. Taken together, our SV callsets represent a three- to sevenfold increase in SV detection compared to most standard high-throughput sequencing studies, including those from the 1000 Genomes Project. The methods and the dataset presented serve as a gold standard for the scientific community, allowing us to make recommendations for maximizing structural variation sensitivity for future genome sequencing studies.
Pages: 16