Haplotype-resolved diverse human genomes and integrated analysis of structural variation

Cited by: 424
Authors
Ebert, Peter [1]
Audano, Peter A. [2]
Zhu, Qihui [3]
Rodriguez-Martin, Bernardo [4]
Porubsky, David [2]
Bonder, Marc Jan [4, 5]
Sulovari, Arvis [2]
Ebler, Jana [1]
Zhou, Weichen [6]
Mari, Rebecca Serra [1]
Yilmaz, Feyza [3]
Zhao, Xuefang [7, 8, 9]
Hsieh, PingHsun [2]
Lee, Joyce [10]
Kumar, Sushant [11]
Lin, Jiadong [12]
Rausch, Tobias [4]
Chen, Yu [13, 14]
Ren, Jingwen [15]
Santamarina, Martin [16, 17]
Hops, Wolfram [4]
Ashraf, Hufsah [1]
Chuang, Nelson T. [18]
Yang, Xiaofei [19]
Munson, Katherine M. [2]
Lewis, Alexandra P. [2]
Fairley, Susan [20]
Tallon, Luke J. [18]
Clarke, Wayne E. [21]
Basile, Anna O. [21]
Byrska-Bishop, Marta [21]
Corvelo, Andre [21]
Evani, Uday S. [21]
Lu, Tsung-Yu [15]
Chaisson, Mark J. P. [15]
Chen, Junjie [22]
Li, Chong [22]
Brand, Harrison [7, 8, 9]
Wenger, Aaron M. [23]
Ghareghani, Maryam [1, 24, 25]
Harvey, William T. [2]
Raeder, Benjamin [4]
Hasenfeld, Patrick [4]
Regier, Allison A. [26]
Abel, Haley J. [26]
Hall, Ira M. [27]
Flicek, Paul [20]
Stegle, Oliver [4, 5]
Gerstein, Mark B. [11]
Tubio, Jose M. C. [16, 17]
Affiliations
[1] Heinrich Heine Univ, Med Fac, Inst Med Biometry & Bioinformat, Moorenstr 20, D-40225 Dusseldorf, Germany
[2] Univ Washington, Dept Genome Sci, Sch Med, 3720 15th Ave NE, Seattle, WA 98195 USA
[3] Jackson Lab Genom Med, 10 Discovery Dr, Farmington, CT 06032 USA
[4] European Mol Biol Lab EMBL, Genome Biol Unit, Meyerhofstr 1, D-69117 Heidelberg, Germany
[5] German Canc Res Ctr, Div Computat Genom & Syst Genet, D-69120 Heidelberg, Germany
[6] Univ Michigan, Med Sch, Dept Computat Med & Bioinformat, 100 Washtenaw Ave, Ann Arbor, MI 48109 USA
[7] Harvard Med Sch, Massachusetts Gen Hosp, Ctr Genom Med, Dept Neurol, Boston, MA 02114 USA
[8] Broad Inst MIT & Harvard, Program Med & Populat Genet, Cambridge, MA 02142 USA
[9] Broad Inst MIT & Harvard, Stanley Ctr Psychiat Res, Cambridge, MA 02142 USA
[10] Bionano Genom, San Diego, CA 92121 USA
[11] Yale Univ, Program Computat Biol & Bioinformat, BASS 432 & 437,266 Whitney Ave, New Haven, CT 06520 USA
[12] Xi An Jiao Tong Univ, Fac Elect & Informat Engn, Sch Automat Sci & Engn, Xian 710049, Shaanxi, Peoples R China
[13] Univ Alabama Birmingham, Sch Med, Dept Genet, Birmingham, AL 35294 USA
[14] Univ Alabama Birmingham, Sch Med, Informat Inst, Birmingham, AL 35294 USA
[15] Univ Southern Calif, Dept Quantitat & Computat Biol, Los Angeles, CA 90089 USA
[16] Univ Santiago de Compostela, Ctr Res Mol Med & Chron Dis CIMUS, Genomes & Dis, Santiago De Compostela, Spain
[17] Univ Santiago de Compostela, Dept Zool Genet & Phys Anthropol, Santiago De Compostela, Spain
[18] Univ Maryland, Sch Med, Inst Genome Sci, 670 W Baltimore St, Baltimore, MD 21201 USA
[19] Xi An Jiao Tong Univ, Fac Elect & Informat Engn, Sch Comp Sci & Technol, Xian 710049, Shaanxi, Peoples R China
[20] European Bioinformat Inst, European Mol Biol Lab, Wellcome Genome Campus, Cambridge CB10 1SD, England
[21] New York Genome Ctr, New York, NY 10013 USA
[22] Temple Univ, Dept Comp & Informat Sci, Philadelphia, PA 19122 USA
[23] Pacific Biosci Calif, Menlo Pk, CA 94025 USA
[24] Max Planck Inst Informat, Saarland Informat Campus E1-4, D-66123 Saarbrucken, Germany
[25] Saarland Univ, Saarbrucken Grad Sch Comp Sci, Saarland Informat Campus E1-3, D-66123 Saarbrucken, Germany
[26] Washington Univ, Dept Med, St Louis, MO 63108 USA
[27] Yale Sch Med, Dept Genet, 333 Cedar St, New Haven, CT 06510 USA
[28] Univ Chicago, Genet Genom & Syst Biol, Chicago, IL 60637 USA
[29] Univ Chicago, Dept Med, Sect Genet Med, Chicago, IL 60637 USA
[30] Univ Michigan, Dept Human Genet, 1241 E Catherine St, Ann Arbor, MI 48109 USA
[31] Xi An Jiao Tong Univ, Affiliated Hosp 1, Precis Med Ctr, 277 West Yanta Rd, Xian 710061, Shaanxi, Peoples R China
[32] Ewha Womans Univ, Dept Grad Studies Life Sci, Seoul 120750, South Korea
[33] Univ Washington, Howard Hughes Med Inst, Seattle, WA 98195 USA
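Funding
US National Institutes of Health (NIH); European Research Council (ERC); US National Science Foundation (NSF); Wellcome Trust (UK);
Keywords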
MULTIPLE SEQUENCE ALIGNMENT; GENE-EXPRESSION; TRANSPOSABLE ELEMENTS; TANDEM REPEATS; SVA ELEMENTS; DISCOVERY; RETROTRANSPOSONS; EVOLUTION; ANCESTRY; VARIANTS;
DOI
10.1126/science.abf7117
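Chinese Library Classification (CLC) codes
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences];
Discipline classification codes
070301 [Inorganic Chemistry]; 070403 [Astrophysics]; 070507 [Natural Resources and Territorial Spatial Planning]; 090105 [Crop Production Systems and Ecological Engineering];
Abstract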
Long-read and strand-specific sequencing technologies together facilitate the de novo assembly of high-quality haplotype-resolved human genomes without parent-child trio data. We present 64 assembled haplotypes from 32 diverse human genomes. These highly contiguous haplotype assemblies (average minimum contig length needed to cover 50% of the genome: 26 million base pairs) integrate all forms of genetic variation, even across complex loci. We identified 107,590 structural variants (SVs), of which 68% were not discovered with short-read sequencing, and 278 SV hotspots (spanning megabases of gene-rich sequence). We characterized 130 of the most active mobile element source elements and found that 63% of all SVs arise through homology-mediated mechanisms. This resource enables reliable graph-based genotyping from short reads of up to 50,340 SVs, resulting in the identification of 1526 expression quantitative trait loci as well as SV candidates for adaptive selection within the human population.
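The contiguity figure quoted in the abstract ("minimum contig length needed to cover 50% of the genome") corresponds to what is commonly called the contig N50. The following is a minimal sketch of that calculation, not the authors' pipeline: the function name `n50`, the optional `genome_size` parameter, and the toy contig lengths are all illustrative assumptions. By default the total is taken to be the sum of the contig lengths; passing an explicit genome size instead is an assumption about how the denominator might be chosen.

```python
def n50(contig_lengths, genome_size=None):
    """Return the smallest contig length L such that contigs of length >= L
    together cover at least half of the total.

    genome_size is a hypothetical option: if omitted, the total is the sum
    of the contig lengths (assembly size).
    """
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("no contigs given")
    total = genome_size if genome_size is not None else sum(lengths)
    half = total / 2
    covered = 0
    # Walk from the longest contig downwards until half the total is covered.
    for length in lengths:
        covered += length
        if covered >= half:
            return length
    raise ValueError("contigs cover less than half of genome_size")


# Toy example (made-up lengths, not data from the paper).
toy_contigs = [40, 30, 20, 10]
print(n50(toy_contigs))  # 30: the contigs of length 40 and 30 cover 70 of 100
```

Sorting in descending order means the loop stops at the shortest contig that is still needed to reach the halfway point, which is the quantity the abstract describes.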
Pages: 48 / +
Page count: 220