A diploid assembly-based benchmark for variants in the major histocompatibility complex

Cited by: 52
Authors
Chin, Chen-Shan [1]
Wagner, Justin [2]
Zeng, Qiandong [3]
Garrison, Erik [4]
Garg, Shilpa [5]
Fungtammasan, Arkarachai [1]
Rautiainen, Mikko [6, 7, 8]
Aganezov, Sergey [9]
Kirsche, Melanie [9]
Zarate, Samantha [9]
Schatz, Michael C. [9, 10]
Xiao, Chunlin [11]
Rowell, William J. [12]
Markello, Charles [4]
Farek, Jesse [13]
Sedlazeck, Fritz J. [13]
Bansal, Vikas [14]
Yoo, Byunggil [15]
Miller, Neil [15]
Zhou, Xin [16]
Carroll, Andrew [17]
Barrio, Alvaro Martinez [18]
Salit, Marc [19]
Marschall, Tobias [20]
Dilthey, Alexander T. [21]
Zook, Justin M. [2]
Affiliations
[1] DNAnexus Inc, 1975 W El Camino Real, Suite 204, Mountain View, CA 94040 USA
[2] NIST, Mat Measurement Lab, 100 Bur Dr, MS8312, Gaithersburg, MD 20899 USA
[3] Lab Corp Amer Holdings, 3400 Comp Dr, Westborough, MA 01581 USA
[4] Univ Calif Santa Cruz, 1156 High St, Santa Cruz, CA 95064 USA
[5] Harvard Med Sch, Dept Genet, Boston, MA 02115 USA
[6] Saarland Univ, Ctr Bioinformat, Saarland Informat Campus E2-1, D-66123 Saarbrucken, Germany
[7] Max Planck Inst Informat, Saarland Informat Campus E2-4, D-66123 Saarbrucken, Germany
[8] Saarland Grad Sch Comp Sci, Saarland Informat Campus E2-3, D-66123 Saarbrucken, Germany
[9] Johns Hopkins Univ, Dept Comp Sci, Baltimore, MD 21218 USA
[10] Cold Spring Harbor Lab, Simons Ctr Quantitat Biol, POB 100, Cold Spring Harbor, NY 11724 USA
[11] NIH, Natl Ctr Biotechnol Informat, Natl Lib Med, 8600 Rockville Pike, Bethesda, MD 20894 USA
[12] Pacific Biosci, Menlo Pk, CA 94025 USA
[13] Baylor Coll Med, Human Genome Sequencing Ctr, One Baylor Plaza, Houston, TX 77030 USA
[14] Univ Calif San Diego, Dept Pediat, La Jolla, CA 92093 USA
[15] Childrens Mercy Kansas City, Genom Med Ctr, Kansas City, MO 64108 USA
[16] Stanford Univ, Dept Comp Sci, Stanford, CA 94305 USA
[17] Google Inc, 1600 Amphitheatre Pkwy, Mountain View, CA 94043 USA
[18] 10x Genom, Pleasanton, CA 94588 USA
[19] Joint Initiat Metrol Biol, Stanford, CA 94305 USA
[20] Heinrich Heine Univ Dusseldorf, Inst Med Biometry & Bioinformat, D-40225 Dusseldorf, Germany
[21] Heinrich Heine Univ Dusseldorf, Inst Med Microbiol & Hosp Hyg, D-40225 Dusseldorf, Germany
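Funding
US National Institutes of Health (NIH);
Keywords
MHC HAPLOTYPES; GENOME; RESOURCE;
DOI
10.1038/s41467-020-18564-9
Chinese Library Classification
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification code
070301 [Inorganic chemistry]; 070403 [Astrophysics]; 070507 [Natural resources and territorial spatial planning]; 090105 [Crop production systems and ecological engineering];
Abstract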
Most human genomes are characterized by aligning individual reads to the reference genome, but accurate long reads and linked reads now enable us to construct accurate, phased de novo assemblies. We focus on a medically important, highly variable, 5 million base-pair (bp) region where diploid assembly is particularly useful: the Major Histocompatibility Complex (MHC). Here, we develop a human genome benchmark derived from a diploid assembly for the openly consented Genome in a Bottle sample HG002. We assemble a single contig for each haplotype, align both contigs to the reference, call phased small and structural variants, and define a small variant benchmark for the MHC, covering 94% of the MHC and 22,368 variants smaller than 50 bp, 49% more variants than a mapping-based benchmark. This benchmark reliably identifies errors in mapping-based callsets and enables performance assessment in regions with much denser, more complex variation than regions covered by previous benchmarks. Accurate, phased assemblies are a key tool in understanding the human genome, particularly in highly polymorphic regions like the medically important MHC. Here the authors provide an assembly-based benchmark for this difficult-to-characterize region.
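The abstract separates small variants (smaller than 50 bp) from structural variants. The minimal Python sketch below illustrates that split for a single REF/ALT allele pair. Only the 50 bp cutoff comes from the abstract. The size rule (allele length for substitutions, length difference for insertions and deletions) and the names `variant_size` and `classify_variant` are assumptions made for illustration, not the authors' pipeline.

```python
# Illustrative sketch only. The 50 bp cutoff is taken from the abstract;
# how "size" is measured here is an assumption, not the paper's definition.
SMALL_VARIANT_MAX_BP = 50  # variants smaller than this count as "small"


def variant_size(ref: str, alt: str) -> int:
    """Return a simple size in bp for a variant given its REF and ALT alleles.

    Same-length substitutions (including SNVs) use the allele length;
    insertions and deletions use the absolute length difference.
    """
    if len(ref) == len(alt):
        return len(ref)
    return abs(len(ref) - len(alt))


def classify_variant(ref: str, alt: str) -> str:
    """Label a variant 'small' (< 50 bp) or 'structural' (>= 50 bp)."""
    if variant_size(ref, alt) < SMALL_VARIANT_MAX_BP:
        return "small"
    return "structural"


if __name__ == "__main__":
    print(classify_variant("A", "G"))             # single-base substitution
    print(classify_variant("A", "A" + "T" * 60))  # 60 bp insertion
```

A real benchmark would read alleles from a VCF file and handle multi-allelic and symbolic records, which this sketch leaves out.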
Pages: 9