A general framework for estimating the relative pathogenicity of human genetic variants

Cited by: 4447
Authors
Kircher, Martin [1]
Witten, Daniela M. [2]
Jain, Preti [3]
O'Roak, Brian J. [1]
Cooper, Gregory M. [3]
Shendure, Jay [1]
Affiliations
[1] Univ Washington, Dept Genome Sci, Seattle, WA 98195 USA
[2] Univ Washington, Dept Biostat, Seattle, WA 98195 USA
[3] HudsonAlpha Inst Biotechnol, Huntsville, AL 35806 USA
Keywords
DE-NOVO MUTATIONS; DATABASE; CONSTRAINT; CHROMATIN; ELEMENTS; NETWORK;
DOI
10.1038/ng.2892
Chinese Library Classification (CLC) number
Q3 [Genetics];
Discipline classification code
071007 ; 090102 ;
Abstract
Current methods for annotating and interpreting human genetic variation tend to exploit a single information type (for example, conservation) and/or are restricted in scope (for example, to missense changes). Here we describe Combined Annotation-Dependent Depletion (CADD), a method for objectively integrating many diverse annotations into a single measure (C score) for each variant. We implement CADD as a support vector machine trained to differentiate 14.7 million high-frequency human-derived alleles from 14.7 million simulated variants. We precompute C scores for all 8.6 billion possible human single-nucleotide variants and enable scoring of short insertions-deletions. C scores correlate with allelic diversity, annotations of functionality, pathogenicity, disease severity, experimentally measured regulatory effects and complex trait associations, and they highly rank known pathogenic variants within individual genomes. The ability of CADD to prioritize functional, deleterious and pathogenic variants across many functional categories, effect sizes and genetic architectures is unmatched by any current single-annotation method.
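The training setup described in the abstract (a support vector machine that separates observed human-derived alleles from simulated variants) can be illustrated with a minimal sketch. Everything below is a toy assumption rather than the authors' implementation: the two annotation features, the Gaussian toy data, the label convention (+1 for simulated, -1 for observed), the function names, and the hyperparameters are all hypothetical, and the trainer is a plain stochastic subgradient method on the L2-regularised hinge loss, chosen only to keep the example dependency-free.

```python
import random


def make_toy_variants(n_per_class, rng):
    """Build hypothetical toy data: two made-up annotation features per variant.

    Label +1 = simulated variant, -1 = observed human-derived allele.
    Assumption for illustration only: simulated variants tend to have
    higher feature values than observed ones.
    """
    X, y = [], []
    for _ in range(n_per_class):
        X.append([rng.gauss(1.0, 1.0), rng.gauss(0.5, 1.0)])
        y.append(1)
        X.append([rng.gauss(-1.0, 1.0), rng.gauss(-0.5, 1.0)])
        y.append(-1)
    return X, y


def decision(w, b, x):
    """Linear decision value w.x + b; larger means 'more like simulated'."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def train_linear_svm(X, y, lam=0.01, eta=0.01, epochs=50, seed=0):
    """Fit a linear SVM by stochastic subgradient descent on the hinge loss."""
    rng = random.Random(seed)
    w, b = [0.0] * len(X[0]), 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            # Check the margin before updating the weights.
            violated = y[i] * decision(w, b, X[i]) < 1
            # Shrink step from the L2 regulariser.
            w = [wj - eta * lam * wj for wj in w]
            if violated:
                # Hinge-loss subgradient step for a margin violation.
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
                b += eta * y[i]
    return w, b


rng = random.Random(42)
X, y = make_toy_variants(200, rng)
w, b = train_linear_svm(X, y)
accuracy = sum(
    (decision(w, b, x) > 0) == (label > 0) for x, label in zip(X, y)
) / len(X)
print(f"weights={w}, bias={b:.3f}, training accuracy={accuracy:.2f}")
```

In this sketch the decision value plays the role of a raw score: a variant that looks more like the simulated class than the observed class receives a higher value. How the published C scores are derived from the trained model is not described in this record.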
Pages: 310 / +
Page count: 8