REVEL: An Ensemble Method for Predicting the Pathogenicity of Rare Missense Variants

Cited by: 1532
Authors
Ioannidis, Nilah M. [1, 2]
Rothstein, Joseph H. [2, 3, 4]
Pejaver, Vikas [5]
Middha, Sumit [6]
McDonnell, Shannon K. [7]
Baheti, Saurabh [7]
Musolf, Anthony [8]
Li, Qing [8]
Holzinger, Emily [8]
Karyadi, Danielle [9]
Cannon-Albright, Lisa A. [10]
Teerlink, Craig C. [10]
Stanford, Janet L. [11]
Isaacs, William B. [12]
Xu, Jianfeng [13]
Cooney, Kathleen A. [10, 14, 15]
Lange, Ethan M. [16]
Schleutker, Johanna [17, 18]
Carpten, John D. [19]
Powell, Isaac J. [20]
Cussenot, Olivier [21]
Cancel-Tassin, Geraldine [21]
Giles, Graham G. [22, 23]
MacInnis, Robert J. [22, 23]
Maier, Christiane [24, 25]
Hsieh, Chih-Lin [26]
Wiklund, Fredrik [27]
Catalona, William J. [28]
Foulkes, William D. [29, 30]
Mandal, Diptasri [31]
Eeles, Rosalind A. [32]
Kote-Jarai, Zsofia [32]
Bustamante, Carlos D. [1, 33]
Schaid, Daniel J. [7]
Hastie, Trevor [33, 34]
Ostrander, Elaine A. [9]
Bailey-Wilson, Joan E. [8]
Radivojac, Predrag [5]
Thibodeau, Stephen N. [35]
Whittemore, Alice S. [2, 33]
Sieh, Weiva [2, 3, 4]
Affiliations
[1] Stanford Univ, Dept Genet, Stanford, CA 94305 USA
[2] Stanford Univ, Dept Hlth Res & Policy, Stanford, CA 94305 USA
[3] Icahn Sch Med Mt Sinai, Dept Genet & Genom Sci, New York, NY 10029 USA
[4] Icahn Sch Med Mt Sinai, Dept Populat Hlth Sci & Policy, New York, NY 10029 USA
[5] Indiana Univ, Dept Comp Sci & Informat, Bloomington, IN 47405 USA
[6] Mem Sloan Kettering Canc Ctr, Dept Pathol, New York, NY 10065 USA
[7] Mayo Clin, Dept Hlth Sci Res, Rochester, MN 55905 USA
[8] NHGRI, Computat & Stat Genom Branch, Baltimore, MD 21224 USA
[9] NHGRI, Canc Genet & Comparat Genom Branch, Bethesda, MD 20892 USA
[10] Univ Utah, Sch Med, Dept Internal Med, Salt Lake City, UT 84108 USA
[11] Fred Hutchinson Canc Res Ctr, Div Publ Hlth Sci, Seattle, WA 98109 USA
[12] Johns Hopkins Univ, Sch Med, Brady Urol Inst, Sidney Kimmel Comprehens Canc Ctr, Baltimore, MD 21287 USA
[13] NorthShore Univ HealthSyst Res Inst, Evanston, IL 60201 USA
[14] Univ Michigan, Sch Med, Dept Internal Med, Ann Arbor, MI 48109 USA
[15] Univ Michigan, Sch Med, Dept Urol, Ann Arbor, MI 48109 USA
[16] Univ North Carolina Chapel Hill, Dept Genet, Chapel Hill, NC 27599 USA
[17] Univ Turku, Dept Med Biochem & Genet, Turku 20014, Finland
[18] Turku Univ Hosp, Dept Med Genet Tyks Microbiol & Genet, FIN-20520 Turku, Finland
[19] Translat Genom Res Inst, Integrated Canc Genom Div, Phoenix, AZ 85004 USA
[20] Wayne State Univ, Dept Urol, Detroit, MI 48201 USA
[21] Univ Paris, Ctr Rech Pathol Prostat & Urol, F-75013 Paris, France
[22] Canc Council Victoria, Cancer Epidemiol Ctr, Melbourne, Vic 3004, Australia
[23] Univ Melbourne, Ctr Epidemiol & Biostat, Melbourne, Vic 3010, Australia
[24] Univ Hosp Ulm, Inst Human Genet, D-89075 Ulm, Germany
[25] Univ Hosp Ulm, Dept Urol, D-89075 Ulm, Germany
[26] Univ Southern Calif, Dept Urol, Los Angeles, CA 90033 USA
[27] Karolinska Inst, Dept Med Epidemiol & Biostat, S-17177 Stockholm, Sweden
[28] Northwestern Univ, Dept Urol, Feinberg Sch Med, Chicago, IL 60611 USA
[29] Montreal Gen Hosp, Dept Oncol, Montreal, PQ H3G 1A4, Canada
[30] Montreal Gen Hosp, Dept Human Genet, Montreal, PQ H3G 1A4, Canada
[31] Louisiana State Univ, Hlth Sci Ctr, Dept Genet, New Orleans, LA 70112 USA
[32] Inst Canc Res, Div Genet & Epidemiol, Sutton SM2 5NG, Surrey, England
[33] Stanford Univ, Dept Biomed Data Sci, Stanford, CA 94305 USA
[34] Stanford Univ, Dept Stat, Stanford, CA 94305 USA
[35] Mayo Clin, Dept Lab Med & Pathol, Rochester, MN 55905 USA
Keywords
CLINVAR PUBLIC ARCHIVE; NONSYNONYMOUS SNVS; FUNCTIONAL IMPACT; SEQUENCE VARIANTS; DISEASE; CONSENSUS; FRAMEWORK; CONSEQUENCES; ANNOTATION; MUTATIONS;
DOI
10.1016/j.ajhg.2016.08.016
Chinese Library Classification number
Q3 [Genetics];
Subject classification code
071007 ; 090102 ;
Abstract
The vast majority of coding variants are rare, and assessment of the contribution of rare variants to complex traits is hampered by low statistical power and limited functional data. Improved methods for predicting the pathogenicity of rare coding variants are needed to facilitate the discovery of disease variants from exome sequencing studies. We developed REVEL (rare exome variant ensemble learner), an ensemble method for predicting the pathogenicity of missense variants on the basis of individual tools: MutPred, FATHMM, VEST, PolyPhen, SIFT, PROVEAN, MutationAssessor, MutationTaster, LRT, GERP, SiPhy, phyloP, and phastCons. REVEL was trained with recently discovered pathogenic and rare neutral missense variants, excluding those previously used to train its constituent tools. When applied to two independent test sets, REVEL had the best overall performance (p < 10^-12) as compared to any individual tool and seven ensemble methods: MetaSVM, MetaLR, KGGSeq, Condel, CADD, DANN, and Eigen. Importantly, REVEL also had the best performance for distinguishing pathogenic from rare neutral variants with allele frequencies <0.5%. The area under the receiver operating characteristic curve (AUC) for REVEL was 0.046-0.182 higher in an independent test set of 935 recent SwissVar disease variants and 123,935 putatively neutral exome sequencing variants and 0.027-0.143 higher in an independent test set of 1,953 pathogenic and 2,406 benign variants recently reported in ClinVar than the AUCs for other ensemble methods. We provide pre-computed REVEL scores for all possible human missense variants to facilitate the identification of pathogenic variants in the sea of rare variants discovered as sequencing studies expand in scale.
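The AUC differences quoted in the abstract compare how well two scoring methods separate pathogenic from neutral variants. The following is a minimal sketch of that comparison, not the authors' evaluation code: it computes AUC as the probability that a randomly chosen pathogenic variant scores higher than a randomly chosen neutral one, with ties counted as one half. The function name `auc` and every score value are hypothetical and chosen only for illustration; none of them come from the paper.

```python
def auc(pathogenic_scores, neutral_scores):
    """AUC as P(pathogenic score > neutral score), counting ties as 0.5."""
    wins = 0.0
    for p in pathogenic_scores:
        for n in neutral_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pathogenic_scores) * len(neutral_scores))


# Hypothetical scores in [0, 1] for the same variants under two methods.
# These numbers are invented for illustration and are not from the paper.
ensemble_pathogenic = [0.91, 0.78, 0.66, 0.85]
ensemble_neutral = [0.12, 0.40, 0.70, 0.05, 0.22]
single_pathogenic = [0.80, 0.45, 0.30, 0.90]
single_neutral = [0.20, 0.50, 0.60, 0.10, 0.35]

auc_ensemble = auc(ensemble_pathogenic, ensemble_neutral)
auc_single = auc(single_pathogenic, single_neutral)

# Difference in AUC between the two methods on this toy data.
auc_gain = auc_ensemble - auc_single
print(f"ensemble AUC={auc_ensemble:.2f}, single-tool AUC={auc_single:.2f}, gain={auc_gain:.2f}")
```

The pairwise loop is quadratic in the number of variants, which is fine for a toy example; for test sets the size of those in the abstract, a rank-based computation would be the more practical choice.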
Pages: 877-885
Page count: 9