Prioritizing causal disease genes using unbiased genomic features

Cited by: 37
Authors
Deo, Rahul C. [1, 2, 3, 4, 5]
Musso, Gabriel [5, 6]
Tasan, Murat [5, 7, 8, 9, 10]
Tang, Paul [3]
Poon, Annie [3]
Yuan, Christiana [1]
Felix, Janine F. [11]
Vasan, Ramachandran S. [12, 13, 14, 15]
Beroukhim, Rameen [6, 16, 17]
De Marco, Teresa [2]
Kwok, Pui-Yan [1, 3]
MacRae, Calum A. [5]
Roth, Frederick P. [5, 7, 8, 9, 10, 18, 19, 20]
Affiliations
[1] Univ Calif San Francisco, Cardiovasc Res Inst, San Francisco, CA 94158 USA
[2] Univ Calif San Francisco, Dept Med, San Francisco, CA 94143 USA
[3] Univ Calif San Francisco, Inst Human Genet, San Francisco, CA 94158 USA
[4] Calif Inst Quantitat Biosci, San Francisco, CA 94143 USA
[5] Harvard Med Sch, Dept Biol Chem & Mol Pharmacol, Boston, MA 02115 USA
[6] Harvard Med Sch, Brigham & Womens Hosp, Dept Med, Boston, MA 02115 USA
[7] Univ Toronto, Donnelly Ctr, Toronto, ON M5G 1X5, Canada
[8] Univ Toronto, Dept Mol Genet, Toronto, ON M5G 1X5, Canada
[9] Univ Toronto, Dept Comp Sci, Toronto, ON M5G 1X5, Canada
[10] Mt Sinai Hosp, Lunenfeld Res Inst, Toronto, ON M5G 1X5, Canada
[11] Erasmus Univ, Med Ctr, Dept Epidemiol, POB 2040, NL-3000 CA Rotterdam, Netherlands
[12] Boston Univ, Sch Med, Prevent Med Sect, Boston, MA 02118 USA
[13] Boston Univ, Sch Med, Cardiol Sect, Boston, MA 02118 USA
[14] Boston Univ, Sch Med, Dept Med, Boston, MA 02118 USA
[15] Boston Univ, Sch Med, Framingham Heart Study, Framingham, MA 01702 USA
[16] Dana Farber Canc Inst, Ctr Canc Genome Discovery, Boston, MA 02215 USA
[17] Dana Farber Canc Inst, Dept Canc Biol, Boston, MA 02215 USA
[18] Dana Farber Canc Inst, CCSB, Boston, MA 02215 USA
[19] Dana Farber Canc Inst, Dept Canc Biol, Boston, MA 02215 USA
[20] Canadian Inst Adv Res, Toronto, ON M5G 1Z8, Canada
Source
GENOME BIOLOGY | 2014 / Vol. 15
Keywords
WIDE ASSOCIATION; HEART-FAILURE; DILATED CARDIOMYOPATHY; COMMON VARIANTS; DATABASE; LOCI; RESOURCE; MUTATION; IDENTIFICATION; METAANALYSIS;
DOI
10.1186/s13059-014-0534-8
Chinese Library Classification (CLC) number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology]
Discipline classification codes
071005; 0836; 090102; 100705
Abstract
Background: Cardiovascular disease (CVD) is the leading cause of death in the developed world. Human genetic studies, including genome-wide sequencing and SNP-array approaches, promise to reveal disease genes and mechanisms representing new therapeutic targets. In practice, however, identification of the actual genes contributing to disease pathogenesis has lagged behind identification of associated loci, thus limiting the clinical benefits.
Results: To aid in localizing causal genes, we develop a machine learning approach, Objective Prioritization for Enhanced Novelty (OPEN), which quantitatively prioritizes gene-disease associations based on a diverse group of genomic features. This approach uses only unbiased predictive features and thus is not hampered by a preference towards previously well-characterized genes. We demonstrate success in identifying genetic determinants for CVD-related traits, including cholesterol levels, blood pressure, and conduction system and cardiomyopathy phenotypes. Using OPEN, we prioritize genes, including FLNC, for association with increased left ventricular diameter, which is a defining feature of a prevalent cardiovascular disorder, dilated cardiomyopathy or DCM. Using a zebrafish model, we experimentally validate FLNC and identify a novel FLNC splice-site mutation in a patient with severe DCM.
Conclusion: Our approach stands to assist interpretation of large-scale genetic studies without compromising their fundamentally unbiased nature.
Pages: 19