Strategies and issues in the detection of pathway enrichment in genome-wide association studies

Cited by: 95
Authors
Hong, Mun-Gwan [1]
Pawitan, Yudi [1]
Magnusson, Patrik K. E. [1]
Prince, Jonathan A. [1]
Affiliations
[1] Karolinska Inst, Dept Med Epidemiol & Biostat, S-17177 Stockholm, Sweden
Funding
UK Medical Research Council; US National Institutes of Health
Keywords
COMPLEX HUMAN TRAITS; SUSCEPTIBILITY LOCI; GENE; DISEASE; RISK; PRIORITIZATION; POLYMORPHISM; REPLICATION; ANNOTATION; EXPRESSION
DOI
10.1007/s00439-009-0676-z
Chinese Library Classification (CLC) number
Q3 [Genetics]
Subject classification codes
071007; 090102
Abstract
A fundamental question in human genetics is the degree to which the polygenic character of complex traits derives from polymorphism in genes with similar or with dissimilar functions. The many genome-wide association studies now being performed offer an opportunity to investigate this, and although early attempts are emerging, new tools and modeling strategies still need to be developed and deployed. Towards this goal, we implemented a new algorithm to facilitate the transition from genetic marker lists (principally those generated by PLINK) to pathway analyses of representational gene sets in either threshold or threshold-free downstream applications (e.g., DAVID, GSEA-P, and Ingenuity Pathway Analysis). This was applied to several large genome-wide association studies covering diverse human traits that included type 2 diabetes, Crohn's disease, and plasma lipid levels. Validation of this approach was obtained for plasma HDL levels, where functional categories related to lipid metabolism emerged as the most significant in two independent studies. From analyses of these samples, we highlight and address numerous issues related to this strategy, including appropriate gene-based correction statistics, the utility of imputed versus non-imputed marker sets, and the apparent enrichment of pathways due solely to the positional clustering of functionally related genes. The latter in particular emphasizes the importance of studies that directly tie genetic variation to functional characteristics of specific genes. The freely provided software, which we have called ProxyGeneLD, may resolve an important bottleneck in pathway-based analyses of genome-wide association data. It has allowed us to identify at least one replicable case of pathway enrichment but also to highlight functional gene clustering as a potentially serious problem that may lead to spurious pathway findings if not corrected.
Pages: 289-301
Number of pages: 13