Using the Gene Ontology to Scan Multilevel Gene Sets for Associations in Genome Wide Association Studies

Cited by: 28
Authors
Schaid, Daniel J. [1 ]
Sinnwell, Jason P. [1 ]
Jenkins, Gregory D. [1 ]
McDonnell, Shannon K. [1 ]
Ingle, James N. [2 ]
Kubo, Michiaki [4 ]
Goss, Paul E. [5 ]
Costantino, Joseph P. [6 ]
Wickerham, D. Lawrence [7 ]
Weinshilboum, Richard M. [3 ]
Affiliations
[1] Mayo Clin, Dept Hlth Sci Res, Div Biomed Stat & Informat, Rochester, MN 55905 USA
[2] Mayo Clin, Dept Oncol, Div Med Oncol, Rochester, MN 55905 USA
[3] Mayo Clin, Dept Mol Pharmacol & Expt Therapeut, Div Clin Pharmacol, Rochester, MN 55905 USA
[4] RIKEN Ctr Genom Med, Tokyo, Japan
[5] Harvard Univ, Massachusetts Gen Hosp, Ctr Canc, Boston, MA 02114 USA
[6] Univ Pittsburgh, Dept Biostat, Pittsburgh, PA 15261 USA
[7] Allegheny Gen Hosp, Sect Canc Genet & Prevent, Pittsburgh, PA 15212 USA
Funding
US National Institutes of Health;
Keywords
gene-sets; genome wide association; pathways; score statistics; PATHWAY ANALYSIS; STATISTICAL-METHODS; BREAST-CANCER; MULTIPLE SNPS; REGRESSION; HERITABILITY; TAMOXIFEN; TESTS;
DOI
10.1002/gepi.20632
Chinese Library Classification number
Q3 [Genetics];
Discipline classification code
071007 ; 090102 ;
Abstract
Gene-set analyses have been widely used in gene expression studies, and some of the developed methods have been extended to genome wide association studies (GWAS). Yet complications due to linkage disequilibrium (LD) among single nucleotide polymorphisms (SNPs), and variable numbers of SNPs per gene and genes per gene-set, have plagued current approaches, often leading to ad hoc "fixes." To overcome some of the current limitations, we developed a general approach to scan GWAS SNP data for both gene-level and gene-set analyses, building on score statistics for generalized linear models and taking advantage of the directed acyclic graph structure of the Gene Ontology when creating gene-sets. Other types of gene-set structures can also be used, such as the popular Kyoto Encyclopedia of Genes and Genomes (KEGG). Our approach combines SNPs into genes, and genes into gene-sets, but ensures that positive and negative effects of genes on a trait do not cancel. To control for multiple testing of many gene-sets, we use an efficient computational strategy that accounts for LD and provides accurate step-down adjusted P-values for each gene-set. Application of our methods to two different GWAS provides guidance on the potential strengths and weaknesses of our proposed gene-set analyses. Genet. Epidemiol. 36:3-16, 2012. © 2011 Wiley Periodicals, Inc.
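The abstract mentions step-down adjusted P-values but does not spell out the computation. As a rough illustration only, the sketch below shows one well-known step-down scheme, the Westfall-Young maxT adjustment based on permuted statistics. It is an assumption that this resembles the authors' procedure; the abstract's "efficient computational strategy" may differ, and the function name `step_down_max_t` and the toy numbers are hypothetical.

```python
def step_down_max_t(observed, permuted):
    """Westfall-Young step-down maxT adjusted p-values (illustrative sketch).

    observed: list of m observed gene-set statistics (larger = stronger evidence).
    permuted: list of B lists, each holding the m statistics from one permutation
              of the trait (permuting the trait keeps the LD among SNPs intact).
    Returns a list of m adjusted p-values in the same order as `observed`.
    """
    m = len(observed)
    n_perm = len(permuted)
    # Rank gene-sets from most to least significant.
    order = sorted(range(m), key=lambda j: observed[j], reverse=True)
    exceed = [0] * m
    for row in permuted:
        running_max = float("-inf")
        # Walk from the least to the most significant gene-set, so the running
        # maximum covers only gene-sets ranked at or below the current one.
        for k in range(m - 1, -1, -1):
            j = order[k]
            running_max = max(running_max, row[j])
            if running_max >= observed[j]:
                exceed[k] += 1
    adjusted = [0.0] * m
    previous = 0.0
    for k in range(m):
        # Enforce monotonicity: a less significant gene-set never receives
        # a smaller adjusted p-value than a more significant one.
        p = max(exceed[k] / n_perm, previous)
        adjusted[order[k]] = p
        previous = p
    return adjusted


# Toy example: 3 gene-sets, 4 permutations (made-up numbers).
toy_observed = [10.0, 2.0, 5.0]
toy_permuted = [[1, 3, 2], [6, 1, 1], [2, 2, 12], [1, 1, 1]]
toy_adjusted = step_down_max_t(toy_observed, toy_permuted)
```

In practice the number of permutations would be far larger than in this toy input, and the statistics would come from the gene-set score tests rather than hand-entered values.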
Pages: 3-16
Page count: 14
Related papers
58 in total
[1]   A general modular framework for gene set enrichment analysis [J].
Ackermann, Marit ;
Strimmer, Korbinian .
BMC BIOINFORMATICS, 2009, 10
[2]  
[Anonymous], BOOST GRAPH LIB
[3]   Gene Ontology: tool for the unification of biology [J].
Ashburner, M ;
Ball, CA ;
Blake, JA ;
Botstein, D ;
Butler, H ;
Cherry, JM ;
Davis, AP ;
Dolinski, K ;
Dwight, SS ;
Eppig, JT ;
Harris, MA ;
Hill, DP ;
Issel-Tarver, L ;
Kasarskis, A ;
Lewis, S ;
Matese, JC ;
Richardson, JE ;
Ringwald, M ;
Rubin, GM ;
Sherlock, G .
NATURE GENETICS, 2000, 25 (01) :25-29
[4]   Comparisons of Multi-Marker Association Methods to Detect Association Between a Candidate Region and Disease [J].
Ballard, David H. ;
Cho, Judy ;
Zhao, Hongyu .
GENETIC EPIDEMIOLOGY, 2010, 34 (03) :201-212
[5]   The Gene Ontology in 2010: extensions and refinements The Gene Ontology Consortium [J].
Berardini, Tanya Z. ;
Li, Donghui ;
Huala, Eva ;
Bridges, Susan ;
Burgess, Shane ;
McCarthy, Fiona ;
Carbon, Seth ;
Lewis, Suzanna E. ;
Mungall, Christopher J. ;
Abdulla, Amina ;
Wood, Valerie ;
Feltrin, Erika ;
Valle, Giorgio ;
Chisholm, Rex L. ;
Fey, Petra ;
Gaudet, Pascale ;
Kibbe, Warren ;
Basu, Siddhartha ;
Bushmanova, Yulia ;
Eilbeck, Karen ;
Siegele, Deborah A. ;
McIntosh, Brenley ;
Renfro, Daniel ;
Zweifel, Adrienne ;
Hu, James C. ;
Ashburner, Michael ;
Tweedie, Susan ;
Alam-Faruque, Yasmin ;
Apweiler, Rolf ;
Auchinchloss, Andrea ;
Bairoch, Amos ;
Barrell, Daniel ;
Binns, David ;
Blatter, Marie-Claude ;
Bougueleret, Lydie ;
Boutet, Emmanuel ;
Breuza, Lionel ;
Bridge, Alan ;
Browne, Paul ;
Chan, Wei Mun ;
Coudert, Elizabeth ;
Daugherty, Louise ;
Dimmer, Emily ;
Eberhardt, Ruth ;
Estreicher, Anne ;
Famiglietti, Livia ;
Ferro-Rojas, Serenella ;
Feuermann, Marc ;
Foulger, Rebecca ;
Gruaz-Gumowski, Nadine .
NUCLEIC ACIDS RESEARCH, 2010, 38 :D331-D335
[6]   Prioritizing GWAS Results: A Review of Statistical Methods and Recommendations for Their Application [J].
Cantor, Rita M. ;
Lange, Kenneth ;
Sinsheimer, Janet S. .
AMERICAN JOURNAL OF HUMAN GENETICS, 2010, 86 (01) :6-22
[7]   Analysis of multiple SNPs in a candidate gene or region [J].
Chapman, Juliet ;
Whittaker, John .
GENETIC EPIDEMIOLOGY, 2008, 32 (06) :560-566
[8]   On the Utility of Gene Set Methods in Genomewide Association Studies of Quantitative Traits [J].
Chasman, Daniel I. .
GENETIC EPIDEMIOLOGY, 2008, 32 (07) :658-668
[9]   Pathway-Based Analysis for Genome-Wide Association Studies Using Supervised Principal Components [J].
Chen, Xi ;
Wang, Lily ;
Hu, Bo ;
Guo, Mingsheng ;
Barnard, John ;
Zhu, Xiaofeng .
GENETIC EPIDEMIOLOGY, 2010, 34 (07) :716-724
[10]   Variations in DNA elucidate molecular networks that cause disease [J].
Chen, Yanqing ;
Zhu, Jun ;
Lum, Pek Yee ;
Yang, Xia ;
Pinto, Shirly ;
MacNeil, Douglas J. ;
Zhang, Chunsheng ;
Lamb, John ;
Edwards, Stephen ;
Sieberts, Solveig K. ;
Leonardson, Amy ;
Castellini, Lawrence W. ;
Wang, Susanna ;
Champy, Marie-France ;
Zhang, Bin ;
Emilsson, Valur ;
Doss, Sudheer ;
Ghazalpour, Anatole ;
Horvath, Steve ;
Drake, Thomas A. ;
Lusis, Aldons J. ;
Schadt, Eric E. .
NATURE, 2008, 452 (7186) :429-435