Powerful SNP-Set Analysis for Case-Control Genome-wide Association Studies

Cited by: 456
Authors
Wu, Michael C. [2]
Kraft, Peter [1, 3]
Epstein, Michael P. [4]
Taylor, Deanne M. [1]
Chanock, Stephen J. [5]
Hunter, David J. [3]
Lin, Xihong [1]
Affiliations
[1] Harvard Univ, Sch Publ Hlth, Dept Biostat, Boston, MA 02115 USA
[2] Univ N Carolina, Dept Biostat, Chapel Hill, NC 27599 USA
[3] Harvard Univ, Sch Publ Hlth, Dept Epidemiol, Boston, MA 02115 USA
[4] Emory Univ, Dept Human Genet, Atlanta, GA 30322 USA
[5] NCI, Div Canc Epidemiol & Genet, Bethesda, MD 20892 USA
Funding
National Institutes of Health (NIH), USA;
Keywords
MULTIPLE-TESTING CORRECTION; SCORE TESTS; REGRESSION; LOCI; RISK; HAPLOTYPES; SIMILARITY; TRAITS; MODELS; GENES;
DOI
10.1016/j.ajhg.2010.05.002
Chinese Library Classification (CLC) number
Q3 [Genetics];
Discipline classification code
071007; 090102;
Abstract
Genome-wide association studies (GWAS) have emerged as popular tools for identifying genetic variants that are associated with disease risk. Standard analysis of a case-control GWAS involves assessing the association between each individual genotyped SNP and disease risk. However, this approach suffers from limited reproducibility and difficulties in detecting multi-SNP and epistatic effects. As an alternative analytical strategy, we propose grouping SNPs together into SNP sets on the basis of proximity to genomic features such as genes or haplotype blocks, then testing the joint effect of each SNP set. Testing of each SNP set proceeds via the logistic kernel-machine-based test, which is based on a statistical framework that allows for flexible modeling of epistatic and nonlinear SNP effects. This flexibility and the ability to naturally adjust for covariate effects are important features of our test that make it appealing in comparison to individual-SNP tests and existing multimarker tests. Using simulated data based on the International HapMap Project, we show that SNP-set testing can have improved power over standard individual-SNP analysis under a wide range of settings. In particular, we find that our approach has higher power than individual-SNP analysis when the median correlation between the disease-susceptibility variant and the genotyped SNPs is moderate to high. When the correlation is low, both individual-SNP analysis and SNP-set analysis tend to have low power. We apply SNP-set analysis to the Cancer Genetic Markers of Susceptibility (CGEMS) breast cancer GWAS discovery-phase data.
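As a rough illustration of the SNP-set idea described in the abstract, the sketch below computes a kernel-machine-style score statistic for one SNP set and attaches a p-value to it. It is a minimal sketch under stated assumptions, not the authors' implementation: it assumes a linear kernel, a null model with an intercept only (no covariates), and it swaps in a simple permutation p-value in place of any analytic null distribution. All function names are hypothetical.

```python
import random


def linear_kernel(genotypes):
    """Linear kernel: K[i][j] is the inner product of the SNP vectors
    (minor-allele counts 0/1/2) of individuals i and j."""
    n = len(genotypes)
    return [
        [sum(a * b for a, b in zip(genotypes[i], genotypes[j])) for j in range(n)]
        for i in range(n)
    ]


def score_statistic(y, kernel):
    """Q = (y - mu0)' K (y - mu0) / 2.

    Assumption: the null model has an intercept only, so the fitted null
    probability mu0 is simply the proportion of cases.
    """
    n = len(y)
    mu0 = sum(y) / n
    resid = [yi - mu0 for yi in y]
    return 0.5 * sum(
        resid[i] * kernel[i][j] * resid[j] for i in range(n) for j in range(n)
    )


def permutation_p_value(y, genotypes, n_perm=999, seed=0):
    """Permutation p-value for Q (an illustrative stand-in, valid here only
    because the assumed null model has no covariates)."""
    rng = random.Random(seed)
    kernel = linear_kernel(genotypes)
    observed = score_statistic(y, kernel)
    y_perm = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(y_perm)  # break any genotype-phenotype link
        if score_statistic(y_perm, kernel) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Toy data: 6 individuals, one SNP set of 2 SNPs; 1 = case, 0 = control.
    toy_genotypes = [[2, 1], [1, 1], [2, 0], [0, 0], [0, 1], [1, 0]]
    toy_status = [1, 1, 1, 0, 0, 0]
    q, p = permutation_p_value(toy_status, toy_genotypes, n_perm=199)
    print(f"Q = {q:.3f}, permutation p = {p:.3f}")
```

With covariates, the intercept-only `mu0` would be replaced by fitted probabilities from a null logistic regression, and a plain label permutation would no longer be appropriate.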
Pages: 929-942
Page count: 14