Stability Selection for Genome-Wide Association

Cited by: 49
Authors
Alexander, David H. [1]
Lange, Kenneth [2, 3, 4]
Affiliations
[1] Univ Calif Los Angeles, David Geffen Sch Med, Dept Biomath, Los Angeles, CA 90095 USA
[2] Univ Calif Los Angeles, Dept Biomath, Los Angeles, CA 90095 USA
[3] Univ Calif Los Angeles, Dept Human Genet, Los Angeles, CA 90095 USA
[4] Univ Calif Los Angeles, Dept Stat, Los Angeles, CA 90095 USA
Keywords
genome-wide association; variable selection; stability selection; the lasso; Wellcome Trust Case Control Consortium data; loci; risk
DOI
10.1002/gepi.20623
Chinese Library Classification number
Q3 [Genetics]
Discipline classification code
071007 [Genetics]
Abstract
This article applies the recently proposed "stability selection" procedure of Meinshausen and Bühlmann to the problem of variable selection in genome-wide association. In particular, it explores whether stability selection can identify new regions of interest originally missed or can call into legitimate question regions originally flagged. Our analysis of the seven data sets of the Wellcome Trust Case-Control Consortium suggests that stability selection effectively controls the family-wise error rate but suffers a loss of power. The extensive correlation structure among SNP markers induced by linkage disequilibrium renders the procedure too conservative, causing it to miss regions known to be highly significant from simple marginal analyses. As a remedy, one can aggregate nearby SNPs into groups and select groups rather than individual SNPs. The modified procedure can accurately identify the most important regions of genome-wide association, but in a simulation study it still offers less power than simpler and less computationally intensive methods of marginal association testing. Genet. Epidemiol. 35:722-728, 2011. © 2011 Wiley Periodicals, Inc.
Pages: 722-728
Number of pages: 7
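
Illustrative sketch (not from the article). The abstract describes stability selection only in words, so the following minimal Python example shows the general idea under stated assumptions: repeatedly draw half of the subjects without replacement, run the lasso on each subsample, and keep the SNPs whose selection frequency reaches a threshold. The function names (`soft_threshold`, `lasso_support`, `stability_selection`), the penalty `lam=0.2`, the threshold `0.8`, and the simulated genotypes are all hypothetical choices made for illustration. The sketch also assumes a quantitative trait with squared-error loss, which is a simplification of the case-control setting studied in the article.

```python
import random

def soft_threshold(z, lam):
    """Soft-thresholding operator used in the lasso coordinate update."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0

def lasso_support(X, y, lam, n_sweeps=30):
    """Return indices with nonzero coefficients after coordinate descent.

    Minimizes (1/2n)*||y - X b||^2 + lam*||b||_1 on centered data.
    """
    n, p = len(X), len(X[0])
    y_mean = sum(y) / n
    col_mean = [sum(row[j] for row in X) / n for j in range(p)]
    cols = [[X[i][j] - col_mean[j] for i in range(n)] for j in range(p)]
    resid = [v - y_mean for v in y]
    col_sq = [sum(v * v for v in c) / n for c in cols]
    beta = [0.0] * p
    for _ in range(n_sweeps):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # column is constant in this subsample
            # Covariance of column j with the partial residual (j excluded).
            rho = sum(c * r for c, r in zip(cols[j], resid)) / n
            rho += col_sq[j] * beta[j]
            new = soft_threshold(rho, lam) / col_sq[j]
            delta = new - beta[j]
            if delta != 0.0:
                resid = [r - delta * c for r, c in zip(resid, cols[j])]
                beta[j] = new
    return [j for j in range(p) if beta[j] != 0.0]

def stability_selection(X, y, lam, n_subsamples=40, threshold=0.8, seed=0):
    """Selection frequency of each predictor over random half-samples."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    counts = [0] * p
    for _ in range(n_subsamples):
        idx = rng.sample(range(n), n // 2)  # half the subjects, no replacement
        support = lasso_support([X[i] for i in idx], [y[i] for i in idx], lam)
        for j in support:
            counts[j] += 1
    freq = [c / n_subsamples for c in counts]
    selected = [j for j in range(p) if freq[j] >= threshold]
    return freq, selected

# Simulated data (assumption): 8 independent SNPs coded 0/1/2 with minor
# allele frequency 0.3; only SNPs 0 and 3 affect the trait.
rng = random.Random(1)
n, p = 120, 8
X = [[(rng.random() < 0.3) + (rng.random() < 0.3) for _ in range(p)]
     for _ in range(n)]
y = [1.5 * row[0] - 1.5 * row[3] + rng.gauss(0.0, 1.0) for row in X]

freq, selected = stability_selection(X, y, lam=0.2)
print("selection frequencies:", [round(f, 2) for f in freq])
print("stable set:", selected)
```

The simulated SNPs here are independent, so the sketch does not reproduce the linkage-disequilibrium effect discussed in the abstract. With strongly correlated neighbouring SNPs, the lasso may pick a different member of a correlated cluster on each subsample, which spreads the selection frequency across the cluster; counting a group as selected when any of its SNPs is selected is one possible way to express the grouping remedy, though the article's exact grouping rule is not given in this record.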