Methodological Issues in Multistage Genome-Wide Association Studies

Cited by: 36
Authors
Thomas, Duncan C. [1]
Casey, Graham [1]
Conti, David V. [1]
Haile, Robert W. [1]
Lewinger, Juan Pablo [1]
Stram, Daniel O. [1]
Affiliation
[1] Univ So Calif, Dept Prevent Med, Los Angeles, CA 90089 USA
Keywords
Multistage sampling; genetic associations; replication; resequencing; DNA pooling; gene-environment interactions; GENE-ENVIRONMENT INDEPENDENCE; 2-STAGE DESIGNS; POOLED DNA; RARE VARIANTS; LINKAGE DISEQUILIBRIUM; DETECTING ASSOCIATIONS; IMPROVING POWER; COMMON DISEASES; COMPLEX DISEASE; FALSE DISCOVERY;
DOI
10.1214/09-STS288
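Chinese Library Classification number
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics]
Discipline classification code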
020208 ; 070103 ; 0714 ;
Abstract
Because of the high cost of commercial genotyping chip technologies, many investigations have used a two-stage design for genome-wide association studies, using part of the sample for an initial discovery of "promising" SNPs at a less stringent significance level and the remainder in a joint analysis of just these SNPs using custom genotyping. Cost savings of about 50% are typically possible with this design while retaining comparable overall type I error and power, by using about half the sample for stage I and carrying about 0.1% of SNPs forward to the second stage; the optimal design depends primarily upon the ratio of costs per genotype for stages I and II. However, with the rapidly declining costs of the commercial panels, the generally low observed ORs of current studies, and many studies aiming to test multiple hypotheses and multiple endpoints, many investigators are abandoning the two-stage design in favor of simply genotyping all available subjects using a standard high-density panel. Concern is sometimes raised about the absence of a "replication" panel in this approach, as required by some high-profile journals, but it must be appreciated that the two-stage design is not a discovery/replication design but simply a more efficient design for discovery using a joint analysis of the data from both stages. Once a subset of highly significant associations has been discovered, a truly independent "exact replication" study of the same promising SNPs is needed in a similar population using similar methods. This can then be followed by (1) "generalizability" studies to assess the full scope of replicated associations across different races, different endpoints, different interactions, etc.; (2) fine-mapping or resequencing to try to identify the causal variant; and (3) experimental studies of the biological function of these genes.
Multistage sampling designs may be more useful at this stage, say, for selecting subsets of subjects for deep resequencing of regions identified in the GWAS.
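The cost argument in the abstract can be illustrated with a rough calculation. The sketch below is not the authors' optimization; it assumes a deliberately simple cost model in which total cost is proportional to the number of genotypes produced, with a hypothetical per-genotype cost ratio (stage II custom genotyping divided by stage I chip genotyping) and no fixed costs. The function name and the example cost ratios are illustrative assumptions.

```python
def two_stage_relative_cost(frac_samples_stage1, frac_snps_forward, cost_ratio):
    """Cost of a two-stage design relative to genotyping every subject on the full panel.

    frac_samples_stage1: fraction of subjects genotyped on the full panel in stage I.
    frac_snps_forward:   fraction of SNPs carried forward to stage II.
    cost_ratio:          assumed per-genotype cost in stage II divided by that in stage I.

    Simplifying assumption: cost scales linearly with the number of genotypes,
    so the one-stage design (all subjects, all SNPs) has relative cost 1.
    """
    if not 0.0 < frac_samples_stage1 <= 1.0:
        raise ValueError("frac_samples_stage1 must be in (0, 1]")
    if not 0.0 <= frac_snps_forward <= 1.0:
        raise ValueError("frac_snps_forward must be in [0, 1]")
    if cost_ratio <= 0.0:
        raise ValueError("cost_ratio must be positive")

    # Stage I: a fraction of the subjects, all SNPs, at the reference cost.
    stage1 = frac_samples_stage1
    # Stage II: the remaining subjects, only the selected SNPs, at the custom cost.
    stage2 = (1.0 - frac_samples_stage1) * frac_snps_forward * cost_ratio
    return stage1 + stage2


# Half the sample in stage I and 0.1% of SNPs carried forward, as in the abstract.
# The cost ratios are hypothetical values chosen only to show the sensitivity.
for ratio in (1, 20, 100):
    rel = two_stage_relative_cost(0.5, 0.001, ratio)
    print(f"cost ratio {ratio:>3}: relative cost {rel:.3f}, saving {1 - rel:.1%}")
```

With half the sample in stage I and 0.1% of SNPs carried forward, the relative cost under this model stays close to 0.5 unless stage II genotyping is far more expensive per genotype, which is consistent with the roughly 50% saving quoted in the abstract. The sketch says nothing about power or type I error, which the paper treats separately.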
Pages: 414-429
Page count: 16