Stratified false discovery control for large-scale hypothesis testing with application to genome-wide association studies

Cited by: 112
Authors
Sun, Lei
Craiu, Radu V.
Paterson, Andrew D.
Bull, Shelley B.
Affiliations
[1] Univ Toronto, Dept Publ Hlth Sci, Toronto, ON M5T 3M7, Canada
[2] Hosp Sick Children, Program Genet & Genom Biol, Toronto, ON M5G 1X8, Canada
[3] Univ Toronto, Dept Stat, Toronto, ON, Canada
[4] Mt Sinai Hosp, Samuel Lunenfeld Res Inst, Toronto, ON M5G 1X5, Canada
Funding
Canadian Institutes of Health Research;
Keywords
multiple comparisons; genome-scans; type I error; type II error; power; false discovery rate (FDR); stratified FDR;
DOI
10.1002/gepi.20164
Chinese Library Classification number
Q3 [Genetics];
Discipline classification code
071007 ; 090102 ;
Abstract
The multiplicity problem has become increasingly important in genetic studies as the capacity for high-throughput genotyping has increased. The control of False Discovery Rate (FDR) (Benjamini and Hochberg. [1995] J. R. Stat. Soc. Ser. B 57:289-300) has been adopted to address the problems of false positive control and low power inherent in high-volume genome-wide linkage and association studies. In many genetic studies, there is often a natural stratification of the m hypotheses to be tested. Given the FDR framework and the presence of such stratification, we investigate the performance of a stratified false discovery control approach (i.e. control or estimate FDR separately for each stratum) and compare it to the aggregated method (i.e. consider all hypotheses in a single stratum). Under the fixed rejection region framework (i.e. reject all hypotheses with unadjusted p-values less than a pre-specified level and then estimate FDR), we demonstrate that the aggregated FDR is a weighted average of the stratum-specific FDRs. Under the fixed FDR framework (i.e. reject as many hypotheses as possible and meanwhile control FDR at a pre-specified level), we specify a condition necessary for the expected total number of true positives under the stratified FDR method to be equal to or greater than that obtained from the aggregated FDR method. Application to a recent Genome-Wide Association (GWA) study by Maraganore et al. ([2005] Am. J. Hum. Genet. 77:685-693) illustrates the potential advantages of control or estimation of FDR by stratum. Our analyses also show that controlling FDR at a low rate, e.g. 5% or 10%, may not be feasible for some GWA studies. Genet. Epidemiol. 30:519-530, 2006. (c) 2006 Wiley-Liss, Inc.
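The contrast between the aggregated and stratified approaches under the fixed FDR framework can be illustrated with a minimal sketch. It assumes the Benjamini-Hochberg step-up procedure as the FDR-controlling rule and uses made-up p-values and stratum labels; the function names `bh_reject` and `stratified_bh` are hypothetical and do not come from the paper.

```python
import numpy as np

def bh_reject(pvals, q):
    """Benjamini-Hochberg step-up: boolean mask of hypotheses rejected at level q."""
    p = np.asarray(pvals, dtype=float)
    m = p.size
    reject = np.zeros(m, dtype=bool)
    if m == 0:
        return reject
    order = np.argsort(p)                      # indices of p-values, ascending
    thresholds = q * np.arange(1, m + 1) / m   # q * i / m for rank i
    below = p[order] <= thresholds
    if below.any():
        k = np.max(np.nonzero(below)[0])       # largest rank meeting its threshold
        reject[order[:k + 1]] = True           # reject everything up to that rank
    return reject

def stratified_bh(pvals, strata, q):
    """Apply BH at level q separately within each stratum (illustrative only)."""
    p = np.asarray(pvals, dtype=float)
    s = np.asarray(strata)
    reject = np.zeros(p.size, dtype=bool)
    for label in np.unique(s):
        idx = np.nonzero(s == label)[0]
        reject[idx] = bh_reject(p[idx], q)
    return reject

# Toy data (invented for illustration): a small stratum "a" enriched for
# small p-values and a larger stratum "b" that looks mostly null.
pvals = [0.001, 0.004, 0.02, 0.3, 0.2, 0.5, 0.6, 0.7, 0.8, 0.9]
strata = ["a", "a", "a", "a", "b", "b", "b", "b", "b", "b"]

aggregated = bh_reject(pvals, 0.05)
stratified = stratified_bh(pvals, strata, 0.05)
print(int(aggregated.sum()), int(stratified.sum()))  # prints "2 3"
```

In this toy case the stratified analysis rejects one more hypothesis than the aggregated one because the enriched stratum gets a less stringent threshold. This is not guaranteed in general: as the abstract notes, a condition must hold for the stratified method to yield at least as many expected true positives.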
Pages: 519-530
Page count: 12
Related papers
22 entries in total
[1]  
Benjamini Y, 2001, ANN STAT, V29, P1165
[2]   Quantitative trait loci analysis using the false discovery rate [J].
Benjamini, Y ;
Yekutieli, D .
GENETICS, 2005, 171 (02) :783-789
[3]   On the adaptive control of the false discovery rate in multiple testing with independent statistics [J].
Benjamini, Y ;
Hochberg, Y .
JOURNAL OF EDUCATIONAL AND BEHAVIORAL STATISTICS, 2000, 25 (01) :60-83
[4]   CONTROLLING THE FALSE DISCOVERY RATE - A PRACTICAL AND POWERFUL APPROACH TO MULTIPLE TESTING [J].
BENJAMINI, Y ;
HOCHBERG, Y .
JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B-STATISTICAL METHODOLOGY, 1995, 57 (01) :289-300
[5]   Genetic variation at the ACE gene is associated with persistent microalbuminuria and severe nephropathy in type 1 diabetes [J].
Boright, AP ;
Paterson, AD ;
Mirea, L ;
Bull, SB ;
Mowjoodi, A ;
Scherer, SW ;
Zinman, B .
DIABETES, 2005, 54 (04) :1238-1244
[6]  
CRAIU RV, 2006, IN PRESS STAT SINICA
[7]   Large-scale simultaneous hypothesis testing: The choice of a null hypothesis [J].
Efron, B .
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2004, 99 (465) :96-104
[8]  
EFRON B, 2005, CORRELATION LARGE SC
[9]   Operating characteristics and extensions of the false discovery rate procedure [J].
Genovese, C ;
Wasserman, L .
JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B-STATISTICAL METHODOLOGY, 2002, 64 :499-517
[10]  
GENOVESE C, 2001, LARGE SAMPLE APPROAC