Entropy-based joint analysis for two-stage genome-wide association studies

Cited by: 10
Authors
Kang, Guolian [1 ]
Zuo, Yijun [1 ]
Affiliations
[1] Michigan State Univ, Dept Stat & Probabil, E Lansing, MI 48824 USA
Keywords
complex diseases; entropy; false discovery rate; genetic variants;
DOI
10.1007/s10038-007-0177-7
Chinese Library Classification number
Q3 [Genetics];
Discipline classification code
071007 ; 090102 ;
Abstract
Genome-wide association studies (GWAS) are now being conducted to identify common genetic variants that predispose to human diseases and thereby unravel the genetic etiology of complex human diseases. Because of genotyping cost constraints, a GWAS often follows a two-stage design, in which a large number of markers are screened in a proportion of the available samples in stage 1, and the markers identified in stage 1 are then examined in all the samples in stage 2. In this paper, we introduce a nonlinear entropy-based statistic for the joint analysis of two-stage genome-wide association studies. Type I error rates and power of the entropy-based statistic for association tests are validated using simulation studies of single-locus tests. The power of entropy-based joint analysis is also investigated by simulation. The results suggest that, when detecting rare genetic variants, entropy-based joint analysis is always more powerful than linear joint analysis, which uses a linear function of risk allele frequencies in cases and controls; when detecting common genetic variants, the powers of the two joint analyses are comparable. Furthermore, when the false discovery rate is controlled, entropy-based joint analysis is more powerful and needs fewer samples than linear joint analysis. We therefore recommend the entropy-based strategy for two-stage genome-wide association studies aimed at detecting rare and common genetic variants with moderate to large genetic effects underlying a complex disease.
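The abstract contrasts a statistic that is linear in the risk allele frequencies with a nonlinear entropy-based one, but this record does not give either formula. The sketch below is therefore only an illustration under stated assumptions, not the authors' statistic: it takes the entropy term to be `-p * log(p)`, estimates its variance with the delta method, and compares it with the usual squared difference in allele frequencies. The function names, the choice of entropy term, and the use of chromosome counts (`n_case`, `n_ctrl`) as sample sizes are all hypothetical.

```python
import math


def linear_stat(p_case, p_ctrl, n_case, n_ctrl):
    """Squared difference in risk allele frequencies over its estimated variance.

    n_case and n_ctrl are assumed to be numbers of chromosomes.
    """
    var = p_case * (1 - p_case) / n_case + p_ctrl * (1 - p_ctrl) / n_ctrl
    if var == 0:
        raise ValueError("variance is zero; the statistic is undefined")
    return (p_case - p_ctrl) ** 2 / var


def entropy_term(p):
    """Assumed nonlinear transform of an allele frequency: -p * log(p)."""
    return -p * math.log(p)


def entropy_stat(p_case, p_ctrl, n_case, n_ctrl):
    """Squared difference in entropy terms over a delta-method variance.

    Illustrative assumption only; the record does not state the paper's formula.
    """
    for p in (p_case, p_ctrl):
        if not 0 < p < 1:
            raise ValueError("allele frequencies must lie strictly between 0 and 1")
    diff = entropy_term(p_case) - entropy_term(p_ctrl)
    # Delta method: d/dp [-p log p] = -(1 + log p), so the variance of the
    # transformed frequency is approximately (1 + log p)^2 * p(1 - p) / n.
    var = ((1 + math.log(p_case)) ** 2 * p_case * (1 - p_case) / n_case
           + (1 + math.log(p_ctrl)) ** 2 * p_ctrl * (1 - p_ctrl) / n_ctrl)
    if var == 0:
        raise ValueError("delta-method variance is zero (both frequencies equal 1/e)")
    return diff ** 2 / var


if __name__ == "__main__":
    # Hypothetical rare-variant scenario: 2% vs 1% risk allele frequency.
    print(linear_stat(0.02, 0.01, 2000, 2000))
    print(entropy_stat(0.02, 0.01, 2000, 2000))
```

Under the null hypothesis of equal frequencies, both quantities would be compared against a chi-square distribution with one degree of freedom. The delta-method approximation breaks down near p = 1/e, where the derivative of the transform vanishes, which is one reason to treat this only as a sketch.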
Pages: 747-756
Page count: 10