Genotyping and inflated type I error rate in genome-wide association case/control studies

Cited by: 4
Authors
Sampson, Joshua N. [1]
Zhao, Hongyu [1]
Affiliations
[1] Yale Univ, Sch Med, Dept Epidemiol & Publ Hlth, New Haven, CT 06510 USA
Source
BMC BIOINFORMATICS | 2009 / Vol. 10
Keywords
DIFFERENTIAL BIAS; CALLING ALGORITHM; LARGE-SCALE; SNP ARRAYS; MICROARRAYS; BEADARRAY; DISEASE;
DOI
10.1186/1471-2105-10-68
Chinese Library Classification
Q5 [Biochemistry];
Discipline classification codes
071010 ; 081704 ;
Abstract
Background: A common goal of a case/control genome-wide association study (GWAS) is to find SNPs associated with a disease. Traditionally, the first step in such studies is to assign a genotype to each SNP in each subject, based on a statistic summarizing fluorescence measurements. When the distributions of the summary statistic are not well separated by genotype, genotype assignment can cause more problems than the literature acknowledges.
Results: Specifically, we show that the proportions of each called genotype need not equal the true proportions in the population, even as the number of subjects grows infinitely large. The called genotypes for two subjects need not be independent, even when their true genotypes are independent. Consequently, p-values from tests of association can be anti-conservative, even when the distributions of the summary statistic for the cases and controls are identical. To reduce the resulting inflation in the type I error rate, we propose two new tests. The first algorithm, logiCALL, measures call quality by fully exploring the likelihood profile of intensity measurements; the second algorithm avoids genotyping altogether by using a likelihood ratio statistic.
Conclusion: Genotyping can introduce avoidable false positives in GWAS.
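The first result in the abstract (called genotype proportions need not converge to the true proportions) can be illustrated with a small numerical sketch. This is not the authors' logiCALL algorithm or either of their tests. It assumes, purely for illustration, a one-dimensional summary statistic that is Gaussian within each genotype, Hardy-Weinberg genotype frequencies, and a caller that assigns the genotype with the highest posterior probability using the true parameters. The function name `limiting_call_proportions`, the cluster means, and the standard deviations are all hypothetical choices.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def limiting_call_proportions(true_props, means, sd, lo=-8.0, hi=8.0, steps=16000):
    """Large-sample proportion of subjects called as each genotype.

    Assumed model (illustrative only): the summary statistic is normal with a
    genotype-specific mean and a common sd, and each subject is called as the
    genotype with the largest posterior probability. The proportions are
    obtained by midpoint integration of the mixture density over each
    genotype's calling region.
    """
    dx = (hi - lo) / steps
    called = [0.0] * len(true_props)
    for i in range(steps):
        x = lo + (i + 0.5) * dx
        # Prior times likelihood for each genotype at this value of the statistic.
        weighted = [p * normal_pdf(x, m, sd) for p, m in zip(true_props, means)]
        best = max(range(len(weighted)), key=weighted.__getitem__)
        # Everyone whose statistic falls here receives the call `best`.
        called[best] += sum(weighted) * dx
    return called


maf = 0.2  # hypothetical minor allele frequency
true_props = [(1 - maf) ** 2, 2 * maf * (1 - maf), maf ** 2]  # AA, AB, BB
means = [-1.0, 0.0, 1.0]  # hypothetical cluster centres

overlapping = limiting_call_proportions(true_props, means, sd=0.6)
separated = limiting_call_proportions(true_props, means, sd=0.1)

print("true       ", [round(p, 3) for p in true_props])
print("overlapping", [round(p, 3) for p in overlapping])
print("separated  ", [round(p, 3) for p in separated])
```

Under these assumptions the well-separated clusters reproduce the true proportions, while the overlapping clusters over-call the common homozygote and under-call the rare one, and no increase in sample size removes that gap because the values computed here are already the large-sample limits.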
Pages: 14