A Simple and Fast Two-Locus Quality Control Test to Detect False Positives Due to Batch Effects in Genome-Wide Association Studies

Cited by: 10
Authors
Lee, Sang Hong [1]
Nyholt, Dale R. [1]
Macgregor, Stuart [1]
Henders, Anjali K. [1]
Zondervan, Krina T. [2]
Montgomery, Grant W. [1]
Visscher, Peter M. [1]
Affiliations
[1] Queensland Inst Med Res, Herston, Qld 4006, Australia
[2] Univ Oxford, John Radcliffe Hosp, Nuffield Dept Obstet & Gynaecol, Oxford OX3 9DU, England
Funding
Wellcome Trust (UK)
Keywords
genome-wide association study; batch effects; genotyping errors; linear model-based quality control; linkage; endometriosis; heritability; families; risk
DOI
10.1002/gepi.20541
Chinese Library Classification
Q3 [Genetics]
Discipline classification codes
071007; 090102
Abstract
Erroneous genotypes that pass standard quality control (QC) can severely affect genome-wide association studies, genotype imputation, estimation of heritability, and prediction of genetic risk based on single nucleotide polymorphisms (SNPs). To detect such genotyping errors, a simple two-locus QC method, based on the difference in the test statistic of association between single SNPs and pairs of SNPs, was developed and applied. In real data, the proposed approach detected many problematic SNPs with statistical significance even when standard single-SNP QC analyses failed to detect them. Depending on the data set, the number of erroneous SNPs that passed standard single-SNP QC but were detected by the proposed approach ranged from a few hundred to thousands. In simulated data, the proposed method was powerful and performed better than the other existing methods tested. Its power to detect erroneous genotypes was approximately 80% for a 3% error rate per SNP. This novel QC approach is easy to implement and computationally efficient, and can lead to better-quality genotypes for subsequent genotype-phenotype investigations. Genet. Epidemiol. 34:854-862, 2010. © 2010 Wiley-Liss, Inc.
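The abstract does not give the exact form of the test statistic, so the following is only a minimal sketch of the general idea, under stated assumptions: a linear model is fitted with one SNP and then with a pair of SNPs, and the difference between the two likelihood-ratio statistics is referred to a chi-square distribution with 1 degree of freedom. The function names (`rss`, `two_locus_qc`), the use of a 0/1 case-control or batch label as the response, and the toy error simulation are all illustrative assumptions, not the authors' implementation.

```python
import math
import numpy as np


def rss(y, X):
    """Residual sum of squares from an ordinary least-squares fit of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)


def two_locus_qc(y, snp_a, snp_b):
    """Compare a single-SNP and a two-SNP linear model (illustrative only).

    y      : 0/1 labels (e.g. control/case or batch), as floats
    snp_a  : genotype counts (0, 1, 2) at the first SNP
    snp_b  : genotype counts (0, 1, 2) at a nearby second SNP
    Returns (single-SNP statistic, pair statistic, difference, p-value),
    where the p-value assumes the difference is chi-square with 1 df.
    """
    n = len(y)
    ones = np.ones(n)
    rss_null = rss(y, np.column_stack([ones]))
    rss_single = rss(y, np.column_stack([ones, snp_a]))
    rss_pair = rss(y, np.column_stack([ones, snp_a, snp_b]))
    # Likelihood-ratio statistics for Gaussian linear models.
    stat_single = n * math.log(rss_null / rss_single)
    stat_pair = n * math.log(rss_null / rss_pair)
    diff = stat_pair - stat_single
    # Survival function of a chi-square variable with 1 df: erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(max(diff, 0.0) / 2.0))
    return stat_single, stat_pair, diff, p_value


if __name__ == "__main__":
    # Toy simulation (an assumption, not the paper's design): two SNPs in
    # perfect LD, with 3% of genotypes miscalled at the second SNP in cases only.
    rng = np.random.default_rng(0)
    n = 2000
    y = np.repeat([0.0, 1.0], n // 2)
    snp_a = rng.binomial(2, 0.3, size=n).astype(float)
    snp_b = snp_a.copy()
    err = (y == 1.0) & (rng.random(n) < 0.03)
    snp_b[err] = (snp_b[err] + 1.0) % 3.0
    print(two_locus_qc(y, snp_a, snp_b))
```

The point of the sketch is that neither SNP needs to show a strong single-locus signal: a batch-specific error breaks the expected agreement between neighbouring SNPs, and that shows up as a large gain when the second SNP is added to the model.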
Pages: 854-862
Page count: 9