Practical Issues in Imputation-Based Association Mapping

Cited by: 125
Authors
Guan, Yongtao [1, 2]
Stephens, Matthew [1, 2]
Affiliations
[1] Univ Chicago, Dept Human Genet, Chicago, IL 60637 USA
[2] Univ Chicago, Dept Stat, Chicago, IL 60637 USA
Source
PLOS GENETICS | 2008 / Volume 4 / Issue 12
Keywords
DOI
10.1371/journal.pgen.1000279
Chinese Library Classification number
Q3 [Genetics];
Discipline classification code
071007 ; 090102 ;
Abstract
Imputation-based association methods provide a powerful framework for testing untyped variants for association with phenotypes and for combining results from multiple studies that use different genotyping platforms. Here, we consider several issues that arise when applying these methods in practice, including: (i) factors affecting imputation accuracy, including choice of reference panel; (ii) the effects of imputation accuracy on power to detect associations; (iii) the relative merits of Bayesian and frequentist approaches to testing imputed genotypes for association with phenotype; and (iv) how to quickly and accurately compute Bayes factors for testing imputed SNPs. We find that imputation-based methods can be robust to imputation accuracy and can improve power to detect associations, even when average imputation accuracy is poor. We explain how ranking SNPs for association by a standard likelihood ratio test gives the same results as a Bayesian procedure that uses an unnatural prior assumption (specifically, that difficult-to-impute SNPs tend to have larger effects), and we assess the power gained from using a Bayesian approach that does not make this assumption. Within the Bayesian framework, we find that good approximations to a full analysis can be achieved by simply replacing unknown genotypes with a point estimate: their posterior mean. This approximation considerably reduces computational expense compared with published sampling-based approaches, and the methods we present are practical on a genome-wide scale with very modest computational resources (e.g., a single desktop computer). The approximation also facilitates combining information across studies, using only summary data for each SNP. Methods discussed here are implemented in the software package BIMBAM, which is available from http://stephenslab.uchicago.edu/software.html.
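The point estimate mentioned in the abstract, the posterior mean genotype, can be illustrated with a minimal sketch. This is not BIMBAM's code or interface. The function name `posterior_mean_genotype` and the input layout (one row per individual holding the imputed probabilities of carrying 0, 1, or 2 copies of an allele) are assumptions made only for illustration.

```python
import numpy as np

def posterior_mean_genotype(probs):
    """Return the expected allele count for each individual.

    `probs` is a hypothetical (n_individuals, 3) array whose columns hold
    the imputed posterior probabilities P(g=0), P(g=1), P(g=2).
    """
    probs = np.asarray(probs, dtype=float)
    if probs.ndim != 2 or probs.shape[1] != 3:
        raise ValueError("probs must have shape (n_individuals, 3)")
    if not np.allclose(probs.sum(axis=1), 1.0):
        raise ValueError("each row of probs must sum to 1")
    # E[g] = 0 * P(g=0) + 1 * P(g=1) + 2 * P(g=2)
    return probs @ np.array([0.0, 1.0, 2.0])

# Made-up probabilities for three individuals at one untyped SNP.
example = [[1.0, 0.0, 0.0],
           [0.1, 0.8, 0.1],
           [0.0, 0.25, 0.75]]
print(posterior_mean_genotype(example))
```

The resulting values lie between 0 and 2 and can stand in for the unknown genotype in a downstream association test, which is the kind of substitution the abstract describes.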
Pages: 11