Genotype-Imputation Accuracy across Worldwide Human Populations

Times cited: 181
Authors
Huang, Lucy [1, 2]
Li, Yun [1]
Singleton, Andrew B. [3]
Hardy, John A. [4, 5]
Abecasis, Goncalo [1]
Rosenberg, Noah A. [1, 2, 6]
Scheet, Paul [1, 7]
Affiliations
[1] Univ Michigan, Dept Biostat, Ann Arbor, MI 48109 USA
[2] Univ Michigan, Ctr Computat Med & Biol, Ann Arbor, MI 48109 USA
[3] NIA, Neurogenet Lab, NIH, Bethesda, MD 20892 USA
[4] UCL, Dept Mol Neurosci, London WC1N 3BG, England
[5] UCL, Reta Lila Weston Inst Neurol Studies, Inst Neurol, London WC1N 3BG, England
[6] Univ Michigan, Dept Human Genet, Ann Arbor, MI 48109 USA
[7] Univ Texas MD Anderson Canc Ctr, Dept Epidemiol, Houston, TX 77030 USA
Funding
US National Institutes of Health (NIH)
Keywords
GENOME-WIDE ASSOCIATION; LINKAGE DISEQUILIBRIUM; MISSING GENOTYPES; HAPLOTYPE; INFERENCE; LOCI; TRANSFERABILITY; PATTERNS; RISK
DOI
10.1016/j.ajhg.2009.01.013
Chinese Library Classification (CLC) number
Q3 [Genetics]
Discipline classification codes
071007; 090102
Abstract
A current approach to mapping complex-disease-susceptibility loci in genome-wide association (GWA) studies involves leveraging the information in a reference database of dense genotype data. By modeling the patterns of linkage disequilibrium in a reference panel, genotypes not directly measured in the study samples can be imputed and tested for disease association. This imputation strategy has been successful for GWA studies in populations well represented by existing reference panels. We used genotypes at 513,008 autosomal single-nucleotide polymorphism (SNP) loci in 443 unrelated individuals from 29 worldwide populations to evaluate the "portability" of the HapMap reference panels for imputation in studies of diverse populations. When a single HapMap panel was leveraged for imputation of randomly masked genotypes, European populations had the highest imputation accuracy, followed by populations from East Asia, Central and South Asia, the Americas, Oceania, the Middle East, and Africa. For each population, we identified "optimal" mixtures of reference panels that maximized imputation accuracy, and we found that in most populations, mixtures including individuals from at least two HapMap panels produced the highest imputation accuracy. From a separate survey of additional SNPs typed in the same samples, we evaluated imputation accuracy in the scenario in which all genotypes at a given SNP position were unobserved and were imputed on the basis of data from a commercial "SNP chip," again finding that most populations benefited from the use of combinations of two or more HapMap reference panels. Our results can serve as a guide for selecting appropriate reference panels for imputation-based GWA analysis in diverse populations.
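The masked-genotype evaluation described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: genotypes are coded as allele counts (0, 1, 2), accuracy is taken to be the fraction of masked alleles recovered, and `impute_by_panel_mode` is a hypothetical placeholder that fills gaps with the reference panel's most common genotype. It is not the linkage-disequilibrium-based imputation the study relies on, and the function names are invented.

```python
import random


def mask_genotypes(genotypes, fraction, rng):
    """Hide a random fraction of entries (set to None); return the copy and the hidden positions."""
    cells = [(i, j) for i, row in enumerate(genotypes) for j in range(len(row))]
    masked = rng.sample(cells, int(round(fraction * len(cells))))
    copy = [list(row) for row in genotypes]
    for i, j in masked:
        copy[i][j] = None
    return copy, masked


def impute_by_panel_mode(study, panel):
    """Placeholder imputer (assumption): fill each gap with the panel's most common genotype at that SNP."""
    modes = []
    for j in range(len(panel[0])):
        column = [row[j] for row in panel]
        modes.append(max(set(column), key=column.count))
    return [[modes[j] if g is None else g for j, g in enumerate(row)] for row in study]


def allele_accuracy(truth, imputed, masked):
    """Fraction of masked alleles recovered, with genotypes coded as allele counts 0, 1, or 2."""
    if not masked:
        raise ValueError("no masked positions to score")
    wrong = sum(abs(truth[i][j] - imputed[i][j]) for i, j in masked)
    return 1.0 - wrong / (2 * len(masked))


# Toy data: rows are individuals, columns are SNPs (values invented for the example).
rng = random.Random(0)
truth = [[0, 1, 2, 0], [0, 1, 2, 1], [0, 2, 2, 0]]
panel = [[0, 1, 2, 0], [0, 1, 2, 0], [1, 1, 2, 0]]

masked_study, masked = mask_genotypes(truth, 0.25, rng)
imputed = impute_by_panel_mode(masked_study, panel)
accuracy = allele_accuracy(truth, imputed, masked)
print(f"allele-level accuracy on masked entries: {accuracy:.3f}")
```

In this toy setting, a mixture of reference panels could be mimicked by concatenating rows from two panels before calling the imputer and then comparing the resulting accuracies; the weighting used in the study itself is not specified in this record.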
Pages: 235-250
Page count: 16