MaCH: Using Sequence and Genotype Data to Estimate Haplotypes and Unobserved Genotypes

Cited by: 1484
Authors
Li, Yun [2]
Willer, Cristen J. [1]
Ding, Jun [1]
Scheet, Paul [3]
Abecasis, Goncalo R. [1]
Affiliations
[1] Univ Michigan, Sch Publ Hlth, Dept Biostat, Ctr Stat Genet, Ann Arbor, MI 48109 USA
[2] Univ N Carolina, Dept Biostat, Dept Genet, Chapel Hill, NC USA
[3] Univ Texas MD Anderson Canc Ctr, Dept Epidemiol, Houston, TX 77030 USA
Keywords
imputation; haplotyping; sequencing; WHOLE-GENOME ASSOCIATION; MULTIPOINT LINKAGE ANALYSIS; WIDE ASSOCIATION; DISEQUILIBRIUM; LOCI; IMPUTATION; ALGORITHM; INFERENCE; FASTER; MODEL
DOI
10.1002/gepi.20533
Chinese Library Classification (CLC) number
Q3 [Genetics]
Subject classification codes
071007; 090102
Abstract
Genome-wide association studies (GWAS) can identify common alleles that contribute to complex disease susceptibility. Despite the large number of SNPs assessed in each study, the effects of most common SNPs must be evaluated indirectly using either genotyped markers or haplotypes thereof as proxies. We have previously implemented a computationally efficient Markov Chain framework for genotype imputation and haplotyping in the freely available MaCH software package. The approach describes sampled chromosomes as mosaics of each other and uses available genotype and shotgun sequence data to estimate unobserved genotypes and haplotypes, together with useful measures of the quality of these estimates. Our approach is already widely used to facilitate comparison of results across studies as well as meta-analyses of GWAS. Here, we use simulations and experimental genotypes to evaluate its accuracy and utility, considering choices of genotyping panels, reference panel configurations, and designs where genotyping is replaced with shotgun sequencing. Importantly, we show that genotype imputation not only facilitates cross-study analyses but also increases power of genetic association studies. We show that genotype imputation of common variants using HapMap haplotypes as a reference is very accurate using either genome-wide SNP data or smaller amounts of data typical in fine-mapping studies. Furthermore, we show the approach is applicable in a variety of populations. Finally, we illustrate how association analyses of unobserved variants will benefit from ongoing advances such as larger HapMap reference panels and whole genome shotgun sequencing technologies. Genet. Epidemiol. 34:816-834, 2010. © 2010 Wiley-Liss, Inc.
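The "mosaic" idea in the abstract can be illustrated with a toy hidden Markov model. The sketch below is not the MaCH implementation: it assumes a single already-phased target haplotype, biallelic markers coded 0/1, and fixed, hand-picked `switch` and `err` parameters, none of which come from the record above. MaCH itself works from unphased genotypes and shotgun sequence data, which this sketch does not attempt. The function name `impute_haploid` and all parameter names are hypothetical.

```python
def impute_haploid(ref, target, switch=0.01, err=0.01):
    """Toy mosaic-HMM imputation for one haplotype.

    ref    : list of reference haplotypes, each a list of 0/1 alleles
    target : list of 0/1 alleles, with None at unobserved markers
    switch : assumed probability of re-choosing the copied haplotype
             between adjacent markers (illustrative value)
    err    : assumed probability that an observed allele differs from
             the copied reference allele (illustrative value)
    Returns the posterior probability of allele 1 at every marker.
    """
    n, m = len(ref), len(target)

    def emit(j, k):
        # Missing observations carry no information about the hidden state.
        if target[j] is None:
            return 1.0
        return 1.0 - err if ref[k][j] == target[j] else err

    def normalize(row):
        total = sum(row)
        return [x / total for x in row]

    # Forward pass: probability of the data up to marker j, copying from k.
    fwd = [normalize([emit(0, k) / n for k in range(n)])]
    for j in range(1, m):
        prev = fwd[-1]
        total = sum(prev)
        row = [((1.0 - switch) * prev[k] + switch * total / n) * emit(j, k)
               for k in range(n)]
        fwd.append(normalize(row))

    # Backward pass: probability of the data after marker j, given state k.
    bwd = [[1.0] * n for _ in range(m)]
    for j in range(m - 2, -1, -1):
        nxt = [bwd[j + 1][k] * emit(j + 1, k) for k in range(n)]
        jump = switch * sum(nxt) / n
        bwd[j] = normalize([(1.0 - switch) * nxt[k] + jump for k in range(n)])

    # Combine both passes into a posterior over copied haplotypes, then
    # sum the weight of the reference haplotypes that carry allele 1.
    dosages = []
    for j in range(m):
        post = normalize([fwd[j][k] * bwd[j][k] for k in range(n)])
        dosages.append(sum(p for p, hap in zip(post, ref) if hap[j] == 1))
    return dosages


if __name__ == "__main__":
    reference = [[0, 0, 0, 0, 0], [1, 1, 1, 1, 1]]
    print(impute_haploid(reference, [1, 1, None, 1, 1]))
```

In this example the target matches the second reference haplotype at every observed marker, so the posterior probability of allele 1 at the unobserved middle marker is close to 1. How far that posterior sits from 0 or 1 is one simple indication of how certain an imputed call is.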
Pages: 816-834
Page count: 19