Accounting for haplotype uncertainty in matched association studies: A comparison of simple and flexible techniques

Cited by: 116
Authors
Kraft, P
Cox, DG
Paynter, RA
Hunter, D
De Vivo, I
Affiliations
[1] Harvard Univ, Sch Publ Hlth, Dept Epidemiol, Boston, MA 02115 USA
[2] Harvard Univ, Sch Publ Hlth, Dept Biostat, Boston, MA 02115 USA
[3] Harvard Univ, Sch Publ Hlth, Dept Nutr, Boston, MA 02115 USA
[4] Brigham & Womens Hosp, Channing Lab, Boston, MA 02115 USA
[5] Harvard Univ, Sch Med, Boston, MA 02115 USA
Keywords
haplotypes; population-based matched case-control data; gene-environment interaction
DOI
10.1002/gepi.20061
Chinese Library Classification (CLC) number
Q3 [Genetics]
Discipline classification codes
071007; 090102
Abstract
Population-based case-control studies measuring associations between haplotypes of single nucleotide polymorphisms (SNPs) are increasingly popular, in part because haplotypes of a few "tagging" SNPs may serve as surrogates for variation in relatively large sections of the genome. Due to current technological limitations, haplotypes in cases and controls must be inferred from unphased genotypic data. Using individual-specific inferred haplotypes as covariates in standard epidemiologic analyses (e.g., conditional logistic regression) is an attractive analysis strategy, as it allows adjustment for nongenetic covariates, provides omnibus and haplotype-specific tests of association, and can estimate haplotype and haplotype × environment interaction effects. In principle, some adjustment for the uncertainty in inferred haplotypes should be made. Via simulation, we compare the performance (bias and mean squared error of haplotype and haplotype × environment interaction effect estimates) of several analytic strategies using inferred haplotypes in the context of matched case-control data. These strategies include using only the most likely haplotype assignment, the expectation substitution approach described by Stram et al. ([2003b] Hum. Hered. 55:179-190) and others, and an improper version of multiple imputation. For relatively uncomplicated haplotype structures and moderate haplotype relative risks (≤ 2), all methods performed comparably well (small bias with appropriately sized confidence intervals). For larger relative risks, the most likely haplotype and multiple imputation strategies showed noticeable bias towards the null; the expectation substitution strategy still performed well. When there was more uncertainty in the inferred haplotypes, the most likely and multiple imputation strategies showed even more bias towards the null, while the expectation substitution method had slightly smaller than nominal confidence intervals for larger relative risks (≥ 5). An application to progesterone-receptor haplotypes and endometrial cancer further illustrates that the performance of all these methods depends on how well the observed haplotypes "tag" the unobserved causal variant. © 2005 Wiley-Liss, Inc.
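The three strategies named in the abstract differ in how they turn an individual's uncertain haplotype pair into a regression covariate. The sketch below is illustrative only and is not taken from the paper: the haplotype labels, the posterior probabilities, and all function names are hypothetical, and it assumes the posterior over haplotype pairs compatible with the unphased genotype has already been estimated (for example by an EM algorithm).

```python
import random


def dosage(diplotype, haplotype):
    """Count copies (0, 1, or 2) of `haplotype` in a pair of haplotypes."""
    return sum(h == haplotype for h in diplotype)


def most_likely_dosage(posterior, haplotype):
    """Most likely assignment: use only the most probable haplotype pair."""
    best = max(posterior, key=posterior.get)
    return dosage(best, haplotype)


def expected_dosage(posterior, haplotype):
    """Expectation substitution: posterior-weighted mean copy number."""
    return sum(p * dosage(d, haplotype) for d, p in posterior.items())


def imputed_dosage(posterior, haplotype, rng):
    """One imputation: draw a haplotype pair from the fixed posterior."""
    diplotypes = list(posterior)
    weights = [posterior[d] for d in diplotypes]
    drawn = rng.choices(diplotypes, weights=weights, k=1)[0]
    return dosage(drawn, haplotype)


# Hypothetical double heterozygote at two SNPs: phase is ambiguous,
# so two haplotype pairs are compatible with the observed genotype.
posterior = {("AB", "ab"): 0.7, ("Ab", "aB"): 0.3}

rng = random.Random(0)
ml = most_likely_dosage(posterior, "AB")       # integer covariate
es = expected_dosage(posterior, "AB")          # fractional covariate
mi = [imputed_dosage(posterior, "AB", rng) for _ in range(5)]  # one value per imputed data set
```

In an analysis, the resulting covariate (or one covariate per imputed data set, with estimates combined afterwards) would be entered into the conditional logistic regression; that step is omitted here.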
Pages: 261-272
Number of pages: 12
References
33 entries in total
[1]   Haplotypes vs single marker linkage disequilibrium tests: what do we gain? (Reprinted European Journal of Human Genetics, Vol 4, pg 291-300, 2001) [J].
Akey, Joshua ;
Jin, Li ;
Xiong, Momiao .
EUROPEAN JOURNAL OF HUMAN GENETICS, 2017, 25 :S51-S58
[2]   Carroll R., 1998, ENCY BIOSTATISTICS, V3, P2491
[3]   Detecting disease associations due to linkage disequilibrium using haplotype tags: A class of tests and the determinants of statistical power [J].
Chapman, JM ;
Cooper, JD ;
Todd, JA ;
Clayton, DG .
HUMAN HEREDITY, 2003, 56 (1-3) :18-31
[4]   Genome screens using linkage disequilibrium tests: Optimal marker characteristics and feasibility [J].
Chapman, NH ;
Wijsman, EM .
AMERICAN JOURNAL OF HUMAN GENETICS, 1998, 63 (06) :1872-1885
[5]   Fine genetic mapping using haplotype analysis and the missing data problem [J].
Chiano, MN ;
Clayton, DG .
ANNALS OF HUMAN GENETICS, 1998, 62 :55-60
[6]   High-resolution haplotype structure in the human genome [J].
Daly, MJ ;
Rioux, JD ;
Schaffner, SE ;
Hudson, TJ ;
Lander, ES .
NATURE GENETICS, 2001, 29 (02) :229-232
[7]   A functional polymorphism in the promoter of the progesterone receptor gene associated with endometrial cancer risk [J].
De Vivo, I ;
Huggins, GS ;
Hankinson, SE ;
Lescault, PJ ;
Boezen, M ;
Colditz, GA ;
Hunter, DJ .
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2002, 99 (19) :12263-12268
[8]   Inference on haplotype effects in case-control studies using unphased genotype data [J].
Epstein, MP ;
Satten, GA .
AMERICAN JOURNAL OF HUMAN GENETICS, 2003, 73 (06) :1316-1329
[9]   Excoffier L., 1995, MOL BIOL EVOL, V12, P921
[10]   Accuracy of haplotype frequency estimation for biallelic loci, via the expectation-maximization algorithm for unphased diploid genotype data [J].
Fallin, D ;
Schork, NJ .
AMERICAN JOURNAL OF HUMAN GENETICS, 2000, 67 (04) :947-959