Likelihood-based inference on haplotype effects in genetic association studies

Cited by: 103
Authors
Lin, DY [1]
Zeng, D [1]
Affiliations
[1] Univ N Carolina, Dept Biostat, Chapel Hill, NC 27599 USA
Funding
U.S. National Institutes of Health;
Keywords
case-control study; gene-environment interaction; Hardy-Weinberg equilibrium; missing data; single nucleotide polymorphism; unphased genotype;
DOI
10.1198/016214505000000808
Chinese Library Classification
O21 [Probability theory and mathematical statistics]; C8 [Statistics];
Discipline classification codes
020208 ; 070103 ; 0714 ;
Abstract
A haplotype is a specific sequence of nucleotides on a single chromosome. The population associations between haplotypes and disease phenotypes provide critical information about the genetic basis of complex human diseases. Standard genotyping techniques cannot distinguish the two homologous chromosomes of an individual, so only the unphased genotype (i.e., the combination of the two homologous haplotypes) is directly observable. Statistical inference about haplotype-phenotype associations based on unphased genotype data presents an intriguing missing-data problem, especially when the sampling depends on the disease status. The objective of this article is to provide a systematic and rigorous treatment of this problem. All commonly used study designs, including cross-sectional, case-control, and cohort studies, are considered. The phenotype can be a disease indicator, a quantitative trait, or a potentially censored time-to-disease variable. The effects of haplotypes on the phenotype are formulated through flexible regression models, which can accommodate various genetic mechanisms and gene-environment interactions. Appropriate likelihoods are constructed that may involve high-dimensional parameters. The identifiability of the parameters and the consistency, asymptotic normality, and efficiency of the maximum likelihood estimators are established. Efficient and reliable numerical algorithms are developed. Simulation studies show that the likelihood-based procedures perform well in practical settings. An application to the Finland-United States Investigation of NIDDM Genetics Study is provided. Areas in need of further development are discussed.
Pages: 89-104
Number of pages: 16