Correlation-based inference for linkage disequilibrium with multiple alleles

Cited by: 54
Authors
Zaykin, Dmitri V. [1 ]
Pudovkin, Alexander [2 ]
Weir, Bruce S. [3 ]
Affiliations
[1] Natl Inst Environm Hlth Sci, NIH, Res Triangle Pk, NC 27709 USA
[2] Russian Acad Sci, Inst Marine Biol, Vladivostok 690041, Russia
[3] Univ Washington, Dept Biostat, Seattle, WA 98195 USA
Funding
National Institutes of Health (USA);
DOI
10.1534/genetics.108.089409
Chinese Library Classification number
Q3 [Genetics];
Discipline classification code
071007 ; 090102 ;
Abstract
The correlation between alleles at a pair of genetic loci is a measure of linkage disequilibrium. The square of the sample correlation multiplied by the sample size provides the usual test statistic for the hypothesis of no disequilibrium for loci with two alleles, and this relation has proved useful for study design and marker selection. Nevertheless, this relation holds only in the diallelic case, and an extension to multiple alleles has not been made. Here we introduce a similar statistic, $R^2$, which leads to a correlation-based test for loci with multiple alleles: for a pair of loci with k and m alleles, and a sample of n individuals, the approximate distribution of $n(k-1)(m-1)/(km)\,R^2$ under independence between loci is $\chi^2_{(k-1)(m-1)}$. One advantage of this statistic is that it can be interpreted as the total correlation between a pair of loci. When the phase of two-locus genotypes is known, the approach is equivalent to a test for the overall correlation between rows and columns in a contingency table. In the phase-known case, $R^2$ is a strong competitor to approaches such as Pearson's chi-square, Fisher's exact test, and a test based on Cressie and Read's power divergence statistic. We combine this approach with our previous composite-disequilibrium measures to address the case when the genotypic phase is unknown. Calculation of the new multiallele test statistic and its P-value is very simple and utilizes the approximate distribution of $R^2$. We provide a computer program that evaluates approximate as well as "exact" permutational P-values.
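The scaling described in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' program: the value of R^2 is taken as an already computed input (the abstract does not give its formula), and the function names `chi2_sf` and `multiallele_ld_test` are hypothetical. The chi-square tail probability is evaluated with the standard library only, using the usual recurrence over integer degrees of freedom.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    # Closed-form starting points: df = 2 (even case) or df = 1 (odd case).
    if df % 2 == 0:
        q, d = math.exp(-half), 2
    else:
        q, d = math.erfc(math.sqrt(half)), 1
    # Recurrence: Q(x; d + 2) = Q(x; d) + (x/2)^(d/2) * exp(-x/2) / Gamma(d/2 + 1)
    while d < df:
        q += math.exp((d / 2.0) * math.log(half) - half - math.lgamma(d / 2.0 + 1.0))
        d += 2
    return q


def multiallele_ld_test(r2, n, k, m):
    """Scale R^2 as in the abstract and return (statistic, df, approximate P-value).

    r2 : value of the R^2 statistic (assumed to be computed elsewhere)
    n  : sample size
    k, m : numbers of alleles at the two loci
    """
    statistic = n * (k - 1) * (m - 1) / (k * m) * r2
    df = (k - 1) * (m - 1)
    return statistic, df, chi2_sf(statistic, df)


if __name__ == "__main__":
    # Made-up inputs for illustration: 3 and 4 alleles, 200 individuals, R^2 = 0.3.
    stat, df, p = multiallele_ld_test(0.3, n=200, k=3, m=4)
    print(f"statistic = {stat:.3f}, df = {df}, approximate P = {p:.3g}")
```

The inputs in the usage lines are invented for illustration and are not data from the paper. The permutational P-values mentioned in the abstract are not covered by this sketch.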
Pages: 533-545
Page count: 13
Related papers
40 entries in total
[1]   Monte Carlo evaluation of resampling-based hypothesis tests [J].
Boos, DD ;
Zhang, J .
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2000, 95 (450) :486-492
[2]   SOME THEOREMS ON QUADRATIC FORMS APPLIED IN THE STUDY OF ANALYSIS OF VARIANCE PROBLEMS .1. EFFECT OF INEQUALITY OF VARIANCE IN THE ONE-WAY CLASSIFICATION [J].
BOX, GEP .
ANNALS OF MATHEMATICAL STATISTICS, 1954, 25 (02) :290-302
[3]  
CRESSIE N, 1984, J ROY STAT SOC B MET, V46, P440
[4]  
Evett I. W., 1998, INTERPRETING DNA EVI
[5]  
EXCOFFIER L, 1995, MOL BIOL EVOL, V12, P921
[6]  
FIENBERG SE, 1979, J ROY STAT SOC B MET, V41, P54
[7]   A second generation human haplotype map of over 3.1 million SNPs [J].
Frazer, Kelly A. ;
Ballinger, Dennis G. ;
Cox, David R. ;
Hinds, David A. ;
Stuve, Laura L. ;
Gibbs, Richard A. ;
Belmont, John W. ;
Boudreau, Andrew ;
Hardenbol, Paul ;
Leal, Suzanne M. ;
Pasternak, Shiran ;
Wheeler, David A. ;
Willis, Thomas D. ;
Yu, Fuli ;
Yang, Huanming ;
Zeng, Changqing ;
Gao, Yang ;
Hu, Haoran ;
Hu, Weitao ;
Li, Chaohua ;
Lin, Wei ;
Liu, Siqi ;
Pan, Hao ;
Tang, Xiaoli ;
Wang, Jian ;
Wang, Wei ;
Yu, Jun ;
Zhang, Bo ;
Zhang, Qingrun ;
Zhao, Hongbin ;
Zhao, Hui ;
Zhou, Jun ;
Gabriel, Stacey B. ;
Barry, Rachel ;
Blumenstiel, Brendan ;
Camargo, Amy ;
Defelice, Matthew ;
Faggart, Maura ;
Goyette, Mary ;
Gupta, Supriya ;
Moore, Jamie ;
Nguyen, Huy ;
Onofrio, Robert C. ;
Parkin, Melissa ;
Roy, Jessica ;
Stahl, Erich ;
Winchester, Ellen ;
Ziaugra, Liuda ;
Altshuler, David ;
Shen, Yan .
NATURE, 2007, 449 (7164) :851-U3
[8]  
HEDRICK PW, 1987, GENETICS, V117, P331
[9]   ESTIMATION OF LINKAGE DISEQUILIBRIUM IN RANDOMLY MATING POPULATIONS [J].
HILL, WG .
HEREDITY, 1974, 33 (OCT) :229-239
[10]   TESTS FOR ASSOCIATION OF GENE-FREQUENCIES AT SEVERAL LOCI IN RANDOM MATING DIPLOID POPULATIONS [J].
HILL, WG .
BIOMETRICS, 1975, 31 (04) :881-888