A simple and improved correction for population stratification in case-control studies

Cited by: 117
Authors
Epstein, Michael P.
Allen, Andrew S.
Satten, Glen A.
Affiliations
[1] Emory Univ, Sch Med, Dept Human Genet, Atlanta, GA 30322 USA
[2] Ctr Dis Control & Prevent, Atlanta, GA USA
[3] Duke Univ, Dept Biostat & Bioinformat, Durham, NC 27706 USA
[4] Duke Univ, Duke Clin Res Inst, Durham, NC 27706 USA
Keywords
DOI
10.1086/516842
Chinese Library Classification number
Q3 [Genetics]
Discipline classification codes
071007; 090102
Abstract
Population stratification remains an important issue in case-control studies of disease-marker association, even within populations considered to be genetically homogeneous. Campbell et al. (Nature Genetics 2005; 37: 868-872) illustrated this by showing that stratification induced a spurious association between the lactase gene (LCT) and tall/short status in a European American sample. Furthermore, existing approaches for controlling stratification by use of substructure-informative loci (e.g., genomic control, structured association, and principal components) could not resolve this confounding. To address this problem, we propose a simple two-step procedure. In the first step, we model the odds of disease, given data on substructure-informative loci (excluding the test locus). For each participant, we use this model to calculate a stratification score, which is that participant's estimated odds of disease calculated using his or her substructure-informative-loci data in the disease-odds model. In the second step, we assign subjects to strata defined by stratification score and then test for association between the disease and the test locus within these strata. The resulting association test is valid even in the presence of population stratification. Our approach is computationally simple and less model dependent than are existing approaches for controlling stratification. To illustrate these properties, we apply our approach to the data from Campbell et al. and find no association between the LCT locus and tall/short status. Using simulated data, we show that our approach yields a more appropriate correction for stratification than does principal components or genomic control.
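The two-step procedure described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the authors' implementation: the abstract does not say how the disease-odds model is fitted, how many strata are used, or which within-stratum test is applied. The sketch assumes a plain logistic regression fitted by Newton-Raphson, five quantile-based strata, and a stratified trend test (1 df) on genotype counts coded 0/1/2. The function names `fit_logistic`, `stratification_scores`, `assign_strata`, and `stratified_trend_test` are hypothetical.

```python
import numpy as np


def fit_logistic(X, y, n_iter=25, ridge=1e-6):
    """Fit a logistic regression by Newton-Raphson (assumed model form).

    Returns the coefficient vector with the intercept first. A tiny ridge
    term keeps the Hessian invertible; it is a numerical convenience only.
    """
    Xd = np.column_stack([np.ones(len(y)), X])
    beta = np.zeros(Xd.shape[1])
    for _ in range(n_iter):
        eta = np.clip(Xd @ beta, -30, 30)
        p = 1.0 / (1.0 + np.exp(-eta))
        w = p * (1.0 - p)
        grad = Xd.T @ (y - p) - ridge * beta
        hess = (Xd * w[:, None]).T @ Xd + ridge * np.eye(Xd.shape[1])
        beta = beta + np.linalg.solve(hess, grad)
    return beta


def stratification_scores(X, y):
    """Step 1: score each subject from substructure-informative loci.

    X holds genotypes at the informative loci (test locus excluded).
    The log-odds are returned; they order subjects exactly as the odds do,
    so quantile-based strata are the same either way.
    """
    beta = fit_logistic(X, y)
    return beta[0] + X @ beta[1:]


def assign_strata(scores, n_strata=5):
    """Step 2a: assign subjects to strata by score quantiles (assumed rule)."""
    cuts = np.quantile(scores, np.linspace(0, 1, n_strata + 1)[1:-1])
    return np.searchsorted(cuts, scores, side="right")


def stratified_trend_test(g, y, strata):
    """Step 2b: chi-square statistic (1 df) for a stratified trend test.

    Within each stratum, case labels are treated as exchangeable, which
    gives the permutation variance used below. Strata with fewer than two
    subjects, or with only cases or only controls, carry no information
    and are skipped.
    """
    u, v = 0.0, 0.0
    for k in np.unique(strata):
        mask = strata == k
        gk, yk = g[mask], y[mask]
        n, n1 = len(yk), yk.sum()
        if n < 2 or n1 == 0 or n1 == n:
            continue
        u += np.sum((yk - yk.mean()) * gk)
        v += n1 * (n - n1) / (n * (n - 1)) * np.sum((gk - gk.mean()) ** 2)
    return u * u / v if v > 0 else 0.0
```

Under these assumptions, a call sequence would be `scores = stratification_scores(X, y)`, then `strata = assign_strata(scores)`, then `stratified_trend_test(g_test, y, strata)`, where `g_test` is the genotype vector at the test locus. The resulting statistic would be compared with a chi-square distribution with one degree of freedom.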
Pages: 921-930
Page count: 10
Related papers
29 in total
[1]  
AGODINI R, 2001, EXPT OPTION LOOK DRO
[2]   Interrogating a high-density SNP map for signatures of natural selection [J].
Akey, JM ;
Zhang, G ;
Zhang, K ;
Jin, L ;
Shriver, MD .
GENOME RESEARCH, 2002, 12 (12) :1805-1814
[3]   Does 401(k) eligibility increase saving? Evidence from propensity score subclassification [J].
Benjamin, DJ .
JOURNAL OF PUBLIC ECONOMICS, 2003, 87 (5-6) :1259-1290
[4]   Demonstrating stratification in a European American population [J].
Campbell, CD ;
Ogburn, EL ;
Lunetta, KL ;
Lyon, HN ;
Freedman, ML ;
Groop, LC ;
Altshuler, D ;
Ardlie, KG ;
Hirschhorn, JN .
NATURE GENETICS, 2005, 37 (08) :868-872
[5]   Qualitative semi-parametric test for genetic associations in case-control designs under structured populations [J].
Chen, HS ;
Zhu, X ;
Zhao, H ;
Zhang, S .
ANNALS OF HUMAN GENETICS, 2003, 67 :250-264
[6]   Genomic control for association studies [J].
Devlin, B ;
Roeder, K .
BIOMETRICS, 1999, 55 (04) :997-1004
[7]   Genomic control, a new approach to genetic-based association studies [J].
Devlin, B ;
Roeder, K ;
Wasserman, L .
THEORETICAL POPULATION BIOLOGY, 2001, 60 (03) :155-166
[8]  
Friedman J, 2001, The elements of statistical learning, V1, DOI 10.1007/978-0-387-21606-5
[9]   Control of confounding of genetic associations in stratified populations [J].
Hoggart, CJ ;
Parra, EJ ;
Shriver, MD ;
Bonilla, C ;
Kittles, RA ;
Clayton, DG ;
McKeigue, PM .
AMERICAN JOURNAL OF HUMAN GENETICS, 2003, 72 (06) :1492-1504
[10]   Stratification and weighting via the propensity score in estimation of causal treatment effects: a comparative study [J].
Lunceford, JK ;
Davidian, M .
STATISTICS IN MEDICINE, 2004, 23 (19) :2937-2960