Variance component model to account for sample structure in genome-wide association studies

Cited by: 2003
Authors
Kang, Hyun Min [2,3]
Sul, Jae Hoon [4]
Service, Susan K. [5]
Zaitlen, Noah A. [6]
Kong, Sit-yee [5]
Freimer, Nelson B. [5]
Sabatti, Chiara [1]
Eskin, Eleazar [4,7]
Affiliations
[1] Stanford Univ, Dept Hlth Res & Policy, Sch Med, Stanford, CA 94305 USA
[2] Univ Michigan, Dept Biostat, Ctr Stat Genet, Ann Arbor, MI 48109 USA
[3] Univ Michigan, Sch Med, Ctr Computat Med & Bioinformat, Ann Arbor, MI USA
[4] Univ Calif Los Angeles, Dept Comp Sci, Los Angeles, CA 90024 USA
[5] Univ Calif Los Angeles, Ctr Neurobehav Genet, Los Angeles, CA USA
[6] Harvard Univ, Sch Publ Hlth, Dept Epidemiol & Biostat, Boston, MA 02115 USA
[7] Univ Calif Los Angeles, Dept Human Genet, Los Angeles, CA USA
Funding
US National Institutes of Health; US National Science Foundation
Keywords
CANCER SUSCEPTIBILITY LOCI; POPULATION-STRUCTURE; FOUNDER POPULATION; PAIRWISE RELATEDNESS; MISSING HERITABILITY; QUANTITATIVE TRAITS; COMPLEX DISEASES; TESTS; STRATIFICATION; INDIVIDUALS;
DOI
10.1038/ng.548
Chinese Library Classification number
Q3 [Genetics]
Discipline classification codes
071007; 090102
Abstract
Although genome-wide association studies (GWASs) have identified numerous loci associated with complex traits, imprecise modeling of the genetic relatedness within study samples may cause substantial inflation of test statistics and possibly spurious associations. Variance component approaches, such as efficient mixed-model association (EMMA), can correct for a wide range of sample structures by explicitly accounting for pairwise relatedness between individuals, using high-density markers to model the phenotype distribution; but such approaches are computationally impractical. We report here a variance component approach implemented in publicly available software, EMMA eXpedited (EMMAX), that reduces the computational time for analyzing large GWAS data sets from years to hours. We apply this method to two human GWAS data sets, performing association analysis for ten quantitative traits from the Northern Finland Birth Cohort and seven common diseases from the Wellcome Trust Case Control Consortium. We find that EMMAX outperforms both principal component analysis and genomic control in correcting for sample structure.
Pages: 348 / U110
Page count: 9