Enhancements to the ADMIXTURE algorithm for individual ancestry estimation

Cited by: 1048
Authors
Alexander, David H. [1]
Lange, Kenneth [1, 2, 3]
Affiliations
[1] Univ Calif Los Angeles, Dept Biomath, Los Angeles, CA 90095 USA
[2] Univ Calif Los Angeles, Dept Human Genet, Los Angeles, CA USA
[3] Univ Calif Los Angeles, Dept Stat, Los Angeles, CA USA
Source
BMC BIOINFORMATICS | 2011 / Volume 12
Keywords
MODEL;
DOI
10.1186/1471-2105-12-246
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Subject classification code
071010; 081704;
Abstract
Background: The estimation of individual ancestry from genetic data has become essential to applied population genetics and genetic epidemiology. Software programs for calculating ancestry estimates have become essential tools in the geneticist's analytic arsenal. Results: Here we describe four enhancements to ADMIXTURE, a high-performance tool for estimating individual ancestries and population allele frequencies from SNP (single nucleotide polymorphism) data. First, ADMIXTURE can be used to estimate the number of underlying populations through cross-validation. Second, individuals of known ancestry can be exploited in supervised learning to yield more precise ancestry estimates. Third, by penalizing small admixture coefficients for each individual, one can encourage model parsimony, often yielding more interpretable results for small datasets or datasets with large numbers of ancestral populations. Finally, by exploiting multiple processors, large datasets can be analyzed even more rapidly. Conclusions: The enhancements we have described make ADMIXTURE a more accurate, efficient, and versatile tool for ancestry estimation.
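The first enhancement in the abstract, estimating the number of populations by cross-validation, usually comes down to fitting the model at several values of K and keeping the one with the lowest cross-validation error. A minimal Python sketch of that selection step follows. It assumes each run reports a line of the form `CV error (K=3): 0.5210`; that line format, the helper names `parse_cv_errors` and `best_k`, and the numbers in `example_log` are illustrative assumptions, not details taken from this record.

```python
import re

# Assumed log-line shape; check it against the output of your ADMIXTURE version.
CV_LINE = re.compile(r"CV error \(K=(\d+)\):\s*([0-9.eE+-]+)")


def parse_cv_errors(log_text):
    """Return {K: cv_error} for every matching line in the text."""
    return {int(k): float(err) for k, err in CV_LINE.findall(log_text)}


def best_k(cv_errors):
    """Pick the K with the lowest cross-validation error."""
    if not cv_errors:
        raise ValueError("no CV error lines found")
    return min(cv_errors, key=cv_errors.get)


# Made-up numbers for illustration only; they are not results from the paper.
example_log = """\
CV error (K=2): 0.5520
CV error (K=3): 0.5210
CV error (K=4): 0.5285
"""

errors = parse_cv_errors(example_log)
print(best_k(errors))
```

In practice the text passed to `parse_cv_errors` would be the concatenated logs from one run per candidate K. A nearly flat error curve suggests that several values of K fit about equally well, so the minimum alone should not be over-interpreted.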
Pages: 6