AWclust: point-and-click software for non-parametric population structure analysis

Cited by: 66
Authors
Gao, Xiaoyi [1]
Starmer, Joshua D. [2, 3]
Affiliations
[1] Univ Miami, Miller Sch Med, Miami Inst Human Genom, Miami, FL 33136 USA
[2] Univ N Carolina, Dept Genet, Chapel Hill, NC 27599 USA
[3] Univ N Carolina, Curriculum Toxicol, Chapel Hill, NC 27599 USA
DOI
10.1186/1471-2105-9-77
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Background: Population structure analysis is important to genetic association studies and evolutionary investigations. Parametric approaches, e.g., STRUCTURE and L-POP, usually assume Hardy-Weinberg equilibrium (HWE) and linkage equilibrium among loci in the sampled individuals. However, these assumptions may not hold, and allele frequency estimates may be inaccurate in some data sets. The improved version of STRUCTURE (version 2.1) can incorporate linkage information among loci but remains sensitive to high background linkage disequilibrium. Large-scale single nucleotide polymorphism (SNP) data sets are now common in genetic studies. Software is therefore needed that makes full use of these data to draw inferences even when model assumptions do not hold or allele frequency estimates are highly variable.

Results: We have developed point-and-click software for non-parametric population structure analysis, distributed as an R package. The software takes advantage of the large number of SNPs available to group individuals into ethnically similar clusters. It requires no assumptions about population models and does not estimate allele frequencies. It can also infer the optimal number of populations.

Conclusion: Our software tool employs non-parametric approaches to assign individuals to clusters using SNPs. It provides efficient computation and an intuitive way for researchers to explore ethnic relationships among individuals. It can complement parametric approaches in population structure analysis.
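The idea of clustering individuals directly from SNP genotypes, with no population model and no allele frequency estimates, can be illustrated with a small sketch. The abstract does not say which distance measure or clustering algorithm the package uses, so everything here is an assumption made for illustration: genotypes are coded as 0/1/2 copies of one allele, the distance is a simple allele-sharing distance, and clusters are merged by average linkage. The function names are hypothetical and are not part of the AWclust package.

```python
from itertools import combinations


def allele_sharing_distance(g1, g2):
    """Mean per-locus distance between two individuals.

    Genotypes are assumed to be coded 0, 1, or 2 (copies of one allele),
    so each locus contributes 0, 0.5, or 1 to the average.
    """
    return sum(abs(a - b) for a, b in zip(g1, g2)) / (2 * len(g1))


def cluster_individuals(genotypes, k):
    """Agglomerative average-linkage clustering into k groups.

    Uses only pairwise distances between individuals; no allele
    frequencies or population model are estimated.
    """
    n = len(genotypes)
    dist = {
        (i, j): allele_sharing_distance(genotypes[i], genotypes[j])
        for i, j in combinations(range(n), 2)
    }

    def linkage(c1, c2):
        # Average distance over all cross-cluster pairs of individuals.
        total = sum(dist[(min(i, j), max(i, j))] for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    clusters = [[i] for i in range(n)]
    while len(clusters) > k:
        # Find the closest pair of clusters and merge them.
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: linkage(clusters[p[0]], clusters[p[1]]),
        )
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]  # b > a, so index a is unaffected
    return [sorted(c) for c in clusters]


# Toy data (made up): six individuals, six SNPs, two obvious groups.
genotypes = [
    [0, 0, 0, 1, 0, 0],
    [0, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 0],
    [2, 2, 1, 2, 2, 2],
    [2, 2, 2, 2, 1, 2],
    [2, 1, 2, 2, 2, 2],
]
clusters_found = cluster_individuals(genotypes, 2)
print(clusters_found)  # [[0, 1, 2], [3, 4, 5]]
```

In this sketch the number of clusters `k` is supplied by the caller. Inferring the optimal number of populations, as the abstract describes, would need an additional criterion that is not shown here.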
Pages: 6