Inferring weak population structure with the assistance of sample group information

Cited by: 2793
Authors
Hubisz, Melissa J. [1, 4]
Falush, Daniel [5]
Stephens, Matthew [1, 2]
Pritchard, Jonathan K. [1, 3]
Affiliations
[1] Univ Chicago, Dept Human Genet, Chicago, IL 60637 USA
[2] Univ Chicago, Dept Stat, Chicago, IL 60637 USA
[3] Univ Chicago, Howard Hughes Med Inst, Chicago, IL 60637 USA
[4] Cornell Univ, Dept Biol Stat & Computat Biol, Ithaca, NY 14853 USA
[5] Univ Coll Cork, Dept Microbiol, Environm Res Inst, Cork, Ireland
Funding
US National Institutes of Health (NIH)
Keywords
admixture; divergence; population structure; prior distribution; MULTILOCUS GENOTYPE DATA; DIFFERENTIATION; INFERENCE; IDENTIFICATION; LOCI;
DOI
10.1111/j.1755-0998.2009.02591.x
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
Genetic clustering algorithms require a certain amount of data to produce informative results. In the common situation that individuals are sampled at several locations, we show how sample group information can be used to achieve better results when the amount of data is limited. New models are developed for the STRUCTURE program, both for the cases of admixture and no admixture. These models work by modifying the prior distribution for each individual's population assignment. The new prior distributions allow the proportion of individuals assigned to a particular cluster to vary by location. The models are tested on simulated data, and illustrated using microsatellite data from the CEPH Human Genome Diversity Panel. We demonstrate that the new models allow structure to be detected at lower levels of divergence, or with less data, than the original STRUCTURE models or principal components methods, and that they are not biased towards detecting structure when it is not present. These models are implemented in a new version of STRUCTURE which is freely available online at http://pritch.bsd.uchicago.edu/structure.html.
Pages: 1322-1332
Number of pages: 11
Related papers
17 in total
[1]   A METHOD FOR QUANTIFYING DIFFERENTIATION BETWEEN POPULATIONS AT MULTI-ALLELIC LOCI AND ITS IMPLICATIONS FOR INVESTIGATING IDENTITY AND PATERNITY [J].
BALDING, DJ ;
NICHOLS, RA .
GENETICA, 1995, 96 (1-2) :3-12
[2]  
Corander J, 2003, GENETICS, V163, P367
[3]   Bayesian spatial modeling of genetic population structure [J].
Corander, Jukka ;
Siren, Jukka ;
Arjas, Elja .
COMPUTATIONAL STATISTICS, 2008, 23 (01) :111-129
[4]   Bayesian identification of admixture events using multilocus molecular markers [J].
Corander, Jukka ;
Marttinen, Pekka .
MOLECULAR ECOLOGY, 2006, 15 (10) :2833-2843
[5]   A Bayesian approach to the identification of panmictic populations and the assignment of individuals [J].
Dawson, KJ ;
Belkhir, K .
GENETICAL RESEARCH, 2001, 78 (01) :59-77
[6]   Detecting the number of clusters of individuals using the software STRUCTURE: a simulation study [J].
Evanno, G ;
Regnaut, S ;
Goudet, J .
MOLECULAR ECOLOGY, 2005, 14 (08) :2611-2620
[7]  
Falush D, 2003, GENETICS, V164, P1567
[8]   Inference of population structure using multilocus genotype data: dominant markers and null alleles [J].
Falush, Daniel ;
Stephens, Matthew ;
Pritchard, Jonathan K. .
MOLECULAR ECOLOGY NOTES, 2007, 7 (04) :574-578
[9]  
François O, 2006, GENETICS, V174, P805, DOI 10.1534/genetics.106.059923
[10]   Analysing georeferenced population genetics data with Geneland: a new algorithm to deal with null alleles and a friendly graphical user interface [J].
Guillot, Gilles ;
Santos, Filipe ;
Estoup, Arnaud .
BIOINFORMATICS, 2008, 24 (11) :1406-1407