CLUMPP: a cluster matching and permutation program for dealing with label switching and multimodality in analysis of population structure

Cited by: 5174
Authors
Jakobsson, Mattias [1]
Rosenberg, Noah A. [1]
Affiliations
[1] Univ Michigan, Ctr Computat Med & Biol, Dept Human Genet, Ann Arbor, MI 48109 USA
Funding
US National Institutes of Health
DOI
10.1093/bioinformatics/btm233
Chinese Library Classification
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: Clustering of individuals into populations on the basis of multilocus genotypes is informative in a variety of settings. In population-genetic clustering algorithms, such as BAPS, STRUCTURE and TESS, individual multilocus genotypes are partitioned over a set of clusters, often using unsupervised approaches that involve stochastic simulation. As a result, replicate cluster analyses of the same data may produce several distinct solutions for estimated cluster membership coefficients, even though the same initial conditions were used. Major differences among clustering solutions have two main sources: (1) 'label switching' of clusters across replicates, caused by the arbitrary way in which clusters in an unsupervised analysis are labeled, and (2) 'genuine multimodality,' truly distinct solutions across replicates.

Results: To facilitate the interpretation of population-genetic clustering results, we describe three algorithms for aligning multiple replicate analyses of the same data set. We have implemented these algorithms in the computer program CLUMPP (CLUster Matching and Permutation Program). We illustrate the use of CLUMPP by aligning the cluster membership coefficients from 100 replicate cluster analyses of 600 chickens from 20 different breeds.
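
The label-switching problem named in the abstract can be illustrated with a small sketch. This is not CLUMPP's implementation and does not use its similarity statistic. It is a minimal brute-force example, under the assumption that each replicate is a matrix of membership coefficients (rows are individuals, columns are clusters) and that two replicates are aligned by choosing the column permutation with the smallest summed squared difference. The function name `align_columns` and the toy matrices are hypothetical.

```python
from itertools import permutations


def align_columns(reference, replicate):
    """Find the column permutation of `replicate` closest to `reference`.

    Both arguments are lists of rows; each row holds one individual's
    membership coefficients across K clusters. Closeness is the summed
    squared difference over all entries (an assumption of this sketch).
    Every one of the K! permutations is tried, so this only suits small K.
    """
    k = len(reference[0])
    best_perm, best_cost = None, float("inf")
    for perm in permutations(range(k)):
        cost = sum(
            (ref_row[j] - rep_row[perm[j]]) ** 2
            for ref_row, rep_row in zip(reference, replicate)
            for j in range(k)
        )
        if cost < best_cost:
            best_perm, best_cost = perm, cost
    # Reorder the replicate's columns so column j matches reference column j.
    aligned = [[row[j] for j in best_perm] for row in replicate]
    return best_perm, aligned


# Toy data: the replicate holds the same solution with relabeled clusters.
reference = [[0.9, 0.1, 0.0], [0.1, 0.8, 0.1], [0.0, 0.2, 0.8]]
replicate = [[0.1, 0.0, 0.9], [0.8, 0.1, 0.1], [0.2, 0.8, 0.0]]

best_perm, aligned = align_columns(reference, replicate)
print(best_perm)
print(aligned)
```

In this toy case the two replicates differ only by relabeling, so an exact alignment exists. Replicates that stay far apart under every permutation would correspond to what the abstract calls genuine multimodality.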
Pages: 1801-1806
Page count: 6