A Bayesian approach to the identification of panmictic populations and the assignment of individuals

Cited by: 175
Authors
Dawson, KJ
Belkhir, K
Affiliations
[1] Univ Montpellier 2, Lab Genome Populat & Interact, CNRS, UMR 5000, F-34095 Montpellier 5, France
[2] Univ Bristol, Dept Agr Sci, IACR, Long Ashton Res Stn, Bristol BS41 9AF, Avon, England
DOI
10.1017/S001667230100502X
Chinese Library Classification (CLC) number
Q3 [Genetics]
Discipline classification codes
071007; 090102
Abstract
We present likelihood-based methods for assigning the individuals in a sample to source populations, on the basis of their genotypes at co-dominant marker loci. The source populations are assumed to be at Hardy-Weinberg and linkage equilibrium, but the allelic composition of these source populations and even the number of source populations represented in the sample are treated as uncertain. The parameter of interest is the partition of the set of sampled individuals, induced by the assignment of individuals to source populations. We present a maximum likelihood method, and then a more powerful Bayesian approach for estimating this sample partition. In general, it will not be feasible to evaluate the evidence supporting each possible partition of the sample. Furthermore, when the number of individuals in the sample is large, it may not even be feasible to evaluate the evidence supporting, individually, each of the most plausible partitions, because there may be many individuals that are difficult to assign. To overcome these problems, we use low-dimensional marginals (the 'co-assignment probabilities') of the posterior distribution of the sample partition as measures of 'similarity', and then apply a hierarchical clustering algorithm to identify clusters of individuals whose assignment together is well supported by the posterior distribution. A binary tree provides a visual representation of how well the posterior distribution supports each cluster in the hierarchy. These methods are applicable to other problems where the parameter of interest is a partition of a set. Because the co-assignment probabilities are independent of the arbitrary labelling of source populations, we avoid the label-switching problem of previous Bayesian methods.
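The last steps summarized in the abstract (co-assignment probabilities used as similarities, followed by hierarchical clustering) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `coassignment_matrix` and `complete_link_merges`, the toy posterior draws, and the choice of complete-link agglomeration are all assumptions made for illustration. The abstract does not state which linkage rule is used, and the sampler that would produce the partitions is not shown.

```python
from itertools import combinations


def coassignment_matrix(samples):
    """Estimate pairwise co-assignment probabilities from sampled partitions.

    Each sample is a list of population labels, one per individual. Only
    label equality between individuals is used, so relabelling the
    populations within a sample leaves the result unchanged.
    """
    n = len(samples[0])
    counts = [[0.0] * n for _ in range(n)]
    for labels in samples:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1.0
    m = len(samples)
    return [[c / m for c in row] for row in counts]


def complete_link_merges(co):
    """Agglomerative clustering with co-assignment probability as similarity.

    Assumed linkage rule (complete link): the similarity of two clusters is
    the minimum co-assignment probability over all cross-cluster pairs.
    Returns the merges in order as (members_a, members_b, similarity),
    which is enough to draw a binary tree.
    """
    clusters = [frozenset([i]) for i in range(len(co))]
    merges = []
    while len(clusters) > 1:
        best = None
        for a, b in combinations(clusters, 2):
            sim = min(co[i][j] for i in a for j in b)
            if best is None or sim > best[2]:
                best = (a, b, sim)
        a, b, sim = best
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
        merges.append((sorted(a), sorted(b), sim))
    return merges


# Hypothetical posterior draws for 4 individuals. Labels are arbitrary:
# [0, 0, 1, 1] and [1, 1, 0, 0] describe the same partition.
samples = [
    [0, 0, 1, 1],
    [1, 1, 0, 0],
    [0, 0, 1, 2],
    [0, 1, 2, 2],
]
co = coassignment_matrix(samples)
merges = complete_link_merges(co)
```

In this toy input, individuals 0 and 1 share a population in three of the four draws, so their estimated co-assignment probability is 0.75, and the first two draws contribute identically even though their labels are swapped. The similarity recorded at each merge plays the role of the support shown at each node of the binary tree.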
Pages: 59-77
Page count: 19
Related papers
44 in total
[1] Ayres, KL; Balding, DJ. Measuring departures from Hardy-Weinberg: a Markov chain Monte Carlo method for estimating the inbreeding coefficient. Heredity, 1998, 80(6): 769-777.
[2] Barton, NH; Hewitt, GM. Adaptation, speciation and hybrid zones. Nature, 1989, 341(6242): 497-503.
[3] Barton, NH. Heredity, 1979, 43: 333. DOI 10.1038/hdy.1979.86.
[4] Barton, NH. Estimating multilocus linkage disequilibria. Heredity, 2000, 84(3): 373-389.
[5] Belkhir, K. 2001, UNPUB BIOINFORMATICS.
[6] Berge, C. Principles of Combinatorics. 1971.
[7] Besag, J; Green, P; Higdon, D; Mengersen, K. Bayesian computation and stochastic systems. Statistical Science, 1995, 10(1): 3-41.
[8] Burman, P. Biometrika, 1995, 82: 877.
[9] Cornuet, JM. Genetics, 1999, 153: 1989.
[10] Defays, D. Efficient algorithm for a complete link method. Computer Journal, 1977, 20(4): 364-366.