Bayesian variable selection for detecting adaptive genomic differences among populations

Cited by: 55
Authors
Riebler, Andrea [1]
Held, Leonhard [1]
Stephan, Wolfgang [2]
Affiliations
[1] Univ Zurich, Inst Social & Prevent Med, Biostat Unit, CH-8001 Zurich, Switzerland
[2] Univ Munich, Dept Biol 2, Sect Evolut Biol, D-82152 Planegg Martinsried, Germany
Keywords
DOI
10.1534/genetics.107.081281
Chinese Library Classification (CLC) number
Q3 [Genetics];
Discipline classification code
071007 ; 090102 ;
Abstract
We extend an F_ST-based Bayesian hierarchical model, implemented via Markov chain Monte Carlo, for the detection of loci that might be subject to positive selection. This model divides the F_ST-influencing factors into locus-specific effects, population-specific effects, and effects that are specific for the locus in combination with the population. We introduce a Bayesian auxiliary variable for each locus effect to automatically select nonneutral locus effects. As a by-product, the efficiency of the original approach is improved by using a reparameterization of the model. The statistical power of the extended algorithm is assessed with simulated data sets from a Wright-Fisher model with migration. We find that the inclusion of model selection suggests a clear improvement in discrimination as measured by the area under the receiver operating characteristic (ROC) curve. Additionally, we illustrate and discuss the quality of the newly developed method on the basis of an allozyme data set of the fruit fly Drosophila melanogaster and a sequence data set of the wild tomato Solanum chilense. For data sets with small sample sizes, high mutation rates, and/or long sequences, however, methods based on nucleotide statistics should be preferred.
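The decomposition and the ROC-based evaluation described in the abstract can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes the logit-scale form commonly used for this family of models (as in Beaumont and Balding, 2004), in which the logit of F_ST for locus i in population j is the sum of a locus effect, a population effect, and an interaction effect, and it assumes the auxiliary variable acts as a 0/1 switch on the locus effect. The names `fst`, `auc`, `alpha_i`, `beta_j`, `gamma_ij`, and `delta_i` are hypothetical and do not come from this record.

```python
import math


def fst(alpha_i, beta_j, gamma_ij=0.0, delta_i=1):
    """F_ST for locus i in population j under an assumed logit decomposition.

    alpha_i : locus-specific effect
    beta_j  : population-specific effect
    gamma_ij: locus-by-population effect
    delta_i : assumed 0/1 auxiliary indicator; 0 drops the locus effect,
              i.e. the locus is treated as neutral
    """
    eta = delta_i * alpha_i + beta_j + gamma_ij
    return 1.0 / (1.0 + math.exp(-eta))  # inverse logit


def auc(scores, labels):
    """Area under the ROC curve via the Mann-Whitney statistic.

    scores: one score per locus (for example, a posterior probability
            that the indicator equals 1)
    labels: 1 for truly selected loci, 0 for neutral loci
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    # Count pairs where a selected locus outranks a neutral one; ties count half.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

With simulated data, where the truly selected loci are known, `auc` applied to per-locus scores gives the kind of discrimination measure the abstract refers to; a value of 1.0 means every selected locus is ranked above every neutral one.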
Pages: 1817-1829
Number of pages: 13
Related papers
28 in total
[1]  
[Anonymous], 1992, Statistical Science, DOI 10.1214/ss/1177011137
[2]   Using multilocus sequence data to assess population structure, natural selection, and linkage disequilibrium in wild tomatoes [J].
Arunyawat, Uraiwan ;
Stephan, Wolfgang ;
Staedler, Thomas .
MOLECULAR BIOLOGY AND EVOLUTION, 2007, 24 (10) :2310-2322
[3]   Likelihood-based inference for genetic correlation coefficients [J].
Balding, DJ .
THEORETICAL POPULATION BIOLOGY, 2003, 63 (03) :221-230
[4]   A method for quantifying differentiation between populations at multi-allelic loci and its implications for investigating identity and paternity [J].
Balding, DJ ;
Nichols, RA .
GENETICA, 1995, 96 (1-2) :3-12
[5]   Identifying adaptive genetic divergence among populations from genome scans [J].
Beaumont, MA ;
Balding, DJ .
MOLECULAR ECOLOGY, 2004, 13 (04) :969-980
[6]   Evaluating loci for use in the genetic analysis of population structure [J].
Beaumont, MA ;
Nichols, RA .
PROCEEDINGS OF THE ROYAL SOCIETY B-BIOLOGICAL SCIENCES, 1996, 263 (1377) :1619-1626
[7]  
Bernardo J., 2009, Bayesian theory
[8]   Bayesian computation and stochastic systems [J].
Besag, J ;
Green, P ;
Higdon, D ;
Mengersen, K .
STATISTICAL SCIENCE, 1995, 10 (01) :3-41
[9]   Explorative genome scan to detect candidate loci for adaptation along a gradient of altitude in the common frog (Rana temporaria) [J].
Bonin, A ;
Taberlet, P ;
Miaud, C ;
Pompanon, F .
MOLECULAR BIOLOGY AND EVOLUTION, 2006, 23 (04) :773-783
[10]   On Bayesian model and variable selection using MCMC [J].
Dellaportas, P ;
Forster, JJ ;
Ntzoufras, I .
STATISTICS AND COMPUTING, 2002, 12 (01) :27-36