Quantifying Population Genetic Differentiation from Next-Generation Sequencing Data

Cited by: 161
Authors
Fumagalli, Matteo [1]
Vieira, Filipe G. [1]
Korneliussen, Thorfinn Sand [3, 4]
Linderoth, Tyler [1]
Huerta-Sanchez, Emilia [1]
Albrechtsen, Anders [4]
Nielsen, Rasmus [1, 2, 4]
Affiliations
[1] Univ Calif Berkeley, Dept Integrat Biol, Berkeley, CA 94720 USA
[2] Univ Calif Berkeley, Dept Stat, Berkeley, CA 94720 USA
[3] Nat Hist Museum Denmark, Ctr GeoGenet, DK-2100 Copenhagen, Denmark
[4] Univ Copenhagen, Dept Biol, DK-2200 Copenhagen, Denmark
Funding
National Institutes of Health (USA); National Science Foundation (USA)
Keywords
next-generation sequencing; F-ST; principal components analysis; REVEALS; ASSOCIATION; INFERENCE; SELECTION; LOCI; FRAMEWORK; ANCESTRY; GENOTYPE; DOMINANT; MARKERS;
DOI
10.1534/genetics.113.154740
Chinese Library Classification
Q3 [Genetics]
Discipline classification codes
071007; 090102
Abstract
Over the past few years, new high-throughput DNA sequencing technologies have dramatically increased speed and reduced sequencing costs. However, the use of these sequencing technologies is often challenged by errors and biases associated with the bioinformatical methods used for analyzing the data. In particular, the use of naive methods to identify polymorphic sites and infer genotypes can inflate downstream analyses. Recently, explicit modeling of genotype probability distributions has been proposed as a method for taking genotype call uncertainty into account. Based on this idea, we propose a novel method for quantifying population genetic differentiation from next-generation sequencing data. In addition, we present a strategy for investigating population structure via principal components analysis. Through extensive simulations, we compare the new method herein proposed to approaches based on genotype calling and demonstrate a marked improvement in estimation accuracy for a wide range of conditions. We apply the method to a large-scale genomic data set of domesticated and wild silkworms sequenced at low coverage. We find that we can infer the fine-scale genetic structure of the sampled individuals, suggesting that employing this new method is useful for investigating the genetic relationships of populations sampled at low coverage.
Pages: 979 / +
Page count: 37