Ascertainment Biases in SNP Chips Affect Measures of Population Divergence

Cited by: 251
Authors
Albrechtsen, Anders [1]
Nielsen, Finn Cilius [2]
Nielsen, Rasmus [3, 4]
Affiliations
[1] Univ Copenhagen, Dept Biostat, Copenhagen, Denmark
[2] Rigshosp, Dept Clin Biochem, DK-2100 Copenhagen, Denmark
[3] Univ Calif Berkeley, Dept Integrat Biol, Berkeley, CA 94720 USA
[4] Univ Calif Berkeley, Dept Stat, Berkeley, CA 94720 USA
Funding
US National Institutes of Health
Keywords
ascertainment bias; demography; single nucleotide polymorphisms; SNP chip data; population genetics; NUCLEOTIDE POLYMORPHISM DATA; HUMAN DEMOGRAPHIC HISTORY; GENOME-WIDE ASSOCIATION; LINKAGE DISEQUILIBRIUM; PARAMETERS; GENES; DISCOVERY; HAPLOTYPE; INFERENCE; GENOTYPE;
DOI
10.1093/molbev/msq148
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification codes
071010; 081704
Abstract
Chip-based high-throughput genotyping has facilitated genome-wide studies of genetic diversity. Many studies have used these large data sets to make inferences about the demographic history of human populations, using measures of genetic differentiation such as F_ST or principal component analyses. However, single nucleotide polymorphism (SNP) chip data suffer from ascertainment biases caused by the SNP discovery process, in which a small number of individuals from selected populations are used as discovery panels. In this study, we investigate the effect of ascertainment bias on inferences regarding genetic differentiation among populations in one of the common genome-wide genotyping platforms. We generate SNP genotyping data for individuals who have previously been subjected to partial genome-wide Sanger sequencing and compare inferences based on genotyping data with inferences based on direct sequencing. We also analyze publicly available genome-wide data. We demonstrate that ascertainment biases distort measures of human diversity and may change the conclusions drawn from these measures in sometimes unexpected ways. We also show that details of the genotype-calling algorithms can have a surprisingly large effect on population genetic inferences. We present a correction of the frequency spectrum for the widely used Affymetrix SNP chips, but also show that such corrections are difficult to generalize among studies.
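The discovery-panel mechanism described in the abstract can be illustrated with a textbook binomial sampling argument. The sketch below is not the correction presented in the paper; it only assumes that a SNP reaches the chip when both alleles are seen among n chromosomes drawn at random from a single population, which happens with probability 1 - p^n - (1 - p)^n for allele frequency p. The function names and the toy spectrum are hypothetical and chosen for illustration.

```python
def discovery_probability(p: float, n: int) -> float:
    """Probability that a site with allele frequency p is polymorphic
    in a discovery panel of n randomly sampled chromosomes.

    Assumption (not from the paper): simple binomial sampling from one
    population, with no genotyping error and no further SNP filtering.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    if n < 2:
        raise ValueError("a discovery panel needs at least 2 chromosomes")
    # The site is missed only when all n draws carry the same allele.
    return 1.0 - p**n - (1.0 - p) ** n


def ascertained_spectrum(true_spectrum: dict, n: int) -> dict:
    """Reweight a frequency spectrum {frequency: proportion of sites}
    by the discovery probability and renormalize to sum to 1."""
    weights = {p: w * discovery_probability(p, n) for p, w in true_spectrum.items()}
    total = sum(weights.values())
    if total == 0.0:
        raise ValueError("no site in the spectrum can be discovered")
    return {p: w / total for p, w in weights.items()}


# Hypothetical spectrum dominated by rare variants.
toy_spectrum = {0.01: 0.5, 0.1: 0.3, 0.5: 0.2}
chip_spectrum = ascertained_spectrum(toy_spectrum, n=4)
for freq, share in chip_spectrum.items():
    print(f"frequency {freq}: {toy_spectrum[freq]:.2f} -> {share:.3f}")
```

Under these assumptions, a small panel strongly under-represents rare variants and over-represents common ones, which is the kind of distortion of the frequency spectrum that can carry over into frequency-based statistics such as F_ST.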
Pages: 2534-2547
Page count: 14