Inferring Geographic Coordinates of Origin for Europeans Using Small Panels of Ancestry Informative Markers

Times cited: 31
Authors
Drineas, Petros [1]
Lewis, Jamey [1]
Paschou, Peristera [2]
Affiliations
[1] Rensselaer Polytech Inst, Dept Comp Sci, Troy, NY 12180 USA
[2] Democritus Univ Thrace, Dept Mol Biol & Genet, Alexandroupolis, Greece
Source
PLOS ONE | 2010 / Vol. 5 / Issue 08
Funding
National Science Foundation (USA);
Keywords
GENETIC SUBSTRUCTURE; POPULATION-STRUCTURE; ADMIXTURE; DISEASE; GENOME; ASSOCIATION; STRATIFICATION; SELECTION; PATTERNS; LINKAGE;
DOI
10.1371/journal.pone.0011892
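Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification code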
07 ; 0710 ; 09 ;
Abstract
Recent large-scale studies of European populations have demonstrated the existence of population genetic structure within Europe and the potential to accurately infer individual ancestry when information from hundreds of thousands of genetic markers is used. In fact, when genome-wide genetic variation of European populations is projected down to a two-dimensional Principal Components Analysis plot, a surprising correlation with actual geographic coordinates of self-reported ancestry has been reported. This substructure can hamper the search for susceptibility genes for common complex disorders, leading to spurious correlations. The identification of genetic markers that can correct for population stratification therefore becomes of paramount importance. Analyzing 1,200 individuals from 11 populations genotyped for more than 500,000 SNPs (Population Reference Sample), we present a systematic exploration of the extent to which geographic coordinates of origin within Europe can be predicted with small panels of SNPs. Markers are selected to correlate with the top principal components of the dataset, as we have previously demonstrated. Performing thorough cross-validation experiments, we show that it is indeed possible to predict individual ancestry within Europe down to a few hundred kilometers from actual individual origin, using information from carefully selected panels of 500 or 1,000 SNPs. Furthermore, we show that these panels can be used to correctly assign the HapMap Phase 3 European populations to their geographic origin. The SNPs that we propose can prove extremely useful in a variety of different settings, such as stratification correction, genetic ancestry testing, and the study of the history of European populations.
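The pipeline outlined in the abstract (pick SNPs that correlate with the top principal components, then predict coordinates from the small panel and check the result on held-out individuals) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the synthetic genotypes, the scoring rule (sum of squared loadings on the top two principal axes), the least-squares map from principal components to coordinates, and all function names are hypothetical. A single train/test split stands in for the cross-validation experiments.

```python
import numpy as np


def simulate_genotypes(n_individuals=600, n_snps=300, n_informative=100, seed=0):
    """Toy data: allele frequencies of the first n_informative SNPs vary linearly with position."""
    rng = np.random.default_rng(seed)
    coords = rng.uniform(-1.0, 1.0, size=(n_individuals, 2))  # abstract (x, y), not real lat/lon
    slopes = np.zeros((2, n_snps))
    slopes[:, :n_informative] = rng.uniform(-0.2, 0.2, size=(2, n_informative))
    freqs = np.clip(0.5 + coords @ slopes, 0.05, 0.95)
    genotypes = rng.binomial(2, freqs).astype(float)  # 0/1/2 allele counts
    return genotypes, coords


def top_principal_axes(genotypes, n_components):
    """Return the column means and the top principal axes (one column per component)."""
    mean = genotypes.mean(axis=0)
    _, _, vt = np.linalg.svd(genotypes - mean, full_matrices=False)
    return mean, vt[:n_components].T


def select_panel(genotypes, n_components=2, panel_size=50):
    """Assumed scoring rule: rank SNPs by summed squared loadings on the top axes."""
    _, axes = top_principal_axes(genotypes, n_components)
    scores = (axes ** 2).sum(axis=1)
    return np.argsort(scores)[::-1][:panel_size]


def fit_coordinate_model(genotypes, coords, n_components=2):
    """Least-squares map from the panel's top principal components to coordinates."""
    mean, axes = top_principal_axes(genotypes, n_components)
    pcs = (genotypes - mean) @ axes
    design = np.column_stack([np.ones(len(pcs)), pcs])
    weights, *_ = np.linalg.lstsq(design, coords, rcond=None)
    return mean, axes, weights


def predict_coordinates(model, genotypes):
    """Project new individuals onto the fitted axes and apply the linear map."""
    mean, axes, weights = model
    pcs = (genotypes - mean) @ axes
    return np.column_stack([np.ones(len(pcs)), pcs]) @ weights


def holdout_experiment(seed=0, n_train=450, panel_size=50):
    """Select the panel and fit on training individuals only, then score held-out ones."""
    genotypes, coords = simulate_genotypes(seed=seed)
    train, test = slice(0, n_train), slice(n_train, None)
    panel = select_panel(genotypes[train], panel_size=panel_size)
    model = fit_coordinate_model(genotypes[train][:, panel], coords[train])
    predicted = predict_coordinates(model, genotypes[test][:, panel])
    model_rmse = np.sqrt(np.mean((predicted - coords[test]) ** 2))
    # Baseline: predict the training centroid for every held-out individual.
    baseline_rmse = np.sqrt(np.mean((coords[train].mean(axis=0) - coords[test]) ** 2))
    return panel, model_rmse, baseline_rmse


if __name__ == "__main__":
    panel, model_rmse, baseline_rmse = holdout_experiment()
    print("panel size:", len(panel))
    print("held-out RMSE:", model_rmse, "baseline RMSE:", baseline_rmse)
```

In this toy setting the panel is chosen from training individuals only, so the held-out error is not flattered by marker selection that has already seen the test set. Real genotype data would also need handling of missing calls and linkage between neighboring SNPs, which the sketch ignores.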
Pages: 6