Population structure and eigenanalysis

Cited by: 3575
Authors
Patterson, Nick [1]
Price, Alkes L.
Reich, David
Affiliations
[1] Broad Inst Harvard & MIT, Cambridge, MA USA
[2] Harvard Univ, Sch Med, Dept Genet, Boston, MA USA
Source
PLOS GENETICS | 2006 / Volume 2 / Issue 12
Funding
Wellcome Trust (UK);
DOI
10.1371/journal.pgen.0020190
Chinese Library Classification number
Q3 [Genetics];
Subject classification codes
071007; 090102;
Abstract
Current methods for inferring population structure from genetic data do not provide formal significance tests for population differentiation. We discuss an approach to studying population structure (principal components analysis) that was first applied to genetic data by Cavalli-Sforza and colleagues. We place the method on a solid statistical footing, using results from modern statistics to develop formal significance tests. We also uncover a general "phase change" phenomenon about the ability to detect structure in genetic data, which emerges from the statistical theory we use, and has an important implication for the ability to discover structure in genetic data: for a fixed but large dataset size, divergence between two populations (as measured, for example, by a statistic like F_ST) below a threshold is essentially undetectable, but a little above threshold, detection will be easy. This means that we can predict the dataset size needed to detect structure.
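The idea in the abstract, that principal components analysis of a genotype matrix exposes population structure once divergence is large enough, can be illustrated with a small simulation. The following is a minimal sketch and not the authors' software: the function names, the Balding-Nichols style simulation, the per-marker normalization by sqrt(p(1-p)), and all parameter values are assumptions chosen for illustration. It does not implement the formal significance test the abstract refers to.

```python
import numpy as np


def simulate_two_populations(n_per_pop, n_markers, fst, rng):
    """Simulate biallelic genotypes (0/1/2) for two populations.

    Assumption: a Balding-Nichols style model, in which each population's
    allele frequency is drawn from a Beta distribution centred on a shared
    ancestral frequency, with spread controlled by fst (must be > 0).
    """
    p_anc = rng.uniform(0.1, 0.9, size=n_markers)
    a = p_anc * (1.0 - fst) / fst
    b = (1.0 - p_anc) * (1.0 - fst) / fst
    blocks = []
    for _ in range(2):
        p_pop = rng.beta(a, b)  # one frequency per marker
        blocks.append(rng.binomial(2, p_pop, size=(n_per_pop, n_markers)))
    return np.vstack(blocks)  # rows: individuals, columns: markers


def top_eigen(genotypes):
    """Return the largest eigenvalue and its eigenvector for the
    individual-by-individual covariance of normalized genotypes."""
    g = genotypes.astype(float)
    p = g.mean(axis=0) / 2.0          # estimated allele frequency per marker
    keep = (p > 0.0) & (p < 1.0)      # drop monomorphic markers
    g, p = g[:, keep], p[keep]
    x = (g - 2.0 * p) / np.sqrt(p * (1.0 - p))  # centre and scale each marker
    cov = x @ x.T / x.shape[1]
    vals, vecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
    return vals[-1], vecs[:, -1]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    for fst in (1e-6, 0.05):  # illustrative values only
        geno = simulate_two_populations(50, 2000, fst, rng)
        lam, vec = top_eigen(geno)
        print(f"fst={fst:g}: top eigenvalue = {lam:.2f}")
```

With well-separated populations, the signs of the leading eigenvector's entries split the individuals by population, and the leading eigenvalue stands clearly above its value in the near-unstructured case.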
Pages: 2074-2093
Page count: 20