The discovery of single-nucleotide polymorphisms - and inferences about human demographic history

Cited by: 128
Authors
Wakeley, J
Nielsen, R
Liu-Cordero, SN
Ardlie, K
Affiliations
[1] Harvard Univ, Dept Organism & Evolutionary Biol, Cambridge, MA 02138 USA
[2] Whitehead Inst Biomed Res, Cambridge, MA 02142 USA
[3] MIT, Dept Biol, Cambridge, MA USA
DOI
10.1086/324521
Chinese Library Classification number
Q3 [Genetics]
Discipline classification codes
071007; 090102
Abstract
A method of historical inference that accounts for ascertainment bias is developed and applied to single-nucleotide polymorphism (SNP) data in humans. The data consist of 84 short fragments of the genome that were selected, from three recent SNP surveys, to contain at least two polymorphisms in their respective ascertainment samples and that were then fully resequenced in 47 globally distributed individuals. Ascertainment bias is the deviation, from what would be observed in a random sample, caused either by discovery of polymorphisms in small samples or by locus selection based on levels or patterns of polymorphism. The three SNP surveys from which the present data were derived differ both in their protocols for ascertainment and in the size of the samples used for discovery. We implemented a Monte Carlo maximum-likelihood method to fit a subdivided-population model that includes a possible change in effective size at some time in the past. Incorrectly assuming that ascertainment bias does not exist causes errors in inference, affecting estimates of both migration rates and historical changes in size. Migration rates are overestimated when ascertainment bias is ignored. However, the direction of error in inferences about changes in effective population size (whether the population is inferred to be shrinking or growing) depends on whether the numbers of SNPs per fragment or the SNP-allele frequencies are analyzed. We use the abbreviation "SDL," for "SNP-discovered locus," in recognition of the genomic-discovery context of SNPs. When ascertainment bias is modeled fully, both the number of SNPs per SDL and their allele frequencies support a scenario of growth in effective size in the context of a subdivided population. If subdivision is ignored, however, the hypothesis of constant effective population size cannot be rejected.
An important conclusion of this work is that, in demographic or other studies, SNP data are useful only to the extent that their ascertainment can be modeled.
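
The abstract's definition of ascertainment bias through discovery in small samples can be illustrated with a short calculation. The sketch below is not the authors' Monte Carlo likelihood method; it is a minimal illustration under simplifying assumptions that are not taken from the record: a single panmictic population of constant size (so the expected frequency spectrum is proportional to 1/i), and a discovery panel that is a random subset of d chromosomes drawn from the n chromosomes finally typed. The function names are hypothetical.

```python
from math import comb


def discovery_probability(i, n, d):
    """Probability that a site carrying i derived copies among n chromosomes
    appears polymorphic in a random subsample of d chromosomes.

    Assumption (illustrative only): the discovery panel is drawn without
    replacement from the same n chromosomes.
    """
    if not (0 < i < n):
        raise ValueError("i must satisfy 0 < i < n")
    if not (2 <= d <= n):
        raise ValueError("d must satisfy 2 <= d <= n")
    # The subsample misses the SNP when all d draws carry the same allele.
    monomorphic_draws = comb(i, d) + comb(n - i, d)
    return 1.0 - monomorphic_draws / comb(n, d)


def neutral_sfs(n):
    """Expected proportions of sites with i = 1..n-1 derived copies,
    assuming a constant-size panmictic population (proportional to 1/i)."""
    weights = [1.0 / i for i in range(1, n)]
    total = sum(weights)
    return [w / total for w in weights]


def ascertained_sfs(n, d):
    """The same spectrum, conditioned on discovery in a panel of size d."""
    weights = [discovery_probability(i, n, d) / i for i in range(1, n)]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    n, d = 20, 4
    unbiased = neutral_sfs(n)
    biased = ascertained_sfs(n, d)
    print("singleton share, random sample:    ", round(unbiased[0], 3))
    print("singleton share, after ascertainment:", round(biased[0], 3))
```

Under these assumptions, rare alleles are the ones most often missed by a small discovery panel, so the ascertained spectrum has fewer low-frequency variants than a random sample would. Reading such a spectrum as if it were unbiased can distort inferences about changes in population size, which is consistent with the abstract's warning.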
Pages: 1332-1347
Number of pages: 16