Reconstituting the frequency spectrum of ascertained single-nucleotide polymorphism data

Cited by: 119
Authors
Nielsen, R
Hubisz, MJ
Clark, AG
Affiliations
[1] Univ Copenhagen, Ctr Bioinformat, DK-2100 Copenhagen, Denmark
[2] Cornell Univ, Dept Biol Stat & Computat Biol, Ithaca, NY 14853 USA
[3] Cornell Univ, Dept Mol Biol & Genet, Ithaca, NY USA
DOI
10.1534/genetics.104.031039
Chinese Library Classification (CLC) number
Q3 [Genetics];
Discipline classification code
071007 ; 090102 ;
Abstract
Most of the available SNP data have eluded valid population genetic analysis because most population genetical methods do not correctly accommodate the special discovery process used to identify SNPs. Most of the available SNP data have allele frequency distributions that are biased by the ascertainment protocol. We here show how this problem can be corrected by obtaining maximum-likelihood estimates of the true allele frequency distribution. In simple cases, the ML estimate of the true allele frequency distribution can be obtained analytically, but in other cases computational methods based on numerical optimization or the EM algorithm must be used. We illustrate the new correction method by analyzing some previously published SNP data from the SNP Consortium. Appropriate treatment of SNP ascertainment is vital to our ability to make correct inferences from the data of the International HapMap Project.
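The analytical correction mentioned in the abstract can be illustrated with a minimal sketch. It assumes one simple ascertainment scheme that the record itself does not spell out: a SNP is discovered only if it is variable in a random subsample of `d` chromosomes drawn from the final sample of `n`. Under that assumption, each observed count is reweighted by the inverse of its discovery probability and the result is renormalized. The function names and the example counts below are hypothetical, chosen for illustration only.

```python
from math import comb


def ascertainment_prob(i, n, d):
    """Probability that a SNP with i of n sampled chromosomes carrying one
    allele appears variable in a random subsample of d chromosomes.
    Assumed scheme: the subsample is drawn without replacement from the
    n chromosomes, so the SNP is missed only if all d draws are identical."""
    return 1.0 - (comb(i, d) + comb(n - i, d)) / comb(n, d)


def corrected_spectrum(counts, n, d):
    """Reweight observed SNP counts by the inverse discovery probability.
    counts[i - 1] is the number of ascertained SNPs at which i of the n
    chromosomes carry the allele, for i = 1 .. n - 1 (requires 2 <= d <= n).
    Returns the estimated frequency distribution, normalized to sum to 1."""
    weights = [c / ascertainment_prob(i, n, d)
               for i, c in enumerate(counts, start=1)]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    # Hypothetical counts for n = 4 chromosomes, discovery panel of d = 2.
    observed = [5, 10, 5]
    print(corrected_spectrum(observed, n=4, d=2))
```

Rare and common alleles are the most likely to be missed by a small discovery panel, so the reweighting shifts mass toward the ends of the spectrum relative to the raw proportions. More complicated discovery schemes do not reduce to a closed form of this kind, which is where the abstract points to numerical optimization or the EM algorithm.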
Pages: 2373-2382
Page count: 10
Related papers
21 items in total
  • [1] The effect of single nucleotide polymorphism identification strategies on estimates of linkage disequilibrium
    Akey, JM
    Zhang, K
    Xiong, MM
    Jin, L
    [J]. MOLECULAR BIOLOGY AND EVOLUTION, 2003, 20 (02) : 232 - 242
  • [2] Interrogating a high-density SNP map for signatures of natural selection
    Akey, JM
    Zhang, G
    Zhang, K
    Jin, L
    Shriver, MD
    [J]. GENOME RESEARCH, 2002, 12 (12) : 1805 - 1814
  • [3] An SNP map of the human genome generated by reduced representation shotgun sequencing
    Altshuler, D
    Pollara, VJ
    Cowles, CR
    Van Etten, WJ
    Baldwin, J
    Linton, L
    Lander, ES
    [J]. NATURE, 2000, 407 (6803) : 513 - 516
  • [4] Casella G., 2021, STAT INFERENCE
  • [5] The application of molecular genetic approaches to the study of human evolution
    Cavalli-Sforza, LL
    Feldman, MW
    [J]. NATURE GENETICS, 2003, 33 : 266 - 275
  • [6] Hudson RR, 2001, GENETICS, V159, P1805
  • [7] Kingman JFC., 1982, Stochastic Processes and their Applications, V13, P235, DOI 10.1016/0304-4149(82)90011-4
  • [8] Kuhner MK, 2000, GENETICS, V156, P439
  • [9] A 3.9-centimorgan-resolution human single-nucleotide polymorphism linkage map and screening set
    Matise, TC
    Sachidanandam, R
    Clark, AG
    Kruglyak, L
    Wijsman, E
    Kakol, J
    Buyske, S
    Chui, B
    Cohen, P
    de Toma, C
    Ehm, M
    Glanowski, S
    He, CS
    Heil, J
    Markianos, K
    McMullen, I
    Pericak-Vance, MA
    Silbergleit, A
    Stein, L
    Wagner, M
    Wilson, AF
    Winick, JD
    Winn-Deen, ES
    Yamashiro, CT
    Cann, HM
    Lai, E
    Holden, AL
    [J]. AMERICAN JOURNAL OF HUMAN GENETICS, 2003, 73 (02) : 271 - 284
  • [10] Correcting for ascertainment biases when analyzing SNP data: applications to the estimation of linkage disequilibrium
    Nielsen, R
    Signorovitch, J
    [J]. THEORETICAL POPULATION BIOLOGY, 2003, 63 (03) : 245 - 255