Genome-wide analysis of single-nucleotide polymorphisms in human expressed sequences

Cited by: 113
Authors
Irizarry, K
Kustanovich, V
Li, C
Brown, N
Nelson, S
Wong, W
Lee, CJ
Affiliations
[1] Univ Calif Los Angeles, Dept Chem & Biochem, Los Angeles, CA 90024 USA
[2] Univ Calif Los Angeles, Dept Human Genet, Los Angeles, CA 90024 USA
[3] Univ Calif Los Angeles, Dept Stat, Los Angeles, CA 90024 USA
[4] Univ Calif Los Angeles, Dept Pediat, Los Angeles, CA 90024 USA
[5] Univ Calif Los Angeles, Grad Program Comp Sci, Los Angeles, CA 90024 USA
Keywords
DOI
10.1038/79981
Chinese Library Classification number
Q3 [Genetics];
Discipline classification code
071007 ; 090102 ;
Abstract
Single-nucleotide polymorphisms (SNPs) have been explored as a high-resolution marker set for accelerating the mapping of disease genes (1-11). Here we report 48,196 candidate SNPs detected by statistical analysis of human expressed sequence tags (ESTs), associated primarily with coding regions of genes. We used Bayesian inference to weigh evidence for true polymorphism versus sequencing error, misalignment or ambiguity, misclustering or chimaeric EST sequences, assessing data such as raw chromatogram height, sharpness, overlap and spacing, sequencing error rates, context-sensitivity and cDNA library origin. Three separate validations, namely comparison with 54 genes screened for SNPs independently, verification of HLA-A polymorphisms, and restriction fragment length polymorphism (RFLP) testing, verified 70%, 89% and 71% of our predicted SNPs, respectively. Our method detects tenfold more true HLA-A SNPs than previous analyses of the EST data. We found SNPs in a large fraction of known disease genes, including some disease-causing mutations (for example, the HbS sickle-cell mutation). Our comprehensive analysis of human coding region polymorphism provides a public resource for mapping of disease genes (available at http://www.bioinformatics.ucla.edu/snp).
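The abstract does not give the authors' model, so the following is only a minimal sketch of the general idea of weighing true polymorphism against sequencing error with Bayes' rule at a single alignment column. It assumes independent reads, Phred-style quality scores as the only evidence, and errors spread evenly over the three other bases. The function names, the prior of 0.001 and the minor allele frequency of 0.2 are hypothetical choices for illustration; the published method also uses chromatogram features and library origin, which are not modelled here.

```python
import math


def phred_to_error(q):
    """Convert a Phred-style quality score to an error probability."""
    # Clamp at 0.75 so that a quality of 0 means "uninformative", not log(0).
    return min(10 ** (-q / 10), 0.75)


def base_likelihood(observed, true_base, err):
    """P(observed base | true base), assuming errors hit the 3 other bases equally."""
    return 1 - err if observed == true_base else err / 3


def snp_posterior(reads, major, minor, prior=0.001, minor_freq=0.2):
    """Posterior probability that a column is polymorphic (hypothetical model).

    reads: list of (base, quality) pairs observed at one aligned position.
    H0: every read comes from the major allele; mismatches are errors.
    H1: each read comes from the minor allele with probability minor_freq.
    """
    log_l0 = 0.0
    log_l1 = 0.0
    for base, q in reads:
        err = phred_to_error(q)
        p_major = base_likelihood(base, major, err)
        p_minor = base_likelihood(base, minor, err)
        log_l0 += math.log(p_major)
        log_l1 += math.log((1 - minor_freq) * p_major + minor_freq * p_minor)
    # Combine with the prior in log space to avoid underflow on deep columns.
    log_h1 = math.log(prior) + log_l1
    log_h0 = math.log(1 - prior) + log_l0
    m = max(log_h0, log_h1)
    w1 = math.exp(log_h1 - m)
    w0 = math.exp(log_h0 - m)
    return w1 / (w0 + w1)


# Three high-quality mismatches support a SNP; one low-quality mismatch does not.
strong = [("A", 30)] * 8 + [("G", 30)] * 3
weak = [("A", 30)] * 10 + [("G", 5)]
print(snp_posterior(strong, "A", "G"))
print(snp_posterior(weak, "A", "G"))
```

Under these assumptions, several concordant high-quality mismatches push the posterior towards 1, while a single low-quality mismatch leaves it close to the prior, which is the qualitative behaviour the abstract describes.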
Pages: 233-236
Page count: 4
Related papers
22 in total
  • [1] BAUR EW, 1965, HUMANGENETIK, V1, P621
  • [2] Gene discovery in dbEST
    Boguski, MS
    Tolstoshev, CM
    Bassett, DE
    [J]. SCIENCE, 1994, 265 (5181) : 1993 - 1994
  • [3] The essence of SNPs
    Brookes, AJ
    [J]. GENE, 1999, 234 (02) : 177 - 186
  • [4] Reliable identification of large numbers of candidate SNPs from public EST data
    Buetow, KH
    Edmonson, MN
    Cassidy, AB
    [J]. NATURE GENETICS, 1999, 21 (03) : 323 - 325
  • [5] Characterization of single-nucleotide polymorphisms in coding regions of human genes
    Cargill, M
    Altshuler, D
    Ireland, J
    Sklar, P
    Ardlie, K
    Patil, N
    Lane, CR
    Lim, EP
    Kalyanaraman, N
    Nemesh, J
    Ziaugra, L
    Friedland, L
    Rolfe, A
    Warrington, J
    Lipshutz, R
    Daley, GQ
    Lander, ES
    [J]. NATURE GENETICS, 1999, 22 (03) : 231 - 238
  • [6] Base-calling of automated sequencer traces using phred. II. Error probabilities
    Ewing, B
    Green, P
    [J]. GENOME RESEARCH, 1998, 8 (03) : 186 - 194
  • [7] Base-calling of automated sequencer traces using phred. I. Accuracy assessment
    Ewing, B
    Hillier, L
    Wendl, MC
    Green, P
    [J]. GENOME RESEARCH, 1998, 8 (03) : 175 - 185
  • [8] Patterns of single-nucleotide polymorphisms in candidate genes for blood-pressure homeostasis
    Halushka, MK
    Fan, JB
    Bentley, K
    Hsie, L
    Shen, NP
    Weder, A
    Cooper, R
    Lipshutz, R
    Chakravarti, A
    [J]. NATURE GENETICS, 1999, 22 (03) : 239 - 247
  • [10] Jackson AL, 1998, GENETICS, V148, P1483