Automated identification of single nucleotide polymorphisms from sequencing data

Times cited: 7
Authors
Takahashi, M [1]
Matsuda, F [1]
Margetic, N [1]
Lathrop, M [1]
Affiliation
[1] Ctr Natl Genotypage, F-91057 Evry, France
Source
CSB2002: IEEE COMPUTER SOCIETY BIOINFORMATICS CONFERENCE | 2002
DOI
10.1109/CSB.2002.1039332
Chinese Library Classification number
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
Single nucleotide polymorphisms (SNPs) provide abundant information about genetic variation. Large-scale discovery of high-frequency SNPs is being undertaken using various methods. However, the publicly available SNP data are not always accurate and should therefore be verified. If only a particular gene locus is of interest, locus-specific polymerase chain reaction amplification may be useful. A limitation of this method is that the secondary peak has to be measured. We have analyzed trace data from conventional sequencing equipment and found an applicable rule to discern SNPs from noise. We have developed software that integrates this function to automatically identify SNPs. The software works accurately for high-quality sequences and can also detect SNPs in low-quality sequences. Further, it can determine allele frequency, display this information as a bar graph, and assign the corresponding nucleotide combinations. It is very useful for identifying de novo SNPs in a DNA fragment of interest.
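The abstract does not state the rule the authors use to separate SNPs from noise, so the following is only a minimal sketch of the general idea of measuring a secondary peak. It assumes that each trace position is summarized by four peak heights in the order A, C, G, T, and that a position is treated as heterozygous when the second-highest peak reaches a fixed fraction of the highest. The names `call_position` and `allele_frequencies` and the `min_ratio` threshold of 0.3 are hypothetical and are not taken from the paper.

```python
from collections import Counter

BASES = "ACGT"


def call_position(heights, min_ratio=0.3):
    """Call a genotype ("A", "AG", ...) from four peak heights (A, C, G, T).

    min_ratio is an illustrative threshold, not the rule from the paper.
    """
    # Rank the four channels from strongest to weakest signal.
    ranked = sorted(zip(heights, BASES), reverse=True)
    (primary, base1), (secondary, base2) = ranked[0], ranked[1]
    if primary <= 0:
        return None  # no usable signal at this position
    # A strong secondary peak suggests two alleles at this position.
    if secondary / primary >= min_ratio:
        return "".join(sorted(base1 + base2))
    return base1


def allele_frequencies(genotypes):
    """Estimate allele frequencies from per-sample genotype calls."""
    counts = Counter()
    for genotype in genotypes:
        if genotype is None:
            continue  # skip samples without a call
        # A homozygous call contributes two copies of the same allele.
        alleles = genotype if len(genotype) == 2 else genotype * 2
        counts.update(alleles)
    total = sum(counts.values())
    return {base: n / total for base, n in counts.items()}


if __name__ == "__main__":
    samples = [
        [900, 20, 15, 10],   # clean A peak
        [500, 10, 450, 5],   # A and G peaks of similar height
        [480, 12, 470, 8],
        [15, 10, 880, 20],   # clean G peak
    ]
    calls = [call_position(h) for h in samples]
    print(calls)
    print(allele_frequencies(calls))
```

A single fixed ratio is the simplest possible choice; a practical tool would likely need to account for base-line noise and for dye-dependent differences in peak height, which this sketch ignores.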
Pages: 87-93
Number of pages: 7