A graph theoretic approach to the analysis of DNA sequencing data

Cited by: 35
Author
Berno, AJ
Affiliation
[1] Stanford DNA Sequence/Technol. Ctr., Department of Biochemistry B403, Stanford Univ. School of Medicine, Stanford
Source
GENOME RESEARCH | 1996 / Vol. 6 / Issue 02
DOI
10.1101/gr.6.2.80
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010 ; 081704 ;
Abstract
The analysis of data from automated DNA sequencing instruments has been a limiting factor in the development of new sequencing technology. A new base-calling algorithm that is intended to be independent of any particular sequencing technology has been developed and shown to be effective with data from the Applied Biosystems 373 sequencing system. This algorithm makes use of a nonlinear deconvolution filter to detect likely oligomer events and a graph theoretic editing strategy to find the subset of those events that is most likely to correspond to the correct sequence. Metrics evaluating the quality and accuracy of the resulting sequence are also generated and have been shown to be predictive of measured error rates. Compared to the Applied Biosystems Analysis software, this algorithm generates 18% fewer insertion errors, 80% more deletion errors, and 4% fewer mismatches. The tradeoff between different types of errors can be controlled through a secondary editing step that inserts or deletes base calls depending on their associated confidence values.
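The abstract does not say how the graph is built, so the following is only a minimal sketch of one way a graph-based selection of candidate events could work, not the paper's method. It assumes each detected event has a position, a candidate base, and a confidence score. Events are treated as nodes of a directed acyclic graph, with an edge between two events when their spacing is close to an expected peak spacing, and the highest-scoring path is kept. The names `Event`, `best_event_path`, `drop_low_confidence`, and the parameters `expected_gap`, `tolerance`, and `gap_weight` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    pos: float    # peak position in scan units (assumed representation)
    base: str     # candidate base call
    score: float  # detector confidence, higher is better (assumed)


def best_event_path(events, expected_gap=10.0, tolerance=0.5, gap_weight=1.0):
    """Return the highest-scoring chain of events (illustrative only).

    An edge i -> j exists when the gap between events i and j lies within
    `tolerance` (as a fraction) of `expected_gap`. The path score is the sum
    of event scores minus a penalty for deviating from the expected gap.
    """
    events = sorted(events, key=lambda e: e.pos)
    n = len(events)
    if n == 0:
        return []
    best = [e.score for e in events]  # best chain score ending at each node
    prev = [-1] * n                   # back-pointers for path recovery
    lo = expected_gap * (1 - tolerance)
    hi = expected_gap * (1 + tolerance)
    for j in range(n):
        for i in range(j):
            gap = events[j].pos - events[i].pos
            if not (lo <= gap <= hi):
                continue  # no edge: spacing is implausible
            penalty = gap_weight * abs(gap - expected_gap) / expected_gap
            cand = best[i] + events[j].score - penalty
            if cand > best[j]:
                best[j] = cand
                prev[j] = i
    # Trace back from the best-scoring end node.
    end = max(range(n), key=lambda k: best[k])
    path = []
    while end != -1:
        path.append(events[end])
        end = prev[end]
    return path[::-1]


def drop_low_confidence(calls, threshold):
    """Toy secondary edit: delete calls whose score is below `threshold`."""
    return [c for c in calls if c.score >= threshold]


if __name__ == "__main__":
    candidates = [
        Event(0, "A", 1.0), Event(10, "C", 0.9), Event(14, "G", 0.2),
        Event(20, "T", 0.8), Event(30, "A", 0.7),
    ]
    calls = best_event_path(candidates)
    print("".join(e.base for e in calls))
```

In this toy input the low-scoring event at position 14 does not fit the expected spacing as well as its neighbours, so it is left out of the selected path. Raising the threshold in `drop_low_confidence` deletes more calls, which loosely mirrors the tradeoff between insertion and deletion errors that the abstract attributes to the secondary editing step.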
Pages: 80-91
Page count: 12