A graph theoretic approach to the analysis of DNA sequencing data
Cited by: 35
Author:
Berno, AJ
Affiliation: Stanford DNA Sequence/Technol. Ctr., Department of Biochemistry B403, Stanford Univ. School of Medicine, Stanford
Source:
Genome Research, 1996, Vol. 6, Issue 2
Keywords:
DOI:
10.1101/gr.6.2.80
Chinese Library Classification (CLC) codes:
Q5 [Biochemistry];
Q7 [Molecular Biology];
Discipline classification codes:
071010;
081704;
Abstract:
The analysis of data from automated DNA sequencing instruments has been a limiting factor in the development of new sequencing technology. A new base-calling algorithm that is intended to be independent of any particular sequencing technology has been developed and shown to be effective with data from the Applied Biosystems 373 sequencing system. This algorithm makes use of a nonlinear deconvolution filter to detect likely oligomer events and a graph theoretic editing strategy to find the subset of those events that is most likely to correspond to the correct sequence. Metrics evaluating the quality and accuracy of the resulting sequence are also generated and have been shown to be predictive of measured error rates. Compared to the Applied Biosystems Analysis software, this algorithm generates 18% fewer insertion errors, 80% more deletion errors, and 4% fewer mismatches. The tradeoff between different types of errors can be controlled through a secondary editing step that inserts or deletes base calls depending on their associated confidence values.
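
Illustrative sketch (not taken from the paper): the abstract does not say how the graph is built or scored, so the following only shows one generic way that "finding the most likely subset of detected events" could be framed as a best-path search in a directed acyclic graph. Everything here is an assumption made for illustration: the `Event` fields, the spacing limits `min_gap` and `max_gap`, and the use of summed confidence scores as the path score.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    pos: float    # hypothetical peak position (e.g. scan index)
    base: str     # candidate base for this event
    score: float  # hypothetical confidence value, higher is better


def select_events(events, min_gap, max_gap):
    """Return the highest-scoring chain of events with plausible spacing.

    Nodes are candidate events; an edge i -> j exists when event j follows
    event i by a gap within [min_gap, max_gap]. The chain score is the sum
    of event scores (an assumed scoring rule, not the paper's).
    """
    ev = sorted(events, key=lambda e: e.pos)
    n = len(ev)
    if n == 0:
        return []
    best = [e.score for e in ev]  # best chain score ending at each event
    prev = [None] * n             # back-pointers for path recovery
    for j in range(n):
        for i in range(j):
            gap = ev[j].pos - ev[i].pos
            if min_gap <= gap <= max_gap and best[i] + ev[j].score > best[j]:
                best[j] = best[i] + ev[j].score
                prev[j] = i
    # Walk back from the best-scoring end point.
    end = max(range(n), key=best.__getitem__)
    path = []
    while end is not None:
        path.append(ev[end])
        end = prev[end]
    path.reverse()
    return path


if __name__ == "__main__":
    candidates = [
        Event(10, "A", 0.9),
        Event(15, "C", 0.2),  # weak event too close to its neighbours
        Event(20, "G", 0.8),
        Event(30, "T", 0.9),
        Event(33, "T", 0.3),  # weak event too close to the previous peak
    ]
    called = select_events(candidates, min_gap=8, max_gap=12)
    print("".join(e.base for e in called))  # prints AGT
```

In this toy input the two low-confidence events are dropped because no chain with acceptable spacing passes through them. A confidence-based post-filter of the kind the abstract mentions could then be applied to the returned events, for example by discarding calls whose `score` falls below a chosen threshold.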