A software system for data analysis in automated DNA sequencing

Cited by: 47
Authors
Giddings, MC [1]
Severin, J [1]
Westphall, M [1]
Wu, JZ [1]
Smith, LM [1]
Affiliations
[1] Univ Wisconsin, Dept Chem, Madison, WI 53706 USA
DOI
10.1101/gr.8.6.644
Chinese Library Classification (CLC)
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification codes
071010; 081704
Abstract
Software for gel image analysis and base-calling in fluorescence-based sequencing, consisting of two primary programs, BaseFinder and GelImager, is described. BaseFinder is a framework for trace processing, analysis, and base-calling. It is highly extensible, allowing the addition of trace analysis and processing modules without recompilation. Powerful scripting capabilities combined with modularity and multilane handling allow the user to customize BaseFinder to virtually any type of trace processing. We have developed an extensive set of data processing and analysis modules for use with the program in fluorescence-based sequencing. GelImager is a framework for gel image manipulation. It can be used for gel visualization, lane retracking, and as a front end to the Washington University Getlanes program. The programs were designed using a cross-platform development environment, currently allowing them to run in Windows NT, Windows 95, Openstep/Mach, and Rhapsody. Work is ongoing to deploy the software on additional platforms, including Solaris, Linux, and MacOS. This software has been thoroughly tested and debugged in the analysis of >2 million bp of raw sequence data from human chromosome 19 region q13. Overall sequencing accuracy was measured using a significant subset of these data, consisting of approximately 600 sequences, by comparing the individual shotgun sequences against the final assembled contigs. Also, results are reported from experiments that analyzed the accuracy of the software and two other well-known base-calling programs for sequencing the M13mp18 vector sequence.
Pages: 644-665
Page count: 22