ZTR: a new format for DNA sequence trace data

Cited by: 16
Authors
Bonfield, JK [1]
Staden, R [1]
Affiliations
[1] MRC, Mol Biol Lab, Cambridge CB2 2QH, England
Funding
UK Medical Research Council
Keywords
DOI
10.1093/bioinformatics/18.1.3
Chinese Library Classification
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
Motivation: To produce an open and extensible file format for DNA trace data which produces compact files suitable for large-scale storage and efficient use of internet bandwidth. Results: We have created an extensible format named ZTR. For a set of data taken from an ABI-3700 the ZTR format produces trace files which require 61.6% of the disk space used by gzipped SCFv3, and which can be written and read at greater speed. The compression algorithms used for the trace amplitudes are used within the National Center for Biotechnology Information (NCBI) trace archive.
Pages: 3-10
Page count: 8
Related papers
17 in total
  • [1] [Anonymous], COMP METHODS MOL BIO
  • [2] THE APPLICATION OF NUMERICAL ESTIMATES OF BASE CALLING ACCURACY TO DNA-SEQUENCING PROJECTS
    BONFIELD, JK
    STADEN, R
    [J]. NUCLEIC ACIDS RESEARCH, 1995, 23 (08) : 1406 - 1410
  • [3] Trev: a DNA trace editor and viewer
    Bonfield, JK
    Beal, KF
    Betts, MJ
    Staden, R
    [J]. BIOINFORMATICS, 2002, 18 (01) : 194 - 195
  • [4] Boutell T, 1997, RFC, DOI 10.17487/RFC2083
  • [5] Burrows M., 1994, BLOCK SORTING LOSSLE, DOI 10.1.1.37.6774
  • [6] Unbounded length contexts for PPM
    Cleary, JG
    Teahan, WJ
    [J]. COMPUTER JOURNAL, 1997, 40 (2-3) : 67 - 75
  • [7] CORNISHBOWDEN A, 1985, EUR J BIOCHEM, V150, P1
  • [8] Sequence assembly with CAFTOOLS
    Dear, S
    Durbin, R
    Hillier, L
    Marth, G
    Thierry-Mieg, J
    Mott, R
    [J]. GENOME RESEARCH, 1998, 8 (03) : 260 - 267
  • [9] Dear S, 1992, DNA Seq, V3, P107, DOI 10.3109/10425179209034003
  • [10] Deutsch P., 1996, 1950 RFC