PipeOnline 2.0: automated EST processing and functional data sorting

Cited by: 42
Authors
Ayoubi, P
Jin, XJ
Leite, S
Liu, XH
Martajaja, J
Abduraham, A
Wan, QL
Yan, W
Misawa, E
Prade, RA [1]
Affiliations
[1] Oklahoma State Univ, Dept Microbiol & Mol Genet, Stillwater, OK 74078 USA
[2] Oklahoma State Univ, Sch Mech & Aerosp Engn, Stillwater, OK 74078 USA
Funding
U.S. National Science Foundation
DOI
10.1093/nar/gkf585
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology]
Subject classification codes
071010; 081704
Abstract
Expressed sequence tags (ESTs) are generated and deposited in the public domain, as redundant, un-annotated, single-pass reactions, with virtually no biological content. PipeOnline automatically analyses and transforms large collections of raw DNA-sequence data from chromatograms or FASTA files by calling the quality of bases, screening and removing vector sequences, assembling and rewriting consensus sequences of redundant input files into a unigene EST data set and finally through translation, amino acid sequence similarity searches, annotation of public databases and functional data. PipeOnline generates an annotated database, retaining the processed unigene sequence, clone/file history, alignments with similar sequences, and proposed functional classification, if available. Functional annotation is automatic and based on a novel method that relies on homology of amino acid sequence multiplicity within GenBank records. Records are examined through a function ordered browser or keyword queries with automated export of results. PipeOnline offers customization for individual projects (MyPipeOnline), automated updating and alert service. PipeOnline is available at http://stress-genomics.org.
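The abstract mentions a translation step that precedes the amino acid similarity searches. As a rough illustration only, the sketch below translates a nucleotide sequence in all six reading frames using the standard genetic code. It assumes a BLASTX-style six-frame translation, which the abstract does not confirm; the function names are hypothetical and are not part of PipeOnline.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G
# with the third base varying fastest.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate from position 0; codons with unknown bases become 'X'."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X")
        for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translations(seq):
    """Hypothetical helper: map frame labels (+1..+3, -1..-3) to peptides."""
    seq = seq.upper()
    frames = {}
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(strand_seq[offset:])
    return frames


if __name__ == "__main__":
    for frame, peptide in six_frame_translations("ATGGCCTAA").items():
        print(frame, peptide)
```

For the toy input `ATGGCCTAA`, frame `+1` yields `MA*` (Met, Ala, stop). A real pipeline would pass such translations, or the nucleotide sequence itself, to a dedicated similarity search tool rather than rely on a hand-written translator.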
Pages: 4761-4769
Page count: 9