PowerBLAST: A new network BLAST application for interactive or automated sequence analysis and annotation

Cited by: 271
Authors
Zhang, JH
Madden, TL
Affiliation
[1] National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda
Source
GENOME RESEARCH | 1997 / Vol. 7 / No. 06
Keywords
DOI
10.1101/gr.7.6.649
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010 ; 081704 ;
Abstract
As the rate of DNA sequencing increases, analysis by sequence similarity search will need to become much more efficient in terms of sensitivity, specificity, automation potential, and consistency in annotation. PowerBLAST was developed, in part, to address these problems. PowerBLAST includes a number of options for masking repetitive elements and low-complexity subsequences. It also has the capacity to restrict the search to any level of NCBI's taxonomy index, thus supporting "comparative genomics" applications. Postprocessing of the BLAST output using the SIM series of algorithms produces optimal, gapped alignments, and multiple alignments when a region of the query sequence matches multiple database sequences. PowerBLAST is capable of processing sequences of any length because it divides long query sequences into overlapping fragments and then merges the results after searching. The results may be viewed graphically, as a textual representation, or as an HTML page with links to GenBank and Entrez. For matching database sequences, annotated features are superimposed on the aligned query sequence in the output, thus greatly increasing the ease of interpretation. Such features may be used for automated annotation of new sequence because PowerBLAST output in ASN.1 form may be "dragged and dropped" into NCBI's Sequin program for sequence annotation and submission. PowerBLAST is capable of analyzing and annotating a 100-kb query in 60 min on NCBI's BLAST server.
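The fragment-and-merge step described in the abstract can be illustrated with a minimal Python sketch. This is not PowerBLAST's code. The function names (`split_query`, `to_query_coords`, `merge_hits`), the fragment length, and the overlap are all hypothetical, and merging is simplified here to taking the union of hit intervals on the query, whereas the abstract says only that results are merged after searching.

```python
from typing import Iterator, List, Tuple

Interval = Tuple[int, int]  # half-open [start, end) in query coordinates


def split_query(seq: str, frag_len: int, overlap: int) -> Iterator[Tuple[int, str]]:
    """Yield (offset, fragment) pairs that cover seq with overlapping windows."""
    if frag_len <= overlap:
        raise ValueError("frag_len must be larger than overlap")
    if not seq:
        return
    step = frag_len - overlap
    start = 0
    while True:
        yield start, seq[start:start + frag_len]
        if start + frag_len >= len(seq):
            break  # the last window reached the end of the query
        start += step


def to_query_coords(offset: int, local_hits: List[Interval]) -> List[Interval]:
    """Shift fragment-local hit intervals back into full-query coordinates."""
    return [(offset + s, offset + e) for s, e in local_hits]


def merge_hits(hits: List[Interval]) -> List[Interval]:
    """Merge intervals that overlap or touch (assumed merging rule)."""
    merged: List[Interval] = []
    for start, end in sorted(hits):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


if __name__ == "__main__":
    fragments = list(split_query("ACGTACGTAC", frag_len=4, overlap=2))
    print(fragments)
    print(merge_hits([(0, 3), (2, 5), (7, 9)]))
```

The overlap exists so that a match spanning a fragment boundary still appears in full within at least one fragment, provided the match is no longer than the overlap. Hits reported twice in the overlapping region collapse into one interval during merging.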
Pages: 649-656
Page count: 8