BLAST+: architecture and applications

Cited by: 13975
Authors
Camacho, Christiam [1]
Coulouris, George [1]
Avagyan, Vahram [1]
Ma, Ning [1]
Papadopoulos, Jason [1]
Bealer, Kevin [1]
Madden, Thomas L. [1]
Affiliations
[1] Natl Lib Med, Natl Ctr Biotechnol Informat, NIH, Bethesda, MD 20894 USA
Source
BMC BIOINFORMATICS | 2009 / Vol. 10
Keywords
PSI-BLAST; PROTEIN-SEQUENCE; DNA-SEQUENCES; SEARCHES; TOOL;
DOI
10.1186/1471-2105-10-421
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Background: Sequence similarity searching is a very important bioinformatics task. While Basic Local Alignment Search Tool (BLAST) outperforms exact methods through its use of heuristics, the speed of the current BLAST software is suboptimal for very long queries or database sequences. There are also some shortcomings in the user interface of the current command-line applications.
Results: We describe features and improvements of rewritten BLAST software and introduce new command-line applications. Long query sequences are broken into chunks for processing, in some cases leading to dramatically shorter run times. For long database sequences, it is possible to retrieve only the relevant parts of the sequence, reducing CPU time and memory usage for searches of short queries against databases of contigs or chromosomes. The program can now retrieve masking information for database sequences from the BLAST databases. A new modular software library can now access subject sequence data from arbitrary data sources. We introduce several new features, including strategy files that allow a user to save and reuse their favorite set of options. The strategy files can be uploaded to and downloaded from the NCBI BLAST web site.
Conclusion: The new BLAST command-line applications, compared to the current BLAST tools, demonstrate substantial speed improvements for long queries as well as chromosome-length database sequences. We have also improved the user interface of the command-line applications.
Pages: 9