NBLAST: a cluster variant of BLAST for NxN comparisons

Cited by: 16
Authors
Dumontier, M
Hogue, CWV [1]
Affiliations
[1] Univ Toronto, Dept Biochem, Toronto, ON M5S 1A8, Canada
[2] Mt Sinai Hosp, Samuel Lunenfeld Res Inst, Toronto, ON M5G 1X5, Canada
DOI
10.1186/1471-2105-3-13
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Subject classification code
071010; 081704
Abstract
Background: The BLAST algorithm compares biological sequences to one another in order to determine shared motifs and common ancestry. However, comparing all non-redundant (NR) sequences against all other NR sequences is a computationally intensive task. We developed NBLAST as a cluster computer implementation of the BLAST family of sequence comparison programs for the purpose of generating pre-computed BLAST alignments and neighbour lists of NR sequences. Results: NBLAST performs the heuristic BLAST algorithm and generates an exhaustive database of alignments, but it computes only N(N - 1)/2 alignments (i.e. the upper triangle) of a possible N^2 alignments, where N is the number of sequences to be compared. A task-partitioning algorithm distributes the work across all cluster nodes, and the NBLAST master process produces a BLAST sequence alignment database and a list of sequence neighbours for each sequence record. The resulting sequence alignment and neighbour databases are used to serve the SeqHound query system through a C/C++ and Perl Application Programming Interface (API). Conclusions: NBLAST offers a local alternative to the NCBI's remote Entrez system for precomputed BLAST alignments and neighbour queries. On our 216-processor 450 MHz PIII cluster, NBLAST requires ~24 hrs to compute neighbours for the 850,000 proteins currently in the non-redundant protein database.
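The upper-triangle count and the idea of partitioning it across nodes can be illustrated with a minimal Python sketch. The abstract does not describe NBLAST's actual task-partitioning algorithm, so the greedy row-block scheme below, and the names `pair_count`, `partition_rows` and `pairs_for`, are hypothetical illustrations only.

```python
from itertools import combinations


def pair_count(n: int) -> int:
    """Number of unordered pairs (i, j), i < j, among n sequences: N(N - 1)/2."""
    return n * (n - 1) // 2


def partition_rows(n: int, workers: int) -> list[range]:
    """Split query rows 0..n-1 into at most `workers` contiguous blocks.

    Row i of the upper triangle holds n - 1 - i comparisons, so early rows
    are heavier. Blocks are cut greedily once a block reaches the per-worker
    target. This is an assumed scheme, not the one used by NBLAST.
    """
    target = pair_count(n) / workers
    blocks, start, load = [], 0, 0
    for i in range(n):
        load += n - 1 - i
        if load >= target and len(blocks) < workers - 1:
            blocks.append(range(start, i + 1))
            start, load = i + 1, 0
    blocks.append(range(start, n))  # the last block takes the remaining rows
    return blocks


def pairs_for(block: range, n: int):
    """Yield the (query, subject) index pairs one worker would compare."""
    for i in block:
        for j in range(i + 1, n):
            yield (i, j)


if __name__ == "__main__":
    n, workers = 10, 3
    blocks = partition_rows(n, workers)
    covered = sorted(p for b in blocks for p in pairs_for(b, n))
    # Every pair in the upper triangle is covered exactly once.
    assert covered == list(combinations(range(n), 2))
    print(pair_count(850_000))
```

For N = 850,000 the upper triangle contains 361,249,575,000 pairs, roughly half of the N^2 = 722,500,000,000 ordered comparisons.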
Pages: 4