Assessment of the parallelization approach of d2_cluster for high-performance sequence clustering

Cited by: 12
Authors
Carpenter, JE
Christoffels, A
Weinbach, Y
Hide, WA [1]
Affiliations
[1] Univ Western Cape, S African Natl Bioinformat Inst, Cape Town, South Africa
[2] SGI, Eagan, MN 55121 USA
[3] Biomedicom Ltd, IL-91487 Jerusalem, Israel
Keywords
d2_cluster; sequence clustering; parallel algorithms; EST;
DOI
10.1002/jcc.10025
Chinese Library Classification number
O6 [Chemistry];
Discipline classification code
0703;
Abstract
The exponential increase in expressed sequence tag (EST) sequence data amplifies the computational cost of clustering sequences such that new algorithms are required to analyze data at a greater rate. We have parallelized d2-cluster on an SGI Origin 2000 multiprocessor and observed a speedup of approximately 100X on 126 processors when processing a 15,876 EST dataset. The parallelized d2-cluster code is obtainable from the SANBI website (http://www.sanbi.ac.za/CODES).
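To put the reported figures in context, the short sketch below derives the parallel efficiency implied by a 100X speedup on 126 processors. It also estimates the serial fraction that Amdahl's law would require to produce that speedup. Only the two numbers come from the abstract. The function names are hypothetical, and treating the run as an ideal Amdahl workload is an assumption made for illustration, not a claim about how the authors analyzed their results.

```python
# Illustrative sketch: the inputs (100X speedup, 126 processors) are taken
# from the abstract; modelling the run with Amdahl's law is an assumption.


def parallel_efficiency(speedup: float, processors: int) -> float:
    """Fraction of ideal linear speedup actually achieved."""
    return speedup / processors


def amdahl_speedup(serial_fraction: float, processors: int) -> float:
    """Amdahl's law: S = 1 / (f + (1 - f) / N)."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / processors)


def amdahl_serial_fraction(speedup: float, processors: int) -> float:
    """Invert Amdahl's law for f: f = (1/S - 1/N) / (1 - 1/N)."""
    return (1.0 / speedup - 1.0 / processors) / (1.0 - 1.0 / processors)


REPORTED_SPEEDUP = 100.0   # "approximately 100X" in the abstract
PROCESSORS = 126           # processor count in the abstract

efficiency = parallel_efficiency(REPORTED_SPEEDUP, PROCESSORS)
serial_fraction = amdahl_serial_fraction(REPORTED_SPEEDUP, PROCESSORS)

print(f"parallel efficiency: {efficiency:.3f}")
print(f"implied serial fraction (Amdahl): {serial_fraction:.5f}")
```

Under these assumptions the efficiency is about 0.79, and the implied serial fraction is about 0.002, that is, roughly 0.2% of the single-processor run time would be non-parallelizable work.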
Pages: 755-757
Page count: 3
Related papers
9 in total
[1]  
Amdahl G., 1967, AFIPS C P, V30, P483, DOI 10.1145/1465482.1465560
[2]   GenBank [J].
Benson, DA ;
Karsch-Mizrachi, I ;
Lipman, DJ ;
Ostell, J ;
Rapp, BA ;
Wheeler, DL .
NUCLEIC ACIDS RESEARCH, 2000, 28 (01) :15-18
[3]   d2_cluster: A validated method for clustering EST and full-length cDNA sequences [J].
Burke, J ;
Davison, D ;
Hide, W .
GENOME RESEARCH, 1999, 9 (11) :1135-1142
[4]   STACK: Sequence Tag Alignment and Consensus Knowledgebase [J].
Christoffels, A ;
van Gelder, A ;
Greyling, G ;
Miller, R ;
Hide, T ;
Hide, W .
NUCLEIC ACIDS RESEARCH, 2001, 29 (01) :234-238
[5]  
CHRISTOFFELS A, 2001, THESIS U W CAPE S AF
[6]  
Hide W, 1994, J Comput Biol, V1, P199, DOI 10.1089/cmb.1994.1.199
[7]   A comprehensive approach to clustering of expressed human gene sequence: The sequence tag alignment and consensus knowledge base [J].
Miller, RT ;
Christoffels, AG ;
Gopalakrishnan, C ;
Burke, J ;
Ptitsyn, AA ;
Broveak, TR ;
Hide, WA .
GENOME RESEARCH, 1999, 9 (11) :1143-1155
[8]   Chasing the dream: plant EST microarrays [J].
Richmond, T ;
Somerville, S .
CURRENT OPINION IN PLANT BIOLOGY, 2000, 3 (02) :108-116
[9]  
TORNEY DC, 1990, SFI S SCI C, V7, P109