Parallel pattern identification in biological sequences on clusters

被引:2
作者
Huang, CH [1 ]
Rajasekaran, S [1 ]
机构
[1] Univ Connecticut, Dept Comp Sci & Engn, Storrs, CT 06269 USA
关键词
cluster computing; parallel algorithm; tandem repeats;
D O I
10.1109/TNB.2003.810165
中图分类号
Q5 [生物化学];
学科分类号
071010 ; 081704 ;
摘要
Tandem repeats are ubiquitous sequence features in both prokaryotic and eukaryotic genomes. They are known to cause several inherited neurological diseases in humans. Identifying these patterns is a highly computation-intensive process. Previous parallel implementations use straightforward domain decomposition based on existing sequential algorithms and rely on parallel machines with low-latency interconnection network and fast hardware support for processor synchronization. Our research exploits the superior cost effectiveness and flexibility achieved through low-cost clusters to speed up biological computations by designing communication-efficient parallel algorithms for pattern identification. This paper presents a low communication-overhead parallel algorithm for pattern identification in biological sequences. Given a biological sequence of length n and a pattern of length m, we conclude an algorithm with five computation/communication phases, each requiring O(n) computation time and only O(p) message units. The low communication overhead of the algorithm is essential in achieving reasonable speedups on clusters, where the inter-processor communication latency is usually higher.
引用
收藏
页码:29 / 34
页数:6
相关论文
共 15 条
[1]   Tandem repeats finder: a program to analyze DNA sequences [J].
Benson, G .
NUCLEIC ACIDS RESEARCH, 1999, 27 (02) :573-580
[2]   OPTIMAL DOUBLY LOGARITHMIC PARALLEL ALGORITHMS BASED ON FINDING ALL NEAREST SMALLER VALUES [J].
BERKMAN, O ;
SCHIEBER, B ;
VISHKIN, U .
JOURNAL OF ALGORITHMS, 1993, 14 (03) :344-370
[3]  
BRESLAUER D, 1990, SIAM J COMPUT, V19, P1050
[4]  
Dehne F, 1993, P ACM 9 ANN COMP GEO, P298
[5]  
FOSTER MJ, 1980, COMPUTER, V13, P26, DOI 10.1109/MC.1980.1653338
[6]   Communication efficient BSP algorithm all nearest smaller values problem [J].
He, X ;
Huang, CH .
JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 2001, 61 (10) :1425-1438
[7]   A VERSATILE DATA STRING-SEARCH VLSI [J].
HIRATA, M ;
YAMADA, H ;
NAGAI, H ;
TAKAHASHI, K .
IEEE JOURNAL OF SOLID-STATE CIRCUITS, 1988, 23 (02) :329-335
[8]  
JALA J, 1992, INTRO PARALLEL ALGOR
[9]   Incremental string comparison [J].
Landau, GM ;
Myers, EW ;
Schmidt, JP .
SIAM JOURNAL ON COMPUTING, 1998, 27 (02) :557-582
[10]   An algorithm for approximate tandem repeats [J].
Landau, GM ;
Schmidt, JP ;
Sokol, D .
JOURNAL OF COMPUTATIONAL BIOLOGY, 2001, 8 (01) :1-18