Parallel pairwise statistical significance estimation of local sequence alignment using Message Passing Interface library

Cited by: 3
Authors
Agrawal, Ankit [1]
Misra, Sanchit [1]
Honbo, Daniel [1]
Choudhary, Alok [1]
Affiliations
[1] Northwestern Univ, Dept Elect Engn & Comp Sci, Evanston, IL 60208 USA
Keywords
homologs; MPI; pairwise statistical significance; non-conservative pairwise statistical significance; parallel computing; sequence alignment; sequence-specific substitution matrix; position-specific substitution matrix; MULTIPLE PARAMETER SETS; DATABASE SEARCHES; PSI-BLAST; SIMILARITY; ACCURACY
DOI
10.1002/cpe.1798
Chinese Library Classification number
TP31 [Computer software]
Discipline classification codes
081202; 0835
Abstract
Homology detection is a fundamental step in sequence analysis. In recent years, pairwise statistical significance has emerged as a promising alternative to database statistical significance for homology detection. Although more accurate, it is currently very time consuming because it involves generating tens of hundreds of alignment scores to construct the empirical score distribution. This paper presents a parallel algorithm for pairwise statistical significance estimation, called MPIPairwiseStatSig, implemented in C using the MPI library. We further apply the parallelization technique to estimate non-conservative pairwise statistical significance using standard, sequence-specific, and position-specific substitution matrices, which has previously demonstrated better sequence comparison accuracy than the original pairwise statistical significance. Distributing the most compute-intensive portions of the pairwise statistical significance estimation procedure across multiple processors has been shown to result in near-linear speed-ups for the application. The MPIPairwiseStatSig program for pairwise statistical significance estimation is available for free academic use at www.cs.iastate.edu/~ankitag/MPIPairwiseStatSig.html. Copyright (C) 2011 John Wiley & Sons, Ltd.
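The abstract describes spreading the generation of many alignment scores across processors. As a rough illustration only, and not the authors' MPIPairwiseStatSig code, the sketch below shows one common way to split a fixed number of independent score computations evenly among processes. The type `shuffle_range` and the function `shuffles_for_rank` are hypothetical names introduced here; in an MPI program the `rank` and `num_ranks` arguments would typically come from `MPI_Comm_rank` and `MPI_Comm_size`, and the assumption that each score is computed from an independent shuffle is ours.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not taken from MPIPairwiseStatSig:
   half-open range [begin, end) of score indices handled by one process. */
typedef struct {
    size_t begin;
    size_t end;
} shuffle_range;

/* Block-distribute num_shuffles independent tasks over num_ranks processes.
   The first (num_shuffles % num_ranks) ranks each take one extra task, so
   the per-rank load differs by at most one. */
shuffle_range shuffles_for_rank(size_t num_shuffles, int rank, int num_ranks)
{
    assert(num_ranks > 0 && rank >= 0 && rank < num_ranks);

    size_t base = num_shuffles / (size_t)num_ranks;   /* tasks every rank gets */
    size_t extra = num_shuffles % (size_t)num_ranks;  /* leftover tasks */
    size_t r = (size_t)rank;

    shuffle_range range;
    range.begin = r * base + (r < extra ? r : extra);
    range.end = range.begin + base + (r < extra ? 1 : 0);
    return range;
}
```

For example, `shuffles_for_rank(1000, 1, 3)` assigns indices 334 through 666 to the second of three processes. Each process would then compute only its own scores, and the partial results could be combined on one process (for instance with `MPI_Gatherv`) before fitting the score distribution.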
Pages: 2269-2279
Number of pages: 11