The efficient computation of position-specific match scores with the fast Fourier transform

被引:21
作者
Rajasekaran, S
Jin, X
Spouge, JL [1 ]
机构
[1] Natl Lib Med, Natl Ctr Biotechnol Informat, Bethesda, MD 20894 USA
[2] Univ Florida, Dept Comp & Informat Sci & Engn, Gainesville, FL 32611 USA
关键词
position-specific scoring matrices; fast Fourier transform; algorithm;
D O I
10.1089/10665270252833172
中图分类号
Q5 [生物化学];
学科分类号
071010 ; 081704 ;
摘要
Historically, in computational biology the fast Fourier transform (FFT) has been used almost exclusively to count the number of exact letter matches between two biosequences. This paper presents an FFT algorithm that can compute the match score of a sequence against a position-specific scoring matrix (PSSM). Our algorithm finds the PSSM score simultaneously over all offsets of the PSSM with the sequence, although like all previous FFT algorithms, it still disallows gaps. Although our algorithm is presented in the context of global matching, it can be adapted to local matching without gaps. As a benchmark, our PSSM-modified FFT algorithm computed pairwise match scores. In timing experiments, our most efficient FFT implementation for pairwise scoring appeared to be 10 to 26 times faster than a traditional FFT implementation, with only a factor of 2 in the acceleration attributable to a previously known compression scheme. Many import-ant algorithms for detecting biosequence similarities, e.g., gapped BLAST or PSIBLAST, have a heuristic screening phase that disallows gaps. This paper demonstrates that FFT algorithms merit reconsideration in these screening applications.
引用
收藏
页码:23 / 33
页数:11
相关论文
共 33 条
  • [1] Altschul S, 1999, SCIENTIST, V13, P15
  • [2] AMINO-ACID SUBSTITUTION MATRICES FROM AN INFORMATION THEORETIC PERSPECTIVE
    ALTSCHUL, SF
    [J]. JOURNAL OF MOLECULAR BIOLOGY, 1991, 219 (03) : 555 - 565
  • [3] ISSUES IN SEARCHING MOLECULAR SEQUENCE DATABASES
    ALTSCHUL, SF
    BOGUSKI, MS
    GISH, W
    WOOTTON, JC
    [J]. NATURE GENETICS, 1994, 6 (02) : 119 - 129
  • [4] Gapped BLAST and PSI-BLAST: a new generation of protein database search programs
    Altschul, SF
    Madden, TL
    Schaffer, AA
    Zhang, JH
    Zhang, Z
    Miller, W
    Lipman, DJ
    [J]. NUCLEIC ACIDS RESEARCH, 1997, 25 (17) : 3389 - 3402
  • [5] A PROTEIN ALIGNMENT SCORING SYSTEM SENSITIVE AT ALL EVOLUTIONARY DISTANCES
    ALTSCHUL, SF
    [J]. JOURNAL OF MOLECULAR EVOLUTION, 1993, 36 (03) : 290 - 300
  • [6] Iterated profile searches with PSI-BLAST - a tool for discovery in protein databases
    Altschul, SF
    Koonin, EV
    [J]. TRENDS IN BIOCHEMICAL SCIENCES, 1998, 23 (11) : 444 - 447
  • [7] [Anonymous], MATH METHODS DNA SEQ
  • [8] [Anonymous], 1978, Atlas of protein sequence and structure
  • [9] Barrett C, 1997, COMPUT APPL BIOSCI, V13, P191
  • [10] SEARCH OF HIDDEN PERIODICITIES IN DNA-SEQUENCES
    CHECHETKIN, VR
    TURYGIN, AY
    [J]. JOURNAL OF THEORETICAL BIOLOGY, 1995, 175 (04) : 477 - 494