PatternHunter: faster and more sensitive homology search

Cited by: 521
Authors
Ma, B [1]
Tromp, J
Li, M
Affiliations
[1] Univ Western Ontario, Dept Comp Sci, London, ON N6A 5B8, Canada
[2] Bioinformat Solut Inc, Waterloo, ON N2L 3L2, Canada
[3] Univ Waterloo, Dept Comp Sci, Waterloo, ON N2L 3G1, Canada
[4] Univ Calif Santa Barbara, Dept Comp Sci, Bioinformat Lab, Santa Barbara, CA 93106 USA
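Funding
Natural Sciences and Engineering Research Council of Canada
DOI
10.1093/bioinformatics/18.3.440
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract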
Motivation: Genomics and proteomics studies routinely depend on homology searches based on the strategy of finding short seed matches which are then extended. The exploding genomic data growth presents a dilemma for DNA homology search techniques: increasing seed size decreases sensitivity whereas decreasing seed size slows down computation.
Results: We present a new homology search algorithm 'PatternHunter' that uses a novel seed model for increased sensitivity and new hit-processing techniques for significantly increased speed. At Blast levels of sensitivity, PatternHunter is able to find homologies between sequences as large as human chromosomes, in mere hours on a desktop.
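The seed-size dilemma described in the abstract can be illustrated with a minimal sketch. It is not PatternHunter's seed model or hit processing, which the abstract does not detail; it only shows the generic seed-and-extend idea with contiguous exact-match seeds. The function names `seed_hits` and `extend_ungapped`, the toy sequences, and the exact-match extension rule are all assumptions made for illustration.

```python
from collections import defaultdict


def seed_hits(query, target, k):
    """Return (query_pos, target_pos) pairs where a length-k substring matches exactly.

    Hypothetical helper: contiguous seeds only, hash-indexed on the target.
    """
    index = defaultdict(list)
    for j in range(len(target) - k + 1):
        index[target[j:j + k]].append(j)

    hits = []
    for i in range(len(query) - k + 1):
        for j in index.get(query[i:i + k], ()):
            hits.append((i, j))
    return hits


def extend_ungapped(query, target, i, j, k):
    """Extend a seed hit in both directions while characters match exactly.

    Returns (query_start, target_start, length). Real tools score
    mismatches and gaps instead of stopping at the first mismatch;
    that simplification is an assumption of this sketch.
    """
    q_start, t_start = i, j
    while q_start > 0 and t_start > 0 and query[q_start - 1] == target[t_start - 1]:
        q_start -= 1
        t_start -= 1

    q_end, t_end = i + k, j + k
    while q_end < len(query) and t_end < len(target) and query[q_end] == target[t_end]:
        q_end += 1
        t_end += 1

    return q_start, t_start, q_end - q_start


# Toy sequences: the target contains the query with one substitution.
QUERY = "ACGTTGCA"
TARGET = "TTACGTAGCATT"

if __name__ == "__main__":
    for k in (3, 4, 5):
        print(k, seed_hits(QUERY, TARGET, k))
```

On these toy sequences, a seed size of 3 yields three hits, a seed size of 4 yields one, and a seed size of 5 yields none: the larger seed misses the similar region entirely, while the smaller seed produces more hits that each have to be processed.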
Pages: 440-445
Page count: 6