CRISPR Recognition Tool (CRT): a tool for automatic detection of clustered regularly interspaced palindromic repeats

Cited by: 693
Authors
Bland, Charles [1]
Ramsey, Teresa L.
Sabree, Fareedah
Lowe, Micheal
Brown, Kyndall
Kyrpides, Nikos C.
Hugenholtz, Philip
Affiliations
[1] Jackson State Univ, Dept Comp Sci, Jackson, MS 39217 USA
[2] DOE Joint Genome Inst, Walnut Creek, CA 94598 USA
[3] Univ Nebraska, Dept Comp Sci & Engn, Lincoln, NE 68504 USA
Keywords
DOI
10.1186/1471-2105-8-209
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010; 081704;
Abstract
Background: Clustered Regularly Interspaced Palindromic Repeats (CRISPRs) are a novel type of direct repeat found in a wide range of bacteria and archaea. CRISPRs are beginning to attract attention because of their proposed mechanism, that is, defending their hosts against invading extrachromosomal elements such as viruses. Existing repeat detection tools do a poor job of identifying CRISPRs because unique spacer sequences separate the repeats. In this study, a new tool, CRT, is introduced that rapidly and accurately identifies CRISPRs in large DNA strings, such as genomes and metagenomes.

Results: CRT was compared to two CRISPR detection tools, Patscan and Pilercr. In terms of correctness, CRT was shown to be very reliable, demonstrating significant improvements over Patscan on the measures of precision, recall and quality. When compared to Pilercr, CRT showed improved performance for recall and quality. In terms of speed, CRT proved to be a huge improvement over Patscan. CRT and Pilercr were comparable in speed; however, CRT was faster for genomes containing large numbers of repeats.

Conclusion: In this paper, a new tool was introduced for the automatic detection of CRISPR elements. This tool, CRT, showed some important improvements over current techniques for CRISPR identification. CRT's approach to detecting repetitive sequences is straightforward. It uses a simple sequential scan of a DNA sequence and detects repeats directly, without any major conversion or preprocessing of the input. This leads to a program that is easy to describe and understand, yet it is very accurate, fast and memory efficient, being O(n) in space and O(nm/l) in time.
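The idea of a sequential scan that finds interspaced repeats directly can be illustrated with a minimal Python sketch. This is not CRT's published algorithm or code: the function name `find_spaced_repeats`, the parameters `k`, `min_period`, `max_period` and `min_copies`, and their default values are all assumptions chosen for illustration, and only exact k-mer matches are considered.

```python
def find_spaced_repeats(seq, k=8, min_period=20, max_period=60, min_copies=3):
    """Scan seq left to right for k-mers that recur at bounded intervals.

    Illustrative only: names and defaults are hypothetical, not CRT's.
    Returns a list of (kmer, [start positions]) tuples.
    """
    hits = []
    n = len(seq)
    i = 0
    while i + k <= n:
        kmer = seq[i:i + k]
        positions = [i]
        cur = i
        while True:
            # Look for the next exact copy whose start lies between
            # min_period and max_period bases after the current copy.
            lo = cur + min_period
            hi = min(cur + max_period + k, n)
            nxt = seq.find(kmer, lo, hi)
            if nxt == -1:
                break
            positions.append(nxt)
            cur = nxt
        if len(positions) >= min_copies:
            hits.append((kmer, positions))
            i = positions[-1] + k  # jump past the last copy found
        else:
            i += 1
    return hits


# Toy input: one 12-base repeat separated by three distinct 20-base spacers.
repeat = "GTTTCAATCCAC"
toy = repeat + "A" * 20 + repeat + "C" * 20 + repeat + "G" * 20 + repeat
print(find_spaced_repeats(toy))
```

On the toy input the scan reports the first eight bases of the repeat together with the start of each copy. A practical detector would also need to extend seed matches to full repeats, tolerate mismatches, and check that the spacers differ from one another, none of which this sketch attempts.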
Pages: 8