PhiSpy: a novel algorithm for finding prophages in bacterial genomes that combines similarity- and composition-based strategies

Cited by: 374
Authors
Akhter, Sajia [1]
Aziz, Ramy K. [2, 3]
Edwards, Robert A. [1, 2, 4]
Affiliations
[1] San Diego State Univ, Computat Sci Res Ctr, San Diego, CA 92182 USA
[2] San Diego State Univ, Dept Comp Sci, San Diego, CA 92182 USA
[3] Cairo Univ, Fac Pharm, Dept Microbiol & Immunol, Cairo 11562, Egypt
[4] Argonne Natl Lab, Math & Comp Sci Div, Argonne, IL 60439 USA
Funding
U.S. National Science Foundation
Keywords
STAPHYLOCOCCUS-AUREUS STRAIN; ESCHERICHIA-COLI; GENETIC ELEMENTS; SEQUENCE; IDENTIFICATION; EVOLUTION; CLASSIFICATION; VIRULENCE; PATHOGEN; INSIGHTS;
DOI
10.1093/nar/gks406
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification codes
071010; 081704
Abstract
Prophages are phages in lysogeny that are integrated into, and replicated as part of, the host bacterial genome. These mobile elements can have tremendous impact on their bacterial hosts' genomes and phenotypes, which may lead to strain emergence and diversification, increased virulence or antibiotic resistance. However, finding prophages in microbial genomes remains a problem with no definitive solution. The majority of existing tools rely on detecting genomic regions enriched in protein-coding genes with known phage homologs, which hinders the de novo discovery of phage regions. In this study, a weighted phage detection algorithm, PhiSpy, was developed based on seven distinctive characteristics of prophages, i.e. protein length, transcription strand directionality, customized AT and GC skew, the abundance of unique phage words, phage insertion points and the similarity of phage proteins. The first five characteristics are capable of identifying prophages without any sequence similarity with known phage genes. PhiSpy locates prophages by ranking genomic regions enriched in distinctive phage traits, which leads to the successful prediction of 94% of prophages in 50 complete bacterial genomes with a 6% false-negative rate and a 0.66% false-positive rate.
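To illustrate the composition-based side of the approach, the sketch below computes GC skew and AT skew over sliding windows of a nucleotide sequence. The abstract does not give the formula for PhiSpy's "customized" skew, so this uses the conventional definitions, (G - C)/(G + C) and (A - T)/(A + T), as an assumption. The function names `skew` and `windowed_skews` and the window and step parameters are hypothetical and are not taken from PhiSpy itself.

```python
from collections import Counter


def skew(seq: str, a: str, b: str) -> float:
    """Return (count_a - count_b) / (count_a + count_b) for bases a and b.

    Conventional skew definition (an assumption; not PhiSpy's customized skew).
    Returns 0.0 when neither base occurs in seq.
    """
    counts = Counter(seq.upper())
    total = counts[a] + counts[b]
    return (counts[a] - counts[b]) / total if total else 0.0


def windowed_skews(seq: str, window: int, step: int):
    """Yield (start, gc_skew, at_skew) for each full window along seq."""
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        yield start, skew(chunk, "G", "C"), skew(chunk, "A", "T")


if __name__ == "__main__":
    # Toy sequence: a G-rich half followed by a C-rich half.
    toy = "GGGGCATA" + "CCCCGTAT"
    for start, gc, at in windowed_skews(toy, window=8, step=8):
        print(start, round(gc, 3), round(at, 3))
```

In a real genome the window would span thousands of bases. A sharp change in skew between neighbouring windows is the kind of compositional signal that could be combined with the other characteristics listed in the abstract.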
Pages: 13