Genome-Scale Identification of Legionella pneumophila Effectors Using a Machine Learning Approach

Times cited: 196
Authors
Burstein, David [1]
Zusman, Tal [2]
Degtyar, Elena [2]
Viner, Ram [2]
Segal, Gil [2]
Pupko, Tal [1]
Affiliations
[1] Tel Aviv Univ, George S Wise Fac Life Sci, Dept Cell Res & Immunol, Ramat Aviv, Israel
[2] Tel Aviv Univ, George S Wise Fac Life Sci, Dept Mol Microbiol & Biotechnol, Ramat Aviv, Israel
Keywords
TERMINAL TRANSLOCATION SIGNAL; IV SECRETION; HOST-CELLS; PROTEINS; SYSTEM; REGION; REGULATOR; GENES
DOI
10.1371/journal.ppat.1000508
Chinese Library Classification number
Q93 [Microbiology]
Discipline classification codes
071005; 100705
Abstract
A large number of highly pathogenic bacteria utilize secretion systems to translocate effector proteins into host cells. Using these effectors, the bacteria subvert host cell processes during infection. Legionella pneumophila translocates effectors via the Icm/Dot type-IV secretion system and to date, approximately 100 effectors have been identified by various experimental and computational techniques. Effector identification is a critical first step towards the understanding of the pathogenesis system in L. pneumophila as well as in other bacterial pathogens. Here, we formulate the task of effector identification as a classification problem: each L. pneumophila open reading frame (ORF) was classified as either effector or not. We computationally defined a set of features that best distinguish effectors from non-effectors. These features cover a wide range of characteristics including taxonomical dispersion, regulatory data, genomic organization, similarity to eukaryotic proteomes and more. Machine learning algorithms utilizing these features were then applied to classify all the ORFs within the L. pneumophila genome. Using this approach we were able to predict and experimentally validate 40 new effectors, reaching a success rate of above 90%. Increasing the number of validated effectors to around 140, we were able to gain novel insights into their characteristics. Effectors were found to have low G+C content, supporting the hypothesis that a large number of effectors originate via horizontal gene transfer, probably from their protozoan host. In addition, effectors were found to cluster in specific genomic regions. Finally, we were able to provide a novel description of the C-terminal translocation signal required for effector translocation by the Icm/Dot secretion system. To conclude, we have discovered 40 novel L. pneumophila effectors, predicted over a hundred additional highly probable effectors, and shown the applicability of machine learning algorithms for the identification and characterization of bacterial pathogenesis determinants.
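The classification framing in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the abstract does not name the algorithms or give feature definitions, so the nearest-centroid rule, the two-feature layout (G+C fraction and a eukaryotic-similarity score), the function names, and all numeric values below are assumptions chosen purely for illustration.

```python
from statistics import mean


def gc_content(seq: str) -> float:
    """Return the fraction of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def centroid(rows):
    """Column-wise mean of a list of equal-length feature vectors."""
    return [mean(col) for col in zip(*rows)]


def classify(features, effector_centroid, other_centroid):
    """Label an ORF by whichever class centroid is closer (squared Euclidean).

    Features are used unscaled here; a real analysis would normalize them.
    """
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    if dist2(features, effector_centroid) < dist2(features, other_centroid):
        return "effector"
    return "non-effector"


# Hypothetical training vectors: [G+C fraction, eukaryotic-similarity score].
# These numbers are invented for the example, not taken from the paper.
known_effectors = [[0.35, 0.8], [0.36, 0.7]]
known_others = [[0.40, 0.1], [0.41, 0.2]]

eff_c = centroid(known_effectors)
oth_c = centroid(known_others)

# Two hypothetical unlabeled ORFs described by the same two features.
label_a = classify([0.34, 0.9], eff_c, oth_c)
label_b = classify([0.42, 0.05], eff_c, oth_c)
print(label_a, label_b)
```

The sketch keeps the two steps the abstract describes separate: computing per-ORF features (here only `gc_content` is derived from sequence) and applying a learned decision rule to every ORF. Any standard classifier could replace the centroid rule.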
Pages: 12