Classifying RNA-Binding Proteins Based on Electrostatic Properties

Cited by: 68
Authors
Shazman, Shula [1]
Mandel-Gutfreund, Yael [1]
Affiliation
[1] Technion Israel Inst Technol, Fac Biol, Haifa, Israel
DOI
10.1371/journal.pcbi.1000146
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Discipline classification codes
071010; 081704;
Abstract
Protein structure can provide new insight into the biological function of a protein and can enable the design of better experiments to learn its biological roles. Moreover, deciphering the interactions of a protein with other molecules can contribute to the understanding of the protein's function within cellular processes. In this study, we apply a machine learning approach to classifying RNA-binding proteins based on their three-dimensional structures. The method is based on characterizing unique properties of electrostatic patches on the protein surface. Using an ensemble of general protein features and specific properties extracted from the electrostatic patches, we have trained a support vector machine (SVM) to distinguish RNA-binding proteins from other positively charged proteins that do not bind nucleic acids. Specifically, the method was applied to proteins possessing the RNA recognition motif (RRM) and successfully distinguished RNA-binding proteins from RRM domains involved in protein-protein interactions. Overall, the method achieves 88% accuracy in classifying RNA-binding proteins, yet it cannot distinguish RNA-binding from DNA-binding proteins. Nevertheless, by applying a multiclass SVM approach, we were able to classify the RNA-binding proteins based on their RNA targets, specifically, whether they bind a ribosomal RNA (rRNA), a transfer RNA (tRNA), or a messenger RNA (mRNA). Finally, we present here an innovative approach that does not rely on sequence or structural homology and could be applied to identify novel RNA-binding proteins with unique folds and/or binding motifs.
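The multiclass classification step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the SVM kernel, software, or exact feature set, so this example assumes a linear SVM trained with Pegasos-style subgradient descent, combined one-vs-rest, and run on synthetic two-dimensional points that merely stand in for patch-derived features. The function names (`train_linear_svm`, `train_one_vs_rest`, `predict`, `make_synthetic`), the cluster centers, and all parameter values are hypothetical; only the three class labels come from the abstract.

```python
import random

def train_linear_svm(X, y, lam=0.01, epochs=50, seed=0):
    """Pegasos-style subgradient descent for a linear SVM; y values are -1 or +1."""
    rng = random.Random(seed)
    w = [0.0] * (len(X[0]) + 1)  # last weight acts as the bias term
    order = list(range(len(X)))
    t = 0
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)  # decaying step size
            xi = X[i] + [1.0]      # append constant feature for the bias
            margin = y[i] * sum(wj * xj for wj, xj in zip(w, xi))
            w = [(1 - eta * lam) * wj for wj in w]  # regularization shrink
            if margin < 1:  # hinge loss is active for this sample
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, xi)]
    return w

def score(w, x):
    """Signed decision value of a linear model for one sample."""
    return sum(wj * xj for wj, xj in zip(w, x + [1.0]))

def train_one_vs_rest(X, labels, classes):
    """One binary SVM per class: that class (+1) against all others (-1)."""
    return {c: train_linear_svm(X, [1 if l == c else -1 for l in labels])
            for c in classes}

def predict(models, x):
    """Pick the class whose binary model gives the highest decision value."""
    return max(models, key=lambda c: score(models[c], x))

def make_synthetic(n_per_class, seed):
    """Synthetic stand-in data: one Gaussian cluster per RNA target class."""
    rng = random.Random(seed)
    # Hypothetical cluster centers, chosen only so the classes are separable.
    centers = {"rRNA": (4.0, 0.0), "tRNA": (-2.0, 3.5), "mRNA": (-2.0, -3.5)}
    X, labels = [], []
    for name, (cx, cy) in centers.items():
        for _ in range(n_per_class):
            X.append([rng.gauss(cx, 0.5), rng.gauss(cy, 0.5)])
            labels.append(name)
    return X, labels

CLASSES = ["rRNA", "tRNA", "mRNA"]
X_train, y_train = make_synthetic(30, seed=1)
X_test, y_test = make_synthetic(20, seed=2)
models = train_one_vs_rest(X_train, y_train, CLASSES)
accuracy = sum(predict(models, x) == y for x, y in zip(X_test, y_test)) / len(y_test)
print(f"held-out accuracy on synthetic data: {accuracy:.2f}")
```

The binary case in the abstract (RNA-binding versus non-nucleic-acid-binding proteins) corresponds to a single call to `train_linear_svm` with labels of +1 and -1. The accuracy printed here reflects only the synthetic clusters and says nothing about the 88% figure reported for real protein structures.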
Pages: 14