Predicting zinc binding at the proteome level

Cited by: 90
Authors
Passerini, Andrea [1]
Andreini, Claudia
Menchetti, Sauro
Rosato, Antonio
Frasconi, Paolo
Affiliations
[1] Univ Florence, Machine Learning & Neural Networks Grp, Dipartimento Sistemi & Informat, Florence, Italy
[2] CERM, Florence, Italy
[3] Univ Florence, Dipartimento Chim, Florence, Italy
Source
BMC BIOINFORMATICS | 2007 / Volume 8
Keywords
STRUCTURAL GENOMICS; PROTEINS; METALLOPROTEINS; CYSTEINES; DATABASE; SEARCH
DOI
10.1186/1471-2105-8-39
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
Background: Metalloproteins are proteins capable of binding one or more metal ions, which may be required for their biological function, for regulation of their activities or for structural purposes. Metal-binding properties remain difficult to predict, and to investigate experimentally, at the whole-proteome level. Consequently, current knowledge about metalloproteins is only partial. Results: The present work reports on the development of a machine learning method for predicting the zinc-binding state of pairs of nearby amino acids, using predictors based on support vector machines. The predictor was trained on chains containing zinc-binding sites and on non-metalloproteins, which provide positive and negative examples respectively. Results based on strong non-redundancy tests show that (1) zinc-binding residues can be predicted and (2) modelling the correlation between the binding states of nearby residues significantly improves performance. The trained predictor was then applied to the human proteome. The results were in good agreement with the outcomes of previous, heavily manually curated efforts to identify human zinc-binding proteins. Some previously unreported zinc-binding sites were identified and further validated through structural modelling. The software implementing the predictor is freely available at: http://zincfinder.dsi.unifi.it. Conclusion: The proposed approach constitutes a highly automated tool for the identification of metalloproteins, which provides results of comparable quality to heavily manually refined predictions. The ability to model correlations between pairs of residues yields a significant improvement over standard 1D-based approaches. In addition, the method permits the identification of previously unreported metal sites, providing useful hints for experimentalists.
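The abstract describes predictions made on pairs of nearby residues rather than on single residues. As a rough illustration of what "pairs of nearby amino acids" could mean in practice, the sketch below enumerates candidate residue pairs in a sequence. It is not the authors' implementation: the candidate residue set (C, H, D, E), the maximum gap of 7, and the function name `candidate_pairs` are assumptions made for this example and are not stated in the abstract.

```python
from typing import List, Tuple

# Assumption: residue types treated as potential zinc ligands.
# The abstract does not list them; C, H, D, E is an illustrative choice.
CANDIDATE_RESIDUES = frozenset("CHDE")


def candidate_pairs(sequence: str, max_gap: int = 7) -> List[Tuple[int, int]]:
    """Return 0-based index pairs (i, j), i < j, of candidate residues
    separated by at most `max_gap` intervening residues.

    `max_gap` is a hypothetical parameter chosen for this sketch.
    """
    positions = [
        i for i, aa in enumerate(sequence.upper()) if aa in CANDIDATE_RESIDUES
    ]
    pairs: List[Tuple[int, int]] = []
    for k, i in enumerate(positions):
        for j in positions[k + 1:]:
            # Positions are sorted, so once the gap is too large we can stop.
            if j - i - 1 > max_gap:
                break
            pairs.append((i, j))
    return pairs


if __name__ == "__main__":
    # Toy sequence: two cysteines close together, one histidine far away.
    seq = "ACGGC" + "A" * 9 + "H"
    print(candidate_pairs(seq))
```

In a full pipeline each pair would then be encoded as a feature vector and scored by a classifier such as a support vector machine; that encoding is not specified in the abstract and is left out here.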
Pages: 13
Related papers
31 items in total
[1]   Gapped BLAST and PSI-BLAST: a new generation of protein database search programs [J].
Altschul, SF ;
Madden, TL ;
Schaffer, AA ;
Zhang, JH ;
Zhang, Z ;
Miller, W ;
Lipman, DJ .
NUCLEIC ACIDS RESEARCH, 1997, 25 (17) :3389-3402
[2]   A hint to search for metalloproteins in gene banks [J].
Andreini, C ;
Bertini, I ;
Rosato, A .
BIOINFORMATICS, 2004, 20 (09) :1373-1380
[3]   Counting the zinc-proteins encoded in the human genome [J].
Andreini, C ;
Banci, L ;
Bertini, I ;
Rosato, A .
JOURNAL OF PROTEOME RESEARCH, 2006, 5 (01) :196-201
[4]  
[Anonymous], 2000, ADV LARGE MARGIN CLA
[5]   The Protein Data Bank [J].
Berman, HM ;
Battistuz, T ;
Bhat, TN ;
Bluhm, WF ;
Bourne, PE ;
Burkhardt, K ;
Iype, L ;
Jain, S ;
Fagan, P ;
Marvin, J ;
Padilla, D ;
Ravichandran, V ;
Schneider, B ;
Thanki, N ;
Weissig, H ;
Westbrook, JD ;
Zardecki, C .
ACTA CRYSTALLOGRAPHICA SECTION D-STRUCTURAL BIOLOGY, 2002, 58 :899-907
[6]   Bioinorganic chemistry in the postgenomic era [J].
Bertini, I ;
Rosato, A .
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2003, 100 (07) :3601-3604
[7]  
Bertini I., 2001, HDB METALLOPROTEINS, DOI 10.1201/9781482270822
[8]   Structure of D-allose binding protein from Escherichia coli bound to D-allose at 1.8 Å resolution [J].
Chaudhuri, BN ;
Ko, J ;
Park, C ;
Jones, TA ;
Mowbray, SL .
JOURNAL OF MOLECULAR BIOLOGY, 1999, 286 (05) :1519-1531
[9]  
Cortes C., 1995, Machine Learning, V20, P273, DOI 10.1007/BF00994018; 10.1023/A:1022627411411
[10]   A high throughput method for the detection of metalloproteins on a microgram scale [J].
Högbom, M ;
Ericsson, UB ;
Lam, R ;
Bakali, MA ;
Kuznetsova, E ;
Nordlund, P ;
Zamble, DB .
MOLECULAR & CELLULAR PROTEOMICS, 2005, 4 (06) :827-834