Analysis and prediction of functionally important sites in proteins

Cited by: 35
Authors
Chakrabarti, Saikat [1]
Lanczycki, Christopher J. [1]
Affiliations
[1] National Library of Medicine, National Center for Biotechnology Information, NIH, Bethesda, MD 20894, USA
Keywords
functionally important sites; function prediction; active sites; metal binding sites; protein binding sites; ligand binding sites; evolutionary conservation; compositional pattern; GENE ONTOLOGY; SEQUENCE; EVOLUTION; RESIDUES; DATABASE; ALIGNMENTS; STRAIN; DOMAIN; COMMON; TOOL
DOI
10.1110/ps.062506407
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification codes
071010; 081704
Abstract
The rapidly increasing volume of sequence and structure information available for proteins poses the daunting task of determining their functional importance. Computational methods can prove to be very useful in understanding and characterizing the biochemical and evolutionary information contained in this wealth of data, particularly at functionally important sites. Therefore, we perform a detailed survey of compositional and evolutionary constraints at the molecular and biological function level for a large set of known functionally important sites extracted from a wide range of protein families. We compare the degree of conservation across different functional categories and provide detailed statistical insight to decipher the varying evolutionary constraints at functionally important sites. The compositional and evolutionary information at functionally important sites has been compiled into a library of functional templates. We developed a module that predicts functionally important columns (FIC) of an alignment based on the detection of a significant "template match score" to a library template. Our template match score measures an alignment column's similarity to a library template and combines a term explicitly representing a column's residue composition with various evolutionary conservation scores (information content and position-specific scoring matrix-derived statistics). Our benchmarking studies show good sensitivity/specificity for the prediction of functional sites and high accuracy in attributing correct molecular function type to the predicted sites. This prediction method is based on information derived from homologous sequences and no structural information is required. Therefore, this method could be extremely useful for large-scale functional annotation.
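The abstract describes a score that combines a column's residue composition with conservation measures such as information content, but it does not give the formula. The following is only an illustrative sketch of that general idea, not the authors' method: the function names, the uniform background distribution, the L1-based composition similarity, and the equal weighting are all assumptions made here for illustration, and the PSSM-derived terms are omitted.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def column_frequencies(column):
    """Residue frequencies for one alignment column; gaps and unknowns are ignored."""
    residues = [r for r in column.upper() if r in AMINO_ACIDS]
    if not residues:
        raise ValueError("column contains no standard residues")
    counts = Counter(residues)
    total = len(residues)
    return {aa: counts[aa] / total for aa in AMINO_ACIDS}


def information_content(freqs):
    """Information content in bits, assuming a uniform background (an assumption)."""
    entropy = -sum(p * math.log2(p) for p in freqs.values() if p > 0)
    return math.log2(len(AMINO_ACIDS)) - entropy


def composition_similarity(freqs, template):
    """1 minus half the L1 distance between column and template frequencies.

    Ranges from 0 (disjoint compositions) to 1 (identical compositions).
    This particular similarity measure is a hypothetical choice.
    """
    distance = sum(abs(freqs[aa] - template.get(aa, 0.0)) for aa in AMINO_ACIDS)
    return 1.0 - 0.5 * distance


def toy_match_score(column, template, weight=0.5):
    """Hypothetical weighted sum of composition similarity and normalized conservation."""
    freqs = column_frequencies(column)
    conservation = information_content(freqs) / math.log2(len(AMINO_ACIDS))
    return weight * composition_similarity(freqs, template) + (1 - weight) * conservation


# Example: score a histidine-rich column against a toy "metal binding" template.
metal_template = {"H": 0.6, "C": 0.2, "D": 0.1, "E": 0.1}
score = toy_match_score("HHHHCHDH-H", metal_template)
```

In such a scheme a column would be flagged as functionally important when its score against some library template exceeds a significance threshold; how that threshold is chosen is not stated in the abstract.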
Pages: 4-13
Page count: 10