Functional sites in protein families uncovered via an objective and automated graph theoretic approach

Cited by: 82
Authors
Wangikar, PP [1]
Tendulkar, AV
Ramya, S
Mali, DN
Sarawagi, S
Affiliations
[1] Indian Inst Technol, Dept Chem Engn, Bombay 400076, Maharashtra, India
[2] Indian Inst Technol, Kanwal Rekhi Sch Informat Technol, Bombay 400076, Maharashtra, India
Keywords
catalytic tetrad; active site; backtracking algorithm; branch and bound technique; protein structure
DOI
10.1016/S0022-2836(02)01384-0
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification codes
071010; 081704
Abstract
We report a method for detection of recurring side-chain patterns (DRESPAT) using an unbiased and automated graph theoretic approach. Each protein is first represented as a graph, and all structural patterns are listed as sub-graphs of it. The patterns from proteins are compared pairwise to detect those common to a protein pair on the basis of content and geometry criteria. Recurring patterns are then detected by an automated search algorithm applied to the all-against-all pairwise comparison data of the proteins. Intra-protein pattern comparison data are used to detect patterns that recur within a single protein. A method is proposed for empirical calculation of the statistical significance of a recurring pattern. The method was tested on 17 protein sets of varying size, composed of non-redundant representatives from SCOP superfamilies. Recurring patterns in serine proteases, cysteine proteases, lipases, cupredoxin, ferredoxin, ferritin, cytochrome c, aspartoyl proteases, peroxidases, phospholipase A2, endonuclease, SH3 domain, EF-hand and lectins show additional residues conserved in the vicinity of the known functional sites. On the basis of the recurring patterns in ferritin, EF-hand and lectins, we could separate proteins or domains that are structurally similar yet differ in their metal ion-binding characteristics. In addition, novel recurring patterns with potential structural or functional roles were observed in glutathione-S-transferase, phospholipase A2 and ferredoxin. The results are discussed in relation to the known functional sites in each family. Between 2,000 and 50,000 patterns were enumerated from each protein, with between 10 and 500 patterns detected as common to an evolutionarily related protein pair. Our results show that unbiased extraction of functional site patterns is not feasible from a single evolutionarily related protein pair but is feasible from protein sets comprising five or more proteins. The DRESPAT method does not require a user-defined size or location of the pattern and therefore has the potential to uncover new functional sites in protein families. (C) 2003 Elsevier Science Ltd. All rights reserved.
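The pairwise step described in the abstract (represent each protein as a graph, enumerate sub-graph patterns, then match patterns between two proteins on content and geometry) can be illustrated with a toy sketch. This is not the DRESPAT implementation: the function names, the 7 Å edge cutoff, the fixed pattern size of three, the 1 Å distance tolerance, and the example coordinates are all assumptions made for illustration, and the geometry check (comparing sorted inter-residue distances) is a deliberately crude stand-in for the paper's criteria. The recurring-pattern search across a whole protein set and the significance calculation are not covered.

```python
from itertools import combinations
from math import dist

# Hypothetical toy input: each residue is (residue type, side-chain centroid in Å).
protein_a = [("SER", (0.0, 0.0, 0.0)), ("HIS", (3.0, 0.0, 0.0)),
             ("ASP", (0.0, 4.0, 0.0)), ("GLY", (20.0, 20.0, 20.0))]
protein_b = [("ASP", (10.0, 14.2, 10.0)), ("HIS", (13.1, 10.0, 10.0)),
             ("SER", (10.0, 10.0, 10.0)), ("LEU", (30.0, 30.0, 30.0))]

def build_graph(residues, cutoff=7.0):
    """Link residues whose centroids lie within `cutoff` Å (assumed value)."""
    edges = {i: set() for i in range(len(residues))}
    for i, j in combinations(range(len(residues)), 2):
        if dist(residues[i][1], residues[j][1]) <= cutoff:
            edges[i].add(j)
            edges[j].add(i)
    return edges

def enumerate_patterns(residues, edges, size=3):
    """List fully connected sub-graphs of `size` nodes (brute force)."""
    return [nodes for nodes in combinations(range(len(residues)), size)
            if all(b in edges[a] for a, b in combinations(nodes, 2))]

def signature(residues, nodes):
    """Content = sorted residue types; geometry = sorted pairwise distances."""
    content = tuple(sorted(residues[n][0] for n in nodes))
    distances = sorted(dist(residues[a][1], residues[b][1])
                       for a, b in combinations(nodes, 2))
    return content, distances

def common_patterns(res_a, res_b, size=3, cutoff=7.0, tol=1.0):
    """Return index tuples of patterns that agree in content and geometry."""
    pats_a = enumerate_patterns(res_a, build_graph(res_a, cutoff), size)
    pats_b = enumerate_patterns(res_b, build_graph(res_b, cutoff), size)
    matches = []
    for pa in pats_a:
        content_a, dist_a = signature(res_a, pa)
        for pb in pats_b:
            content_b, dist_b = signature(res_b, pb)
            same_geometry = all(abs(x - y) <= tol
                                for x, y in zip(dist_a, dist_b))
            if content_a == content_b and same_geometry:
                matches.append((pa, pb))
    return matches

print(common_patterns(protein_a, protein_b))
```

In this toy input, the Ser/His/Asp triplet is the only fully connected triplet in each protein, so it is the only candidate pattern compared. Brute-force enumeration of this kind scales poorly, which is consistent with the abstract's keywords naming backtracking and branch and bound techniques for the actual search.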
Pages: 955-978
Number of pages: 24