Efficient identification of side-chain patterns using a multidimensional index tree

Cited by: 22
Author
Hamelryck, T [1]
Affiliation
[1] Free Univ Brussels VIB, ULTR Dept, B-1050 Brussels, Belgium
Keywords
functional site; function from structure; mirror image; SR tree; structural bioinformatics; luciferase
DOI
10.1002/prot.10338
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification codes
071010; 081704
Abstract
Convergent evolution often produces similar functional sites in nonhomologous proteins. Identifying these sites makes it possible to infer function from structure, to pinpoint the location of a functional site, to identify enzymes with similar enzymatic mechanisms, or to discover putative functional sites. In this article, a novel method is presented that (a) queries a database of protein structures for occurrences of a given side-chain pattern and (b) identifies interesting side-chain patterns in a given structure. For efficiency, and to allow a robust statistical evaluation of the significance of a similarity, patterns of three residues (triads) are considered. Each triad is encoded as a high-dimensional vector and stored in an SR (Sphere/Rectangle) tree, an efficient multidimensional index tree. Identifying similar triads can then be reformulated as identifying neighboring vectors. The method handles many features that otherwise complicate the identification of meaningful patterns: shifted backbone positions, conservative substitutions, various atom label ambiguities, and mirror-imaged geometries. The combined treatment of these features leads to the identification of previously unidentified patterns. In particular, the identification of mirror-imaged side-chain patterns is unique to the method described here. Interesting triads in a given structure can be identified by extracting all triads and comparing them with a database of triads involved in ligand binding. The approach was tested by an all-against-all comparison of unique representatives of all SCOP superfamilies. New findings include mirror-imaged metal binding and active sites, and a putative active site in bacterial luciferase. Proteins 2003;51:96-108. © 2003 Wiley-Liss, Inc.
Pages: 96-108
Number of pages: 13
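The abstract's central idea, encoding each triad as a vector so that similar triads become neighboring vectors, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the abstract does not specify the actual encoding, so `encode_triad` uses only the three sorted pairwise distances between one representative point per residue; a brute-force scan stands in for the SR tree; and the coordinates, site names, and radius are invented.

```python
import math
from itertools import combinations


def encode_triad(points):
    """Encode three 3D points as a sorted vector of their pairwise distances.

    Hypothetical encoding for illustration; the paper's own vectors are
    higher-dimensional and are not described in the abstract.
    """
    if len(points) != 3:
        raise ValueError("a triad needs exactly three points")
    return tuple(sorted(math.dist(a, b) for a, b in combinations(points, 2)))


def find_similar(query, database, radius):
    """Return the names of stored triads whose vectors lie within `radius`
    (Euclidean distance) of the query vector.

    A linear scan is used here; an index tree would answer the same
    range query without visiting every stored vector.
    """
    q = encode_triad(query)
    hits = [
        name
        for name, points in database.items()
        if math.dist(q, encode_triad(points)) <= radius
    ]
    return sorted(hits)


# Invented coordinates (in angstroms), one representative point per residue.
database = {
    "site_a": [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 4.0, 0.0)],
    "site_b": [(0.0, 0.0, 0.0), (8.0, 0.0, 0.0), (0.0, 9.0, 0.0)],
    "site_c": [(0.0, 0.0, 0.0), (3.2, 0.0, 0.0), (0.0, 4.0, 0.0)],
}

# The query is site_a translated by (1, 1, 1); its distance vector is unchanged.
query = [(1.0, 1.0, 1.0), (4.0, 1.0, 1.0), (1.0, 5.0, 1.0)]

print(find_similar(query, database, radius=0.5))
```

Because the vector in this sketch is built from distances only, it does not depend on where the triad sits in the structure, which is what lets a single range query compare triads drawn from different proteins. Narrowing `radius` tightens the match: at 0.1 only `site_a` would remain.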