Conserved spatially interacting motifs of protein superfamilies: Application to fold recognition and function annotation of genome data

Cited by: 17
Authors
Bhaduri, A
Ravishankar, R
Sowdhamini, R
Affiliations
[1] Univ Agr Sci Bangalore, Natl Ctr Biol Sci, Tata Inst Fundamental Res, Bangalore 560065, Karnataka, India
[2] Anna Univ, Ctr Biotechnol, Madras 600025, Tamil Nadu, India
Keywords
distance relationship; BLAST; structure prediction; function prediction; sequence searches; genome analysis
DOI
10.1002/prot.10638
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
Limitations in techniques for the elucidation of protein function have led to an increasing gap between the annotated proteins and those encoded in a genome. The functional selection and three-dimensional structural constraints of proteins in nature often relate to the retention of significant sequence similarity between proteins of similar fold and function despite poor sequence identity. We identify spatially interacting conserved regions, or motifs, within protein superfamilies that are critical for structure and/or function. A search in sequence databases using these descriptors as additional constraints is an approach to identifying putative additional members of superfamilies. Such constrained searches have been tested against proteins of known structure to demonstrate high specificity (93%) with a low error rate of 0.0004. This approach has been compared with other sensitive sequence search methods (e.g., PSI-BLAST, HMMsearch, and IMPALA). It has been extended to analyze the distribution of 11 superfamilies in 93 genomes, including the human genome. © 2004 Wiley-Liss, Inc.
Pages: 657-670
Page count: 14