Prediction of protein domain boundaries from sequence alone

Cited by: 43
Authors
Galzitskaya, OV [1]
Melnik, BS [1]
Affiliations
[1] Russian Acad Sci, Inst Prot Res, Pushchino 142290, Moscow Region, Russia
Keywords
protein domain; latent entropy profile; degrees of freedom; domain database; superfamily
DOI
10.1110/ps.0233103
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
We present a simple approach to identifying domain boundaries in proteins of unknown three-dimensional structure. The method is based on the hypothesis that high side-chain entropy in a region of a protein chain must be compensated by high residue interaction energy within that region, which could correlate with a well-structured part of the globule, that is, with a domain unit. For protein domains, this means that the domain boundary is determined by amino acid residues with low side-chain entropy, which correlates with side-chain size. On the one hand, a relatively high Ala and Gly content at the domain boundary results in high conformational entropy of the backbone between the domains. On the other hand, Pro residues form hinges that set the relative orientation of the domains. The method was applied to 646 proteins with two contiguous domains extracted from the SCOP database, with a success rate of 63%. We also report domain boundary predictions for CASP5 targets obtained with the same method.
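The idea in the abstract, that a domain boundary sits where side-chain entropy is low, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the entropy scale, the smoothing window, or the selection rule, so everything below is an assumption. `ROTATABLE_ANGLES` is a rough count of rotatable side-chain dihedral angles used as a stand-in for side-chain entropy (Pro is set to 0 because its ring is closed), and `window` and `margin` are hypothetical parameters.

```python
from typing import Dict, List

# Assumption: a rough count of rotatable side-chain dihedral angles per residue,
# used as a crude proxy for side-chain entropy. This is NOT the scale used in
# the paper, which the abstract does not specify.
ROTATABLE_ANGLES: Dict[str, int] = {
    "G": 0, "A": 0, "P": 0,
    "S": 1, "C": 1, "T": 1, "V": 1,
    "I": 2, "L": 2, "D": 2, "N": 2, "H": 2, "F": 2, "Y": 2, "W": 2,
    "M": 3, "E": 3, "Q": 3,
    "K": 4, "R": 4,
}


def smoothed_profile(sequence: str, window: int = 21) -> List[float]:
    """Average the per-residue proxy over a sliding window centred on each position."""
    values = [ROTATABLE_ANGLES[aa] for aa in sequence]
    half = window // 2
    profile = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        profile.append(sum(values[lo:hi]) / (hi - lo))
    return profile


def predict_boundary(sequence: str, window: int = 21, margin: int = 10) -> int:
    """Return the 0-based index of the profile minimum, ignoring both chain ends.

    `margin` (hypothetical) excludes terminal residues, where a minimum would
    not separate two domains of meaningful size.
    """
    if len(sequence) <= 2 * margin:
        raise ValueError("sequence too short for the chosen margin")
    profile = smoothed_profile(sequence, window)
    candidates = range(margin, len(sequence) - margin)
    return min(candidates, key=lambda i: profile[i])
```

For a toy sequence with a short Gly/Ala stretch between two Lys- and Arg-rich segments, `predict_boundary` returns a position inside that stretch. On real proteins, the choice of scale and window would need to be calibrated against known domain assignments such as those in SCOP.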
Pages: 696-701
Number of pages: 6