Persistently conserved positions in structurally similar, sequence dissimilar proteins: Roles in preserving protein fold and function

Cited by: 48
Authors
Friedberg, I [1]
Margalit, H [1]
Affiliations
[1] Hebrew Univ Jerusalem, Hadassah Med Sch, Dept Mol Genet & Biotechnol, IL-91120 Jerusalem, Israel
Keywords
molecular evolution; sequence conservation; protein structure; protein folding; bioinformatics
DOI
10.1110/ps.18602
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular biology]
Discipline classification code
071010; 081704
Abstract
Many protein pairs that share the same fold have no detectable sequence similarity, which makes them a valuable source of information for studying sequence-structure relationships. In this study, we use a stringent data set of structurally similar, sequence-dissimilar protein pairs to characterize residues that may play a role in determining protein structure and/or function. For each protein in the database, we identify amino acid positions that show residue conservation within both close and distant family members. These positions are termed "persistently conserved". We then determine the "mutually" persistently conserved (MPC) positions: those structurally aligned positions in a protein pair that are persistently conserved in both pair mates. Because of their intra- and interfamily conservation, these positions are good candidates for determining protein fold and function. We find that 45% of the persistently conserved positions are mutually conserved. A significant fraction of them are located in positions critical for secondary structure determination, most are buried, and many form spatial clusters within their protein structures. A substitution matrix based on the subset of MPC positions shows two distinct characteristics: (i) it differs from other available matrices, even those derived from structural alignments; and (ii) its relative entropy is high, emphasizing the special residue restrictions imposed on these positions. Such a substitution matrix should be valuable for protein design experiments.
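The two quantities named in the abstract, MPC positions and the relative entropy of a substitution matrix, can be illustrated with a minimal Python sketch. It is not the authors' procedure: the function names `mutually_conserved` and `relative_entropy` are hypothetical, the persistently conserved positions are assumed to be given as plain sets of indices, and the relative entropy uses the standard definition for pair frequencies against a background derived from the same counts, which may differ from the background used in the paper.

```python
import math
from collections import Counter


def mutually_conserved(aligned_pairs, conserved_a, conserved_b):
    """Keep structurally aligned position pairs (i, j) in which position i is
    persistently conserved in protein A and position j in protein B.
    (Hypothetical helper; inputs are assumed to be precomputed.)"""
    return [(i, j) for i, j in aligned_pairs
            if i in conserved_a and j in conserved_b]


def relative_entropy(pair_counts):
    """Relative entropy in bits of observed residue-pair frequencies q_ab
    against the product of background frequencies p_a * p_b.

    pair_counts maps a residue pair (a, b) to how often it was observed at
    aligned positions. Counts are symmetrized so that (a, b) and (b, a)
    are treated alike; the background is the marginal of those counts
    (an assumption made for this sketch)."""
    sym = Counter()
    for (a, b), n in pair_counts.items():
        sym[(a, b)] += n
        sym[(b, a)] += n
    total = sum(sym.values())

    # Background frequency of each residue: marginal of the pair frequencies.
    background = Counter()
    for (a, _), n in sym.items():
        background[a] += n / total

    entropy = 0.0
    for (a, b), n in sym.items():
        q = n / total
        entropy += q * math.log2(q / (background[a] * background[b]))
    return entropy


# Toy example: three aligned position pairs, two of them conserved in both proteins.
mpc = mutually_conserved([(1, 4), (2, 5), (3, 6)], {1, 3}, {4, 5, 6})

# Toy counts: strictly conserved pairs give high entropy, random pairing gives zero.
strict = relative_entropy({("A", "A"): 5, ("L", "L"): 5})
random_like = relative_entropy({("A", "A"): 1, ("A", "L"): 2, ("L", "L"): 1})
```

In this toy setting, a set of positions that only ever pairs a residue with itself scores higher than one in which residues pair at random, which is the sense in which a high relative entropy reflects strong residue restrictions.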
Pages: 350-360
Number of pages: 11