De novo protein design. II. Plasticity in sequence space

Cited by: 61
Authors
Koehl, P [1]
Levitt, M [1]
Affiliations
[1] Stanford Univ, Dept Biol Struct, Stanford, CA 94305 USA
Keywords
protein design; random energy model; sequence space; Monte Carlo; fold recognition;
DOI
10.1006/jmbi.1999.3212
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Subject classification code
071010; 081704
Abstract
It is generally accepted that many different protein sequences have similar folded structures, and that there is a relatively high probability that a new sequence possesses a previously observed fold. An indirect consequence of this is that protein design should define the sequence space accessible to a given structure, rather than providing a single optimized sequence. We have recently developed a new approach for protein sequence design, which optimizes the complete sequence of a protein based on the knowledge of its backbone structure, its amino acid composition and a physical energy function including van der Waals interactions, electrostatics, and environment free energy. The specificity of the designed sequence for its template backbone is imposed by keeping the amino acid composition fixed. Here, we show that our procedure converges in sequence space, albeit not to the native sequence of the protein. We observe that while polar residues are well conserved in our designed sequences, non-polar amino acids at the surface of a protein are often replaced by polar residues. The designed sequences provide a multiple alignment of sequences that all adopt the same three-dimensional fold. This alignment is used to derive a profile matrix for chicken triose phosphate isomerase, TIM. The matrix is found to recognize significantly the native sequence for TIM, as well as closely related sequences. Possible application of this approach to protein fold recognition is discussed. © 1999 Academic Press.
Pages: 1183-1193
Page count: 11