A 3D-1D substitution matrix for protein fold recognition that includes predicted secondary structure of the sequence

Cited by: 132
Authors
Rice, DW [1]
Eisenberg, D [1]
Affiliations
[1] UNIV CALIF LOS ANGELES,INST MOL BIOL,UCLA DOE LAB STRUCT BIOL & MOL BIOL,LOS ANGELES,CA 90095
Keywords
protein fold recognition; structural alignment; substitution value; gap penalty; secondary structure prediction; AMINO-ACID-SEQUENCE; STRUCTURE ALIGNMENT; DATA-BANK; IDENTIFICATION; CLASSIFICATION; SIMILARITIES; CONFORMATION; POTENTIALS; TEMPLATES; DOMAINS;
DOI
10.1006/jmbi.1997.0924
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010 ; 081704 ;
Abstract
In protein fold recognition, a probe amino acid sequence is compared to a library of representative folds of known structure to identify a structural homolog. In cases where the probe and its homolog have clear sequence similarity, traditional residue substitution matrices have been used to predict the structural similarity. For cases where the probe is distant in sequence from its homolog, we have developed a (7 x 3 x 2 x 7 x 3) 3D-1D substitution matrix (called H3P2), calculated from a database of 119 structural pairs. Members of each pair share a similar fold, but have sequence identity less than 30%. Each probe sequence position is defined by one of seven residue classes and three secondary structure classes. Each homologous fold position is defined by one of seven residue classes, three secondary structure classes, and two burial classes. Thus the matrix is five-dimensional and contains 7 x 3 x 2 x 7 x 3 = 882 elements, or 3D-1D scores. The first step in assigning a probe sequence to its homologous fold is the prediction of the three-state (helix, strand, coil) secondary structure of the probe; here we use the profile-based neural network prediction of secondary structure (PHD) program. Then a dynamic programming algorithm uses the H3P2 matrix to align the probe sequence with structures in a representative fold library. To test the effectiveness of the H3P2 matrix, a challenging, fold-class-diverse, and cross-validated benchmark assessment is used to compare the H3P2 matrix to the GONNET, PAM250, and BLOSUM62 matrices and to a secondary-structure-only substitution matrix. For distantly related sequences, the H3P2 matrix detects more homologous structures at higher reliabilities than do these other substitution matrices, based on sensitivity versus specificity plots (or SENS-SPEC plots). The added efficacy of the H3P2 matrix arises from its information on the statistical preferences for various sequence-structure environment combinations from very distantly related proteins. It introduces the predicted secondary structure information from a sequence into fold recognition in a statistical way that normalizes the inherent correlations between residue type, secondary structure and solvent accessibility. © 1997 Academic Press Limited.
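The matrix layout and the alignment step described in the abstract can be illustrated with a minimal Python sketch. Everything beyond the class counts (7, 3, 2, 7, 3) is an assumption made for illustration: the axis order (fold classes first, then probe classes), the random placeholder scores (not the published H3P2 values), the linear gap penalty, the global alignment mode, and the function names `make_placeholder_matrix` and `align_score`.

```python
import random

# Class counts stated in the abstract.
N_RES, N_SS, N_BURIAL = 7, 3, 2


def make_placeholder_matrix(seed=0):
    """Build a five-dimensional score table with random placeholder values.

    Key order (an assumption): fold residue class, fold secondary structure
    class, fold burial class, probe residue class, probe predicted secondary
    structure class. These are NOT the published H3P2 scores.
    """
    rng = random.Random(seed)
    return {
        (fr, fs, fb, pr, ps): rng.uniform(-1.0, 1.0)
        for fr in range(N_RES)
        for fs in range(N_SS)
        for fb in range(N_BURIAL)
        for pr in range(N_RES)
        for ps in range(N_SS)
    }


def align_score(fold, probe, matrix, gap=-1.0):
    """Global dynamic programming alignment score (Needleman-Wunsch style).

    fold:  list of (residue_class, ss_class, burial_class) tuples
    probe: list of (residue_class, predicted_ss_class) tuples
    gap:   linear gap penalty (hypothetical value, not from the paper)
    """
    n, m = len(fold), len(probe)
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = i * gap
    for j in range(1, m + 1):
        dp[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Concatenating the two tuples gives the five-part matrix key.
            match = dp[i - 1][j - 1] + matrix[fold[i - 1] + probe[j - 1]]
            dp[i][j] = max(match, dp[i - 1][j] + gap, dp[i][j - 1] + gap)
    return dp[n][m]


matrix = make_placeholder_matrix()
print(len(matrix))  # 7 * 3 * 2 * 7 * 3 entries

fold = [(0, 0, 0), (3, 1, 1), (6, 2, 0)]   # toy fold positions
probe = [(0, 0), (3, 1), (5, 2)]           # toy probe positions
print(align_score(fold, probe, matrix))
```

In a fold recognition run of the kind the abstract outlines, the probe would be scored against every structure in the fold library and the library entries ranked by score; how the authors set gap penalties and normalize scores is not stated here, so those parts are left as placeholders.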
Pages: 1026-1038
Page count: 13