A study of combined structure/sequence profiles

Cited by: 36
Authors
Elofsson, A [1]
Fischer, D [1]
Rice, DW [1]
LeGrand, SM [1]
Eisenberg, D [1]
Affiliations
[1] UNIV CALIF LOS ANGELES, US DOE, LAB STRUCT BIOL & MOL MED, INST MOL BIOL, LOS ANGELES, CA 90095 USA
Source
FOLDING & DESIGN | 1996 / Vol. 1 / Issue 06
Keywords
fold recognition; genetic algorithms; inverse protein folding; profile methods
DOI
10.1016/S1359-0278(96)00061-2
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
Background: For genome sequencing projects to achieve their full impact on biology and medicine, each protein sequence must be identified with its three-dimensional structure. Fold assignment methods (also called profile and threading methods) attempt to assign sequences to known protein folds by computing the compatibility of sequence to fold.

Results: We have extended profile methods for the detection of protein folds having structural similarity but low sequence similarity to sequence probes. Our extension combines sequence substitution tables with structural properties to form a combined profile. The structural properties used in this study include distances between residues, exposed areas, areas buried by polar atoms, and properties of the original three-dimensional profile method. We compared the performance of these combined profiles with different sequence matrices and with the original three-dimensional profile method. To determine the optimal gap penalties and weights used with these profiles, we employed a genetic algorithm. The performance of these combined profiles was tested by cross validation using independent test and training sets.

Conclusions: These studies show that the combined profiles perform better than profiles based on either structural or sequence information alone.

© Current Biology Ltd.
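
The idea in the abstract, a profile that mixes a sequence substitution score with structural terms and whose weights and gap penalty are tuned by a genetic algorithm, can be illustrated with a small Python sketch. This is not the authors' implementation: the substitution function, the per-position structural scores, the linear gap model, the toy fitness, and all names (`toy_substitution`, `combined_profile`, `align_score`, `tune`) are assumptions made for illustration only.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def toy_substitution(a, b):
    # Hypothetical stand-in for a real substitution table.
    return 2.0 if a == b else -1.0


def combined_profile(template, env_scores, w_seq, w_struct):
    """Build one score row per template position.

    template   -- residues of the known structure
    env_scores -- per-position dicts {amino acid: structural compatibility}
                  (hypothetical values; missing entries count as 0)
    """
    return [
        {aa: w_seq * toy_substitution(res, aa) + w_struct * env.get(aa, 0.0)
         for aa in AMINO_ACIDS}
        for res, env in zip(template, env_scores)
    ]


def align_score(profile, probe, gap):
    """Global alignment score of a probe against a profile (linear gap penalty)."""
    cols = len(probe)
    prev = [-gap * j for j in range(cols + 1)]
    for i in range(1, len(profile) + 1):
        cur = [-gap * i] + [0.0] * cols
        for j in range(1, cols + 1):
            cur[j] = max(
                prev[j - 1] + profile[i - 1][probe[j - 1]],  # align position i with residue j
                prev[j] - gap,                               # profile position left unmatched
                cur[j - 1] - gap,                            # probe residue left unmatched
            )
        prev = cur
    return prev[cols]


def tune(fitness, rng, pop_size=20, generations=30):
    """Tiny genetic algorithm over (w_seq, w_struct, gap); the top half survives."""
    pop = [tuple(rng.uniform(0.0, 5.0) for _ in range(3)) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            child = tuple(rng.choice(pair) for pair in zip(a, b))            # crossover
            child = tuple(max(0.0, g + rng.gauss(0.0, 0.2)) for g in child)  # mutation
            children.append(child)
        pop = parents + children
    return max(pop, key=fitness)


# Toy data: a four-residue template with made-up structural preferences.
template = "ACDK"
env = [{"A": 1.0, "G": 0.5}, {"C": 1.0}, {"D": 1.0, "E": 0.8}, {"K": 1.0, "R": 0.8}]


def fitness(params):
    w_seq, w_struct, gap = params
    profile = combined_profile(template, env, w_seq, w_struct)
    # Toy objective: margin between a related probe and an unrelated one.
    return align_score(profile, "GCER", gap) - align_score(profile, "WWWW", gap)


best = tune(fitness, random.Random(0))
print("w_seq=%.2f w_struct=%.2f gap=%.2f" % best)
```

The toy fitness only rewards a score margin on two hand-picked probes. A realistic objective would instead measure fold-recognition performance on a training set and be checked on a separate test set, in the spirit of the cross validation the abstract mentions.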
Pages: 451-461
Page count: 11