Three-stage prediction of protein β-sheets by neural networks, alignments and graph algorithms

Cited by: 88
Authors
Cheng, JL [1]
Baldi, P [1]
Affiliation
[1] Univ Calif Irvine, Inst Genom & Bioinformat, Sch Informat & Comp Sci, Irvine, CA 92697 USA
DOI
10.1093/bioinformatics/bti1004
Chinese Library Classification
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: Protein beta-sheets play a fundamental role in protein structure, function, evolution and bioengineering. Accurate prediction and assembly of protein beta-sheets, however, remain challenging because beta-sheets require the formation of hydrogen bonds between linearly distant residues. Previous approaches for predicting beta-sheet topological features, such as beta-strand alignments, have in general not exploited the global covariation and constraints characteristic of beta-sheet architectures.
Results: We propose a modular approach to the problem of predicting and assembling the beta-sheets of a protein chain by integrating both local and global constraints in three steps. The first step uses recursive neural networks to predict pairing probabilities for all pairs of interstrand beta-residues from profile, secondary structure and solvent accessibility information. The second step applies dynamic programming techniques to these probabilities to derive binding pseudoenergies and optimal alignments between all pairs of beta-strands. The third step uses graph matching algorithms to predict the beta-sheet architecture of the protein by optimizing the global pseudoenergy while enforcing strong global beta-strand pairing constraints. The approach is evaluated using cross-validation methods on a large non-homologous dataset and yields significant improvements over previous methods.
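The second and third steps described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: it assumes the residue pairing probabilities from the first step are already available as a matrix `prob`, it replaces the dynamic programming alignment with an exhaustive search over gapless shifts, and it replaces the graph matching step with a simple greedy selection. The function names `best_alignment` and `assemble_sheets` and the `max_partners` limit are hypothetical choices made for illustration.

```python
from itertools import combinations


def best_alignment(prob, strand_a, strand_b):
    """Best gapless alignment of two strands (illustrative stand-in for step 2).

    prob[i][j] is an assumed pairing probability for residues i and j.
    Each strand is a list of residue indices. Every relative shift is tried
    in both orientations; the score is the sum of pairing probabilities.
    Returns (score, direction, pairs); direction is None if nothing scores > 0.
    """
    best = (0.0, None, [])
    for direction in ("parallel", "antiparallel"):
        b = strand_b if direction == "parallel" else strand_b[::-1]
        for shift in range(-(len(b) - 1), len(strand_a)):
            pairs = [(strand_a[k], b[k - shift])
                     for k in range(len(strand_a))
                     if 0 <= k - shift < len(b)]
            score = sum(prob[i][j] for i, j in pairs)
            if score > best[0]:
                best = (score, direction, pairs)
    return best


def assemble_sheets(prob, strands, max_partners=2):
    """Greedy strand pairing (illustrative stand-in for step 3).

    Strand pairs are accepted in order of decreasing alignment score,
    subject to the assumed constraint that each strand has at most
    max_partners partner strands.
    """
    scored = []
    for a, b in combinations(range(len(strands)), 2):
        score, direction, _ = best_alignment(prob, strands[a], strands[b])
        if direction is not None:
            scored.append((score, a, b, direction))
    scored.sort(reverse=True)

    partners = [0] * len(strands)
    chosen = []
    for score, a, b, direction in scored:
        if partners[a] < max_partners and partners[b] < max_partners:
            chosen.append((a, b, direction, score))
            partners[a] += 1
            partners[b] += 1
    return chosen
```

Calling `assemble_sheets(prob, strands)` with a square probability matrix and a list of strands (each a list of residue indices) returns the accepted strand pairs with their orientation and score. The greedy selection only approximates a global optimum; it is used here because it keeps the pairing constraint easy to see.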
Pages: I75-I84
Page count: 10