Three-stage prediction of protein β-sheets by neural networks, alignments and graph algorithms

Cited by: 88
Authors
Cheng, JL [1]
Baldi, P [1]
Affiliations
[1] Univ Calif Irvine, Inst Genom & Bioinformat, Sch Informat & Comp Sci, Irvine, CA 92697 USA
DOI
10.1093/bioinformatics/bti1004
Chinese Library Classification
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
Motivation: Protein beta-sheets play a fundamental role in protein structure, function, evolution and bioengineering. Accurate prediction and assembly of protein beta-sheets, however, remains challenging because protein beta-sheets require formation of hydrogen bonds between linearly distant residues. Previous approaches for predicting beta-sheet topological features, such as beta-strand alignments, in general have not exploited the global covariation and constraints characteristic of beta-sheet architectures.

Results: We propose a modular approach to the problem of predicting/assembling protein beta-sheets in a chain by integrating both local and global constraints in three steps. The first step uses recursive neural networks to predict pairing probabilities for all pairs of interstrand beta-residues from profile, secondary structure and solvent accessibility information. The second step applies dynamic programming techniques to these probabilities to derive binding pseudoenergies and optimal alignments between all pairs of beta-strands. Finally, the third step uses graph matching algorithms to predict the beta-sheet architecture of the protein by optimizing the global pseudoenergy while enforcing strong global beta-strand pairing constraints. The approach is evaluated using cross-validation methods on a large non-homologous dataset and yields significant improvements over previous methods.
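To make the second and third steps concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes the residue pairing probabilities from the first step are already available as a matrix `prob`. The Needleman-Wunsch-style scoring, the `GAP` penalty, the greedy selection rule, the limit of two partners per strand, and all function names are assumptions introduced here for illustration; the abstract does not specify the exact dynamic programming recurrence or graph matching algorithm.

```python
from itertools import combinations

GAP = -0.5  # hypothetical gap penalty, not taken from the paper


def align_score(prob, strand_a, strand_b):
    """Needleman-Wunsch-style score for aligning the residues of two strands.

    prob[i][j] is the predicted pairing probability of residues i and j;
    strand_a and strand_b are lists of residue indices in alignment order.
    """
    n, m = len(strand_a), len(strand_b)
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = i * GAP
    for j in range(1, m + 1):
        dp[0][j] = j * GAP
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            pair = prob[strand_a[i - 1]][strand_b[j - 1]]
            dp[i][j] = max(dp[i - 1][j - 1] + pair,  # pair the two residues
                           dp[i - 1][j] + GAP,       # leave a residue of a unpaired
                           dp[i][j - 1] + GAP)       # leave a residue of b unpaired
    return dp[n][m]


def best_pairing(prob, strand_a, strand_b):
    """Return (score, orientation) for the better of the two orientations."""
    parallel = align_score(prob, strand_a, strand_b)
    antiparallel = align_score(prob, strand_a, strand_b[::-1])
    if antiparallel >= parallel:
        return antiparallel, "antiparallel"
    return parallel, "parallel"


def assemble_sheets(prob, strands, max_partners=2, min_score=0.0):
    """Greedily accept the highest-scoring strand pairs under a partner limit."""
    candidates = []
    for a, b in combinations(range(len(strands)), 2):
        score, orientation = best_pairing(prob, strands[a], strands[b])
        candidates.append((score, a, b, orientation))
    candidates.sort(key=lambda c: c[0], reverse=True)

    partners = [0] * len(strands)
    accepted = []
    for score, a, b, orientation in candidates:
        if score <= min_score:
            break  # remaining candidates score no better
        if partners[a] < max_partners and partners[b] < max_partners:
            accepted.append((a, b, orientation))
            partners[a] += 1
            partners[b] += 1
    return accepted
```

Calling `assemble_sheets(prob, strands)` with `strands` given as lists of residue indices returns the accepted strand pairs together with their orientation. The partner limit stands in for the global pairing constraints mentioned in the abstract; a full treatment would optimize the global pseudoenergy rather than select pairs greedily.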
Pages: I75-I84
Number of pages: 10