Protein topology recognition from secondary structure sequences: Application of the hidden Markov models to the alpha class proteins

Cited by: 25
Authors
DiFrancesco, V [1]
Garnier, J [1]
Munson, PJ [1]
Affiliations
[1] NIH, FOGARTY INT CTR, BETHESDA, MD 20892 USA
Keywords
protein topology recognition; protein secondary structure; hidden Markov models; obese gene product; interleukin-6
DOI
10.1006/jmbi.1996.0874
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
The three-dimensional fold of a protein is described by the organization of its secondary structure elements in 3D space, i.e. its "topology". We find that the protein topology can be recognized from the 1D sequence of secondary structure states of the residues alone. Automated recognition is facilitated by use of hidden Markov models (HMMs) to represent topology families of proteins. Such models can be trained on the experimentally observed secondary structure sequences of family members using well-established algorithms. Here, we model various topology groups in the alpha class of proteins and identify, from a large database, those proteins having the topology described by each model. The correct topology family for protein secondary structure sequences could be recognized 12 out of 14 times. When the observed secondary structure sequences are replaced with predicted sequences, recognition is still achievable 8 out of 14 times. The success rate for observed sequences indicates that our approach will become increasingly useful as the accuracy of secondary structure prediction algorithms is improved. Our study indicates that the HMMs are useful for protein topology recognition even when no detectable primary amino acid sequence similarity is present. To illustrate the potential utility of our method, protein topology recognition is attempted on leptin, the obese gene product, and the human interleukin-6 sequence, for which fold predictions have been previously published. © 1997 Academic Press Limited.
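The recognition step described in the abstract can be illustrated with a minimal sketch: score a secondary structure string against one HMM per topology family using the forward algorithm, then pick the family with the highest log-likelihood. Everything below is an assumption made for illustration and is not taken from the paper: the three-letter alphabet (H, E, C), the two-state toy models, their probabilities, and the names `forward_log_likelihood`, `recognize`, and `TOY_MODELS`. The abstract does not specify the actual model architectures or parameters.

```python
import math

def log_sum_exp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    if m == float("-inf"):
        return m
    return m + math.log(sum(math.exp(v - m) for v in values))

def safe_log(p):
    """log(p), with log(0) mapped to -inf instead of raising."""
    return math.log(p) if p > 0 else float("-inf")

def forward_log_likelihood(seq, start, trans, emit):
    """Return log P(seq | model) computed with the forward algorithm."""
    if not seq:
        raise ValueError("sequence must be non-empty")
    n = len(start)
    # Initialisation: start probability times emission of the first symbol.
    alpha = [safe_log(start[s]) + safe_log(emit[s][seq[0]]) for s in range(n)]
    # Recursion: sum over predecessor states, then emit the next symbol.
    for symbol in seq[1:]:
        alpha = [
            log_sum_exp([alpha[p] + safe_log(trans[p][s]) for p in range(n)])
            + safe_log(emit[s][symbol])
            for s in range(n)
        ]
    # Termination: sum over all final states.
    return log_sum_exp(alpha)

# Hypothetical toy models (state 0 = structured segment, state 1 = loop).
# The probabilities are invented for illustration, not trained on data.
SHARED_START = [0.5, 0.5]
SHARED_TRANS = [[0.9, 0.1], [0.3, 0.7]]
TOY_MODELS = {
    "helix_rich": (
        SHARED_START,
        SHARED_TRANS,
        [{"H": 0.90, "E": 0.02, "C": 0.08}, {"H": 0.10, "E": 0.05, "C": 0.85}],
    ),
    "strand_rich": (
        SHARED_START,
        SHARED_TRANS,
        [{"H": 0.02, "E": 0.90, "C": 0.08}, {"H": 0.05, "E": 0.10, "C": 0.85}],
    ),
}

def recognize(seq, models=TOY_MODELS):
    """Return the name of the model that gives seq the highest log-likelihood."""
    return max(models, key=lambda name: forward_log_likelihood(seq, *models[name]))

if __name__ == "__main__":
    observed = "CCHHHHHHHHHCCCHHHHHHHHCC"  # made-up secondary structure string
    print(recognize(observed))
```

In practice the per-family models would be trained on observed secondary structure sequences of family members, as the abstract states, and raw log-likelihoods would likely need some normalization for sequence length before comparing across a database; the sketch omits both steps.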
Pages: 446-463
Page count: 18