Sequence-based protein structure prediction using a reduced state-space hidden Markov model

Cited by: 16
Authors
Lampros, Christos
Papaloukas, Costas
Exarchos, Themis P.
Goletsis, Yorgos
Fotiadis, Dimitrios I.
Affiliations
[1] Univ Ioannina, Dept Comp Sci, Unit Med Technol & Intelligent Informat Syst, GR-45110 Ioannina, Greece
[2] Univ Ioannina, Sch Med, Dept Phys Med, GR-45110 Ioannina, Greece
[3] Univ Ioannina, Dept Biol Applicat & Technol, GR-45110 Ioannina, Greece
[4] Univ Ioannina, Dept Econ, GR-45110 Ioannina, Greece
[5] FORTH, Inst Biomed Res, GR-45110 Ioannina, Greece
Keywords
structure prediction; fold recognition; hidden Markov models; protein classification; support vector machines; secondary structure; neural networks; alignment
DOI
10.1016/j.compbiomed.2006.10.014
Chinese Library Classification
Q [Biological Sciences]
Subject classification codes
07; 0710; 09
Abstract
This work describes a hidden Markov model (HMM) with a reduced number of states that simultaneously learns the amino acid sequence and secondary structure of proteins of known three-dimensional structure. The model is used for two tasks: protein class prediction and fold recognition. The Protein Data Bank and the annotation of the SCOP database are used to train and evaluate the proposed HMM for a number of protein classes and folds. Results demonstrate that the reduced state-space HMM classifies proteins as well as, and in some cases better than, an HMM trained with the amino acid sequence. The major advantage of the proposed approach is that it employs a small number of states and a low-complexity training algorithm, and is thus relatively fast. (C) 2006 Elsevier Ltd. All rights reserved.
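The classification step summarized in the abstract can be illustrated with a minimal sketch. It assumes one HMM per protein class, scores a sequence under each model with the scaled forward algorithm, and picks the class with the highest log-likelihood. This is a generic illustration, not the authors' model: the two-state topology, the two-letter alphabet, all probabilities, and the names `log_likelihood`, `classify`, and `models` are hypothetical.

```python
import math


def log_likelihood(seq, start, trans, emit):
    """Return log P(seq | HMM) using the scaled forward algorithm.

    start[s]    : probability of starting in state s
    trans[r][s] : probability of moving from state r to state s
    emit[s][x]  : probability that state s emits symbol x
    """
    n = len(start)
    # Forward variables for the first symbol.
    alpha = [start[s] * emit[s][seq[0]] for s in range(n)]
    log_p = 0.0
    for t in range(len(seq)):
        if t > 0:
            # Propagate one step and weight by the emission probability.
            alpha = [
                sum(alpha[r] * trans[r][s] for r in range(n)) * emit[s][seq[t]]
                for s in range(n)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # sequence is impossible under this model
        # Rescale to avoid underflow; the log of the scales sums to log P.
        log_p += math.log(scale)
        alpha = [a / scale for a in alpha]
    return log_p


def classify(seq, models):
    """Pick the class whose HMM assigns the sequence the highest likelihood."""
    return max(models, key=lambda name: log_likelihood(seq, *models[name]))


# Hypothetical toy models (assumption: values chosen only for illustration).
shared_start = [0.5, 0.5]
shared_trans = [[0.9, 0.1], [0.1, 0.9]]
models = {
    "class_a": (shared_start, shared_trans,
                [{"H": 0.9, "E": 0.1}, {"H": 0.6, "E": 0.4}]),
    "class_b": (shared_start, shared_trans,
                [{"H": 0.1, "E": 0.9}, {"H": 0.4, "E": 0.6}]),
}

if __name__ == "__main__":
    print(classify("HHHHHH", models))
```

In a setting like the one the abstract describes, the model parameters would be estimated from training proteins of each class or fold rather than written by hand, and the emitted symbols would carry both sequence and secondary-structure information.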
Pages: 1211-1224
Page count: 14