CCHMM_PROF: a HMM-based coiled-coil predictor with evolutionary information

Cited by: 31
Authors
Bartoli, Lisa [1]
Fariselli, Piero [1]
Krogh, Anders [2]
Casadio, Rita [1]
Affiliations
[1] Univ Bologna, Dept Biol, Biocomp Grp, I-40126 Bologna, Italy
[2] Univ Copenhagen, Dept Biol, Bioinformat Ctr, DK-2200 Copenhagen, Denmark
Keywords
PROTEIN SECONDARY STRUCTURE; DATABASE; SEQUENCE; PROGRAM; DOMAINS; MOTIFS; VIRUS; CORE;
DOI
10.1093/bioinformatics/btp539
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Discipline classification codes
071010; 081704;
Abstract
Motivation: The widespread coiled-coil structural motif in proteins is known to mediate a variety of biological interactions. Recognizing a coiled-coil containing sequence and locating its coiled-coil domains are key steps towards the determination of the protein structure and function. Different tools are available for predicting coiled-coil domains in protein sequences, including those based on position-specific score matrices and machine learning methods.
Results: In this article, we introduce a hidden Markov model (CCHMM_PROF) that exploits the information contained in multiple sequence alignments (profiles) to predict coiled-coil regions. The new method discriminates coiled-coil sequences with an accuracy of 97% and achieves a true positive rate of 79% with only 1% of false positives. Furthermore, when predicting the location of coiled-coil segments in protein sequences, the method reaches an accuracy of 80% at the residue level and a best per-segment and per-protein efficiency of 81% and 80%, respectively. The results indicate that CCHMM_PROF outperforms all the existing tools and can be adopted for large-scale genome annotation.
Pages: 2757-2763
Page count: 7