HMMSTR: a hidden Markov model for local sequence-structure correlations in proteins

Cited by: 215
Authors
Bystroff, C [1]
Thorsson, V
Baker, D
Affiliations
[1] Rensselaer Polytech Inst, Dept Biol, Troy, NY 12180 USA
[2] Univ Washington, Dept Biochem, Seattle, WA 98195 USA
[3] Univ Washington, Dept Mol Biotechnol, Seattle, WA 98195 USA
Funding
National Science Foundation (USA)
Keywords
hidden Markov models; I-sites Library; sequence patterns; motifs; clustering
DOI
10.1006/jmbi.2000.3837
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
We describe a hidden Markov model, HMMSTR, for general protein sequence based on the I-sites library of sequence-structure motifs. Unlike the linear hidden Markov models used to model individual protein families, HMMSTR has a highly branched topology and captures recurrent local features of protein sequences and structures that transcend protein family boundaries. The model extends the I-sites library by describing the adjacencies of different sequence-structure motifs as observed in the protein database and, by representing overlapping motifs in a much more compact form, achieves a great reduction in parameters. The HMM attributes a considerably higher probability to coding sequence than does an equivalent dipeptide model, predicts secondary structure with an accuracy of 74.3%, predicts backbone torsion angles better than any previously reported method, and predicts the structural context of beta strands and turns with an accuracy that should be useful for tertiary structure prediction. © 2000 Academic Press.
Pages: 173-190
Page count: 18
Related papers
37 entries in total
  • [1] Gapped BLAST and PSI-BLAST: a new generation of protein database search programs
    Altschul, SF
    Madden, TL
    Schaffer, AA
    Zhang, JH
    Zhang, Z
    Miller, W
    Lipman, DJ
    [J]. NUCLEIC ACIDS RESEARCH, 1997, 25 (17) : 3389 - 3402
  • [2] ASAI K, 1993, COMPUT APPL BIOSCI, V9, P141
  • [3] Helix capping
    Aurora, R
    Rose, GD
    [J]. PROTEIN SCIENCE, 1998, 7 (01) : 21 - 38
  • [4] Bailey T L, 1994, Proc Int Conf Intell Syst Mol Biol, V2, P28
  • [5] Finding the genes in genomic DNA
    Burge, CB
    Karlin, S
    [J]. CURRENT OPINION IN STRUCTURAL BIOLOGY, 1998, 8 (03) : 346 - 354
  • [6] Prediction of local structure in proteins using a library of sequence-structure motifs
    Bystroff, C
    Baker, D
    [J]. JOURNAL OF MOLECULAR BIOLOGY, 1998, 281 (03) : 565 - 577
  • [7] Chou P Y, 1978, Adv Enzymol Relat Areas Mol Biol, V47, P45
  • [8] Protein design automation
    Dahiyat, BI
    Mayo, SL
    [J]. PROTEIN SCIENCE, 1996, 5 (05) : 895 - 903
  • [9] Di Francesco V, 1997, ISMB-97 - FIFTH INTERNATIONAL CONFERENCE ON INTELLIGENT SYSTEMS FOR MOLECULAR BIOLOGY, PROCEEDINGS, P100
  • [10] DOIG AJ, 1995, PROTEIN SCI, V4, P1325, DOI 10.1002/pro.5560040708