Predicting transmembrane beta-barrels in proteomes

Cited by: 120
Authors
Bigelow, HR
Petrey, DS
Liu, J
Przybylski, D
Rost, B
Affiliations
[1] Columbia Univ, Dept Biochem & Mol Biophys, CUBIC, New York, NY 10032 USA
[2] Columbia Univ, Howard Hughes Med Inst, New York, NY 10032 USA
[3] Columbia Univ, Dept Biochem & Mol Biophys, NESG, New York, NY 10032 USA
[4] Columbia Univ, Dept Pharmacol, New York, NY 10032 USA
[5] Columbia Univ, Dept Phys, New York, NY 10027 USA
[6] Columbia Univ, Ctr Computat Biol & Bioinformat, New York, NY 10032 USA
DOI
10.1093/nar/gkh580
Chinese Library Classification (CLC)
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification codes
071010; 081704
Abstract
Very few methods address the problem of predicting beta-barrel membrane proteins directly from sequence. One reason is that only very few high-resolution structures of transmembrane beta-barrel (TMB) proteins have been determined thus far. Here we introduce the design, statistics and results of a novel profile-based hidden Markov model for the prediction and discrimination of TMBs. The method carefully attempts to avoid over-fitting the sparse experimental data. While our model training and scoring procedures were very similar to those of a recently published work, the architecture and structure-based labelling were significantly different. In particular, we introduced a new definition of beta-hairpin motifs, explicit state modelling of transmembrane strands, and a log-odds whole-protein discrimination score. The resulting method reached an overall four-state (up-strand, down-strand, periplasmic loop, outer loop) accuracy as high as 86%. Furthermore, the method accurately discriminated TMB from non-TMB proteins (45% coverage at 100% accuracy). This high precision enabled the application to 72 entirely sequenced Gram-negative bacteria. We found over 164 previously uncharacterized TMB proteins at high confidence. Database searches did not associate any of these proteins with membranes. We contend that the vast majority of our 164 predictions will eventually be verified experimentally. All proteome predictions and the PROFtmb prediction method are available at http://www.rostlab.org/services/PROFtmb/.
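As a reading aid for the two headline figures in the abstract, the sketch below shows one common way such numbers are computed: per-residue accuracy over four state labels, and coverage at 100% accuracy for a whole-protein score. It is not PROFtmb code. The function names, the single-letter labels (U, D, I, O) and the toy scores are assumptions made for illustration only.

```python
def per_residue_accuracy(predicted, observed):
    """Fraction of residues whose predicted state matches the observed state.

    Both arguments are equal-length strings of state labels; here the
    hypothetical labels U (up-strand), D (down-strand), I (periplasmic
    loop) and O (outer loop) stand in for the four states.
    """
    if len(predicted) != len(observed) or not observed:
        raise ValueError("label strings must be non-empty and of equal length")
    matches = sum(1 for p, o in zip(predicted, observed) if p == o)
    return matches / len(observed)


def coverage_at_full_accuracy(scored):
    """Fraction of true TMBs that score above every non-TMB protein.

    `scored` is a list of (score, is_tmb) pairs. A threshold placed just
    above the best-scoring non-TMB yields no false positives, so the
    predictions above it are 100% accurate by construction.
    """
    positives = sum(1 for _, is_tmb in scored if is_tmb)
    if positives == 0:
        return 0.0
    max_negative = max(
        (score for score, is_tmb in scored if not is_tmb),
        default=float("-inf"),
    )
    found = sum(1 for score, is_tmb in scored if is_tmb and score > max_negative)
    return found / positives


# Toy data, invented for illustration (not results from the paper).
toy_scores = [(9.1, True), (8.0, True), (7.5, False),
              (6.0, True), (2.0, False), (1.0, True)]
print(coverage_at_full_accuracy(toy_scores))
print(per_residue_accuracy("UUDIOO", "UUDDOO"))
```

Under this reading, "45% coverage at 100% accuracy" would mean that 45% of the known TMBs scored above the highest-scoring non-TMB in the evaluation set.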
Pages: 2566-2577
Page count: 12