A combined transmembrane topology and signal peptide prediction method

Cited by: 1776
Authors
Käll, L
Krogh, A
Sonnhammer, ELL [1]
Affiliations
[1] Karolinska Inst, Ctr Genomics & Bioinformat, SE-17177 Stockholm, Sweden
[2] Univ Copenhagen, Bioinformat Ctr, DK-2100 Copenhagen, Denmark
Keywords
transmembrane protein; signal peptide; topology prediction; hidden Markov model; machine learning
DOI
10.1016/j.jmb.2004.03.016
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
An inherent problem in transmembrane protein topology prediction and signal peptide prediction is the high similarity between the hydrophobic regions of a transmembrane helix and that of a signal peptide, leading to cross-reaction between the two types of predictions. To improve predictions further, it is therefore important to make a predictor that aims to discriminate between the two classes. In addition, topology information can be gained when successfully predicting a signal peptide leading a transmembrane protein since it dictates that the N terminus of the mature protein must be on the non-cytoplasmic side of the membrane. Here, we present Phobius, a combined transmembrane protein topology and signal peptide predictor. The predictor is based on a hidden Markov model (HMM) that models the different sequence regions of a signal peptide and the different regions of a transmembrane protein in a series of interconnected states. Training was done on a newly assembled and curated dataset. Compared to TMHMM and SignalP, errors coming from cross-prediction between transmembrane segments and signal peptides were reduced substantially by Phobius. False classifications of signal peptides were reduced from 26.1% to 3.9% and false classifications of transmembrane helices were reduced from 19.0% to 7.7%. Phobius was applied to the proteomes of Homo sapiens and Escherichia coli. Here we also noted a drastic reduction of false classifications compared to TMHMM/SignalP, suggesting that Phobius is well suited for whole-genome annotation of signal peptides and transmembrane regions. The method is available at http://phobius.cgb.ki.se/ as well as at http://phobius.binf.ku.dk/. © 2004 Elsevier Ltd. All rights reserved.
Pages: 1027-1036
Page count: 10