A comprehensive assessment of N-terminal signal peptides prediction methods

Cited by: 44
Authors
Choo, Khar Heng [1, 2]
Tan, Tin Wee [2]
Ranganathan, Shoba [2, 3, 4]
Affiliations
[1] Inst Infocomm Res, Singapore 138632, Singapore
[2] Natl Univ Singapore, Yong Loo Lin Sch Med, Dept Biochem, Singapore 117597, Singapore
[3] Macquarie Univ, Dept Chem & Biomol Sci, Sydney, NSW 2109, Australia
[4] Macquarie Univ, ARC Ctr Excellence Bioinformat, Sydney, NSW 2109, Australia
Source
BMC BIOINFORMATICS | 2009 / Vol. 10
Keywords
SECRETED PROTEIN PREDICTION; SUBCELLULAR-LOCALIZATION; CLEAVAGE SITES; SEQUENCE; DATABASE; PROGRAM; IDENTIFICATION; SEARCH; TOOL;
DOI
10.1186/1471-2105-10-S15-S2
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Subject classification code
071010 ; 081704 ;
Abstract
Background: Amino-terminal signal peptides (SPs) are short regions that guide the targeting of secretory proteins to the correct subcellular compartments in the cell. They are cleaved off once the passenger protein reaches its destination. The explosive growth in sequencing technologies has led to the deposition of vast numbers of protein sequences, necessitating rapid functional annotation techniques, with subcellular localization being a key feature. Of the myriad software tools developed to automate the task of assigning the SP cleavage site of these new sequences, we review here the performance and reliability of the commonly used SP prediction tools.
Results: The available signal peptide data have been manually curated and organized into three datasets representing eukaryotes, Gram-positive bacteria and Gram-negative bacteria. These datasets are used to evaluate thirteen publicly available prediction tools. SignalP (both the HMM and ANN versions) maintains consistency and achieves the best overall accuracy in all three benchmarking experiments, ranging from 0.872 to 0.914, although other prediction tools are narrowing the performance gap.
Conclusion: The majority of the tools evaluated in this study encounter no difficulty in discriminating between secretory and non-secretory proteins. The challenge clearly remains in pinpointing the correct SP cleavage site. The composite scoring schemes employed by SignalP may help to explain its accuracy: the prediction task is divided into a number of separate steps, allowing each score to tackle a particular aspect of the prediction.
Pages: 12