NLStradamus: a simple Hidden Markov Model for nuclear localization signal prediction

Cited by: 465
Authors
Nguyen Ba, Alex N. [1, 2]
Pogoutse, Anastassia [1]
Provart, Nicholas [1, 2]
Moses, Alan M. [1, 2]
Affiliations
[1] Univ Toronto, Dept Cell & Syst Biol, Toronto, ON, Canada
[2] Univ Toronto, Ctr Anal Genome Evolut & Funct, Toronto, ON, Canada
Source
BMC Bioinformatics | 2009 / Vol. 10
Keywords
IMPORTIN-ALPHA; KARYOPHERIN ALPHA; PORE COMPLEX; TRANSPORT; RECOGNITION; DISCOVERY; PROTEINS; GENE; BETA
DOI
10.1186/1471-2105-10-202
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
Background: Nuclear localization signals (NLSs) are stretches of residues within a protein that are important for the regulated nuclear import of the protein. Of the many import pathways that exist in yeast, the best characterized is termed the 'classical' NLS pathway. The classical NLS contains specific patterns of basic residues and computational methods have been designed to predict the location of these motifs on proteins. The consensus sequences, or patterns, for the other import pathways are less well-understood.

Results: In this paper, we present an analysis of characterized NLSs in yeast, and find, despite the large number of nuclear import pathways, that NLSs seem to show similar patterns of amino acid residues. We test current prediction methods and observe a low true positive rate. We therefore suggest an approach using hidden Markov models (HMMs) to predict novel NLSs in proteins. We show that our method is able to consistently find 37% of the NLSs with a low false positive rate and that our method retains its true positive rate outside of the yeast data set used for the training parameters.

Conclusion: Our implementation of this model, NLStradamus, is made available at: http://www.moseslab.csb.utoronto.ca/NLStradamus/
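
The abstract describes using an HMM to locate NLSs along a protein sequence. The following is a minimal sketch of that general idea, not the NLStradamus model itself: a hypothetical two-state HMM ("bg" for background, "nls" for signal) decoded with the Viterbi algorithm. The state names, the reduction of residues to two classes (basic K/R versus everything else), and every probability below are illustrative assumptions, not the trained parameters from the paper.

```python
import math

# Hypothetical two-state HMM. All numbers are illustrative assumptions,
# not parameters taken from the NLStradamus paper.
STATES = ("bg", "nls")
START = {"bg": 0.99, "nls": 0.01}
TRANS = {
    "bg": {"bg": 0.99, "nls": 0.01},
    "nls": {"bg": 0.10, "nls": 0.90},
}
BASIC = set("KR")
# Assumed probability of observing a basic residue (K or R) in each state.
P_BASIC = {"bg": 0.10, "nls": 0.60}


def emit_logp(state, residue):
    """Log-probability of the residue's class (basic / other) in a state."""
    p = P_BASIC[state]
    return math.log(p if residue in BASIC else 1.0 - p)


def viterbi(seq):
    """Return the most likely state label for each residue of seq."""
    if not seq:
        return []
    # Best log-score of any path ending in each state at the first residue.
    score = {s: math.log(START[s]) + emit_logp(s, seq[0]) for s in STATES}
    back = []  # back[i][s] = best predecessor of state s at position i + 1
    for residue in seq[1:]:
        new_score, ptr = {}, {}
        for s in STATES:
            best_prev = max(
                STATES, key=lambda prev: score[prev] + math.log(TRANS[prev][s])
            )
            new_score[s] = (
                score[best_prev]
                + math.log(TRANS[best_prev][s])
                + emit_logp(s, residue)
            )
            ptr[s] = best_prev
        score = new_score
        back.append(ptr)
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path


def nls_segments(seq):
    """Return (start, end) index pairs, end exclusive, labelled 'nls'."""
    segments, start = [], None
    for i, label in enumerate(viterbi(seq) + ["bg"]):
        if label == "nls" and start is None:
            start = i
        elif label != "nls" and start is not None:
            segments.append((start, i))
            start = None
    return segments
```

With these assumed parameters, calling `nls_segments` on a sequence returns the index ranges that the decoder labels as signal. A dense run of basic residues is needed to outweigh the cost of entering and leaving the "nls" state, so an isolated K or R stays in the background state.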
Pages: 11