QuasiMotiFinder: protein annotation by searching for evolutionarily conserved motif-like patterns

Cited by: 31
Authors
Gutman, R [1]
Berezin, C [1]
Wollman, R [1]
Rosenberg, Y [1]
Ben-Tal, N [1]
Affiliations
[1] Tel Aviv Univ, Dept Biochem, George S Wise Fac Life Sci, IL-69978 Ramat Aviv, Israel
DOI
10.1093/nar/gki496
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
Sequence signature databases such as PROSITE, which include amino acid segments that are indicative of a protein's function, are useful for protein annotation. Lamentably, the annotation is not always accurate. A signature may be falsely detected in a protein that does not carry out the associated function (false positive prediction, FP) or may be overlooked in a protein that does carry out the function (false negative prediction, FN). A new approach has emerged in which a signature is replaced with a sequence profile, calculated based on multiple sequence alignment (MSA) of homologous proteins that share the same function. This approach, which is superior to the simple pattern search, essentially searches with the sequence of the query protein against an MSA library. We suggest here an alternative approach, implemented in the QuasiMotiFinder web server (http://quasimotifinder.tau.ac.il/), which is based on a search with an MSA of homologous query proteins against the original PROSITE signatures. The explicit use of the average evolutionary conservation of the signature in the query proteins significantly reduces the rate of FP prediction compared with the simple pattern search. QuasiMotiFinder also has a reduced rate of FN prediction compared with simple pattern searches, since the traditional search for precise signatures has been replaced by a permissive search for signature-like patterns that are physicochemically similar to known signatures. Overall, QuasiMotiFinder and the profile search are comparable to each other in terms of performance. They are also complementary to each other in that signatures that are falsely detected in (or overlooked by) one may be correctly detected by the other.
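As a rough illustration of two ideas in the abstract, the Python sketch below runs a simple PROSITE-style pattern search and then keeps only hits whose average per-position conservation is high. It is not the QuasiMotiFinder implementation: the function names, the 0-to-1 conservation scale, and the `min_mean` threshold are all hypothetical, and the permissive "signature-like" matching based on physicochemical similarity is not attempted. The pattern translation covers only the common PROSITE syntax (`x`, `[...]`, `{...}`, repeat counts, and `<`/`>` anchors outside brackets).

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE-style pattern (e.g. 'C-x(2,4)-C') into a regex string."""
    parts = []
    for element in pattern.rstrip(".").split("-"):
        anchor_start = element.startswith("<")  # '<' anchors to the N-terminus
        anchor_end = element.endswith(">")      # '>' anchors to the C-terminus
        element = element.strip("<>")
        # Separate the residue specification from an optional repeat count.
        m = re.fullmatch(r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element)
        core, lo, hi = m.group(1), m.group(2), m.group(3)
        if core == "x":                 # any residue
            regex = "."
        elif core.startswith("{"):      # excluded residues
            regex = "[^" + core[1:-1] + "]"
        else:                           # single residue or [allowed set]
            regex = core
        if lo:
            regex += "{" + lo + ("," + hi if hi else "") + "}"
        parts.append(("^" if anchor_start else "") + regex + ("$" if anchor_end else ""))
    return "".join(parts)


def conserved_hits(sequence, conservation, pattern, min_mean=0.7):
    """Return (start, matched_segment, mean_conservation) for sufficiently conserved hits.

    `conservation` is an assumed list of per-position scores in [0, 1],
    e.g. derived from an MSA of homologs; `min_mean` is an arbitrary cutoff.
    """
    # A lookahead lets overlapping occurrences of the pattern be reported.
    regex = re.compile("(?=(" + prosite_to_regex(pattern) + "))")
    hits = []
    for m in regex.finditer(sequence):
        start, end = m.start(1), m.end(1)
        if end == start:
            continue  # skip empty matches
        mean = sum(conservation[start:end]) / (end - start)
        if mean >= min_mean:
            hits.append((start, sequence[start:end], mean))
    return hits


if __name__ == "__main__":
    seq = "MKCAACLLLW"
    scores = [0.2, 0.3, 0.9, 0.8, 0.7, 1.0, 0.4, 0.3, 0.2, 0.1]
    print(conserved_hits(seq, scores, "C-x(2)-C"))
```

Filtering on the mean conservation of the matched segment mirrors, in a very simplified way, the abstract's point that a pattern hit in a poorly conserved region is more likely to be a false positive.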
Pages: W255-W261
Page count: 7