A tree-based conservation scoring method for short linear motifs in multiple alignments of protein sequences

Cited by: 36
Authors
Chica, Claudia [1]
Labarga, Alberto [2]
Gould, Cathryn M. [1]
Lopez, Rodrigo [2]
Gibson, Toby J. [1]
Affiliations
[1] EMBL, Structural and Computational Biology Unit, D-69117 Heidelberg, Germany
[2] Wellcome Trust Genome Campus, EBI European Bioinformatics Institute, Cambridge CB10 1SD, England
DOI
10.1186/1471-2105-9-229
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Background: The structure of many eukaryotic cell regulatory proteins is highly modular. They are assembled from globular domains, segments of natively disordered polypeptides and short linear motifs. The latter are involved in protein interactions and the formation of regulatory complexes. The function of such proteins, which may be difficult to define, is the aggregate of the subfunctions of the modules. It is therefore desirable to efficiently predict linear motifs with some degree of accuracy, yet sequence database searches return results that are not significant.
Results: We have developed a method for scoring the conservation of linear motif instances. It requires only primary sequence-derived information (e.g. multiple alignment and sequence tree) and takes into account the degenerate nature of linear motif patterns. In our benchmark, the method accurately scores 86% of the known positive instances, while distinguishing them from random matches in 78% of the cases. The conservation score is implemented as a real-time application designed to be integrated into other tools. It is currently accessible via a Web Service or through a graphical interface.
Conclusion: The conservation score improves the prediction of linear motifs by discarding those matches that are unlikely to be functional because they have not been conserved during the evolution of the protein sequences. It is especially useful for instances in non-structured regions of the proteins, where a domain-masking filtering strategy is not applicable.
Pages: 12