Interactome-wide prediction of short, disordered protein interaction motifs in humans

Cited by: 22
Authors
Edwards, Richard J. [1]
Davey, Norman E. [2]
O'Brien, Kevin [3, 4]
Shields, Denis C. [3, 4]
Affiliations
[1] Univ Southampton, Ctr Biol Sci, Southampton SO9 5NH, Hants, England
[2] European Mol Biol Lab, Struct & Computat Biol Unit, D-69117 Heidelberg, Germany
[3] Univ Coll Dublin, UCD Complex & Adapt Syst Lab, Dublin, Ireland
[4] Univ Coll Dublin, UCD Conway Inst Biomol & Biomed Sci, Dublin, Ireland
Funding
Science Foundation Ireland; Biotechnology and Biological Sciences Research Council (UK)
Keywords
SHORT LINEAR MOTIFS; MOLECULAR RECOGNITION FEATURES; INTERACTION DATABASE; INTERACTION NETWORKS; SEQUENCE MOTIFS; WEB SERVER; DISCOVERY; RESOURCE; CONSERVATION; UPDATE;
DOI
10.1039/c1mb05212h
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010 ; 081704 ;
Abstract
Many of the specific functions of intrinsically disordered protein segments are mediated by Short Linear Motifs (SLiMs) interacting with other proteins. Well-known examples include SLiMs that interact with 14-3-3, PDZ, SH2, SH3, and WW domains, but the true extent and diversity of SLiM-mediated interactions are largely unknown. Here, we attempt to expand our knowledge of human SLiMs by applying in silico SLiM prediction to the human interactome. Combining data from seven different interaction databases, we analysed approximately 6000 protein-centred and 1600 domain-centred human interaction datasets, each consisting of three or more unrelated proteins that interact with a common partner. Results were placed in context through comparison to randomised datasets of similar size and composition. The search returned thousands of evolutionarily conserved, intrinsically disordered occurrences of hundreds of significantly enriched recurring motifs, many of which have not previously been identified (http://bioware.soton.ac.uk/slimdb/). In addition to true positive results for at least 25 different known SLiMs, a striking number of "off-target" proteins/domains also returned significantly enriched known motifs. Often, this was due to the non-independence of the datasets, with many proteins sharing interaction partners or contributing interactions to multiple domain datasets. The majority of these motif classes, however, were also found to be significantly enriched in one or more randomised datasets. This highlights the need for care when interpreting motif predictions of this nature, but also raises the possibility that SLiM occurrences may be successfully identified independently of interaction data. Although the results were not as compositionally biased as those of previous studies, patterns matching known SLiMs tended to cluster into a few large groups of similar sequence, while novel predictions tended to be more distinctive and less abundant. Whether this is due to ascertainment bias or a true functional composition bias of SLiMs is not clear and warrants further investigation.
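
As an illustration only, the dataset construction described in the abstract (grouping proteins that share a common interaction partner, keeping groups of three or more, and comparing against randomised datasets) might be sketched as follows. This is not the authors' pipeline: the function names, the input format (a list of protein ID pairs), and the randomisation scheme (uniform sampling that matches dataset size only, not composition) are all assumptions, and the filtering of related proteins that the abstract mentions is not attempted here.

```python
import random
from collections import defaultdict


def build_hub_datasets(interactions, min_size=3):
    """Group interactors by shared partner (hypothetical helper).

    `interactions` is an iterable of (protein_a, protein_b) ID pairs.
    Returns {partner: sorted list of its interactors} for every partner
    with at least `min_size` distinct interactors.
    """
    partners = defaultdict(set)
    for a, b in interactions:
        if a == b:
            continue  # ignore self-interactions
        partners[a].add(b)
        partners[b].add(a)
    return {
        hub: sorted(members)
        for hub, members in partners.items()
        if len(members) >= min_size
    }


def randomise_datasets(datasets, proteome, seed=0):
    """Draw a random dataset of the same size for each real dataset.

    Assumption: proteins are sampled uniformly without replacement from
    `proteome`; matching sequence composition is not modelled.
    """
    rng = random.Random(seed)
    pool = sorted(proteome)
    return {
        hub: sorted(rng.sample(pool, len(members)))
        for hub, members in datasets.items()
    }


if __name__ == "__main__":
    pairs = [("P1", "A"), ("P1", "B"), ("P1", "C"),
             ("P2", "A"), ("P2", "B"), ("A", "A")]
    real = build_hub_datasets(pairs)
    print(real)  # only P1 has three or more distinct interactors
    print(randomise_datasets(real, {"A", "B", "C", "D", "E", "F"}))
```

Motif discovery would then be run on both the real and the randomised datasets, so that motifs enriched in random groupings can be flagged as unreliable, which is the comparison the abstract describes.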
Pages: 282-295
Number of pages: 14