Simultaneous alignment and clustering of peptide data using a Gibbs sampling approach

Cited by: 94
Authors
Andreatta, Massimo [1]
Lund, Ole [1]
Nielsen, Morten [1, 2]
Affiliations
[1] Tech Univ Denmark, Ctr Biol Sequence Anal, DK-2800 Lyngby, Denmark
[2] Univ San Martin, Inst Invest Biotecnol, RA-1650 Buenos Aires, DF, Argentina
Keywords
MHC CLASS-I; SH3; DOMAINS; MULTIPLE-SPECIFICITY; RECOGNITION DOMAINS; BINDING; PREDICTION; STABILITY; MOLECULES; MOTIFS; DIVERSITY;
DOI
10.1093/bioinformatics/bts621
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: Proteins recognizing short peptide fragments play a central role in cellular signaling. As a result of high-throughput technologies, peptide-binding protein specificities can be studied using large peptide libraries at dramatically lower cost and time. Interpretation of such large peptide datasets, however, is a complex task, especially when the data contain multiple receptor binding motifs, and/or the motifs are found at different locations within distinct peptides.
Results: The algorithm presented in this article, based on Gibbs sampling, identifies multiple specificities in peptide data by performing two essential tasks simultaneously: alignment and clustering of peptide data. We apply the method to de-convolute binding motifs in a panel of peptide datasets with different degrees of complexity, spanning from the simplest case of pre-aligned fixed-length peptides to cases of unaligned peptide datasets of variable length. Example applications described in this article include mixtures of binders to different MHC class I and class II alleles, distinct classes of ligands for SH3 domains and sub-specificities of the HLA-A*02:01 molecule.
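Illustrative sketch (not part of the record and not the authors' implementation): the simplest case named in the abstract, clustering pre-aligned fixed-length peptides, can be approximated with a Gibbs-style sampler. The function names `cluster_log_score` and `gibbs_cluster`, the uniform amino acid background, the pseudocount, and the temperature parameter are all assumptions made for this example. It omits the alignment step, so it does not handle variable-length or unaligned peptides.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def cluster_log_score(peptide, members, pseudocount=1.0):
    """Log-odds of a peptide against a cluster's per-position residue frequencies.

    Assumes a uniform background over the 20 amino acids (a simplification).
    """
    background = 1.0 / len(AMINO_ACIDS)
    score = 0.0
    for pos, residue in enumerate(peptide):
        count = sum(1 for m in members if m[pos] == residue)
        # Pseudocounts keep the frequency strictly positive, even for empty clusters.
        freq = (count + pseudocount * background) / (len(members) + pseudocount)
        score += math.log(freq / background)
    return score


def gibbs_cluster(peptides, n_clusters=2, n_sweeps=50, temperature=1.0, seed=0):
    """Resample each peptide's cluster label given all other labels.

    Returns one cluster index per peptide. All peptides must have equal length.
    """
    rng = random.Random(seed)
    labels = [rng.randrange(n_clusters) for _ in peptides]
    cluster_ids = list(range(n_clusters))
    for _ in range(n_sweeps):
        for i, pep in enumerate(peptides):
            scores = []
            for k in cluster_ids:
                # Leave the current peptide out of the cluster it is scored against.
                members = [p for j, p in enumerate(peptides)
                           if labels[j] == k and j != i]
                scores.append(cluster_log_score(pep, members) / temperature)
            top = max(scores)
            # Subtract the maximum before exponentiating for numerical stability.
            weights = [math.exp(s - top) for s in scores]
            labels[i] = rng.choices(cluster_ids, weights=weights)[0]
    return labels


if __name__ == "__main__":
    # Hypothetical 9-mer peptides, made up for this example.
    peptides = ["ALAKAAAAV", "ALAKAAAAL", "GLAKAAAAV",
                "KRWIILGLN", "KRWIMLGLN", "ARWIILGLK"]
    print(gibbs_cluster(peptides, n_clusters=2, n_sweeps=20, seed=1))
```

Because labels are sampled rather than chosen greedily, different seeds can give different partitions; lowering `temperature` makes the sampler favor the best-scoring cluster more strongly.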
Pages: 8-14
Number of pages: 7