NNAlign: A Web-Based Prediction Method Allowing Non-Expert End-User Discovery of Sequence Motifs in Quantitative Peptide Data

Cited by: 44
Authors
Andreatta, Massimo [1]
Schafer-Nielsen, Claus [2]
Lund, Ole [1]
Buus, Soren [3]
Nielsen, Morten [1]
Affiliations
[1] Tech Univ Denmark, Ctr Biol Sequence Anal, DK-2800 Lyngby, Denmark
[2] Schafer N, Copenhagen, Denmark
[3] Univ Copenhagen, Fac Hlth Sci, Expt Immunol Lab, Copenhagen, Denmark
Source
PLOS ONE | 2011 / Vol. 6 / Issue 11
Keywords
BINDING PREDICTIONS; SIGNAL PEPTIDES; CLASS-I; HLA-DR; MICROARRAYS; EPITOPE; IDENTIFICATION; GENERATION; AFFINITY; DNA;
DOI
10.1371/journal.pone.0026781
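Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Subject classification codes
07 ; 0710 ; 09 ;
Abstract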
Recent advances in high-throughput technologies have made it possible to generate both gene and protein sequence data at an unprecedented rate and scale thereby enabling entirely new "omics"-based approaches towards the analysis of complex biological processes. However, the amount and complexity of data that even a single experiment can produce seriously challenges researchers with limited bioinformatics expertise, who need to handle, analyze and interpret the data before it can be understood in a biological context. Thus, there is an unmet need for tools allowing non-bioinformatics users to interpret large data sets. We have recently developed a method, NNAlign, which is generally applicable to any biological problem where quantitative peptide data is available. This method efficiently identifies underlying sequence patterns by simultaneously aligning peptide sequences and identifying motifs associated with quantitative readouts. Here, we provide a web-based implementation of NNAlign allowing non-expert end-users to submit their data (optionally adjusting method parameters), and in return receive a trained method (including a visual representation of the identified motif) that subsequently can be used as prediction method and applied to unknown proteins/peptides. We have successfully applied this method to several different data sets including peptide microarray-derived sets containing more than 100,000 data points. NNAlign is available online at http://www.cbs.dtu.dk/services/NNAlign.
Pages: 11