MSNovo: A dynamic programming algorithm for de novo peptide sequencing via tandem mass spectrometry

Cited by: 70
Authors
Mo, Lijuan [1]
Dutta, Debojyoti [1]
Wan, Yunhu [1]
Chen, Ting [1]
Affiliations
[1] Univ So Calif, Dept Biol, Dept Math, Los Angeles, CA 90089 USA
DOI
10.1021/ac070039n
Chinese Library Classification
O65 [Analytical Chemistry];
Discipline classification codes
070302 ; 081704 ;
Abstract
Tandem mass spectrometry (MS/MS) has become the experimental method of choice for high-throughput proteomics-based biological discovery. The two primary ways of analyzing MS/MS data are database search and de novo sequencing. In this paper, we present a new approach to peptide de novo sequencing, called MSNovo, which has the following advanced features. (1) It works on data generated from both LCQ and LTQ mass spectrometers and interprets singly, doubly, and triply charged ions. (2) It integrates a new probabilistic scoring function with a mass array-based dynamic programming algorithm. The simplicity of the scoring function, with only 6-10 parameters to be trained, avoids the problem of overfitting and allows MSNovo to be adopted for other machines and data sets easily. The mass array data structure explicitly encodes all possible peptides and allows the dynamic programming algorithm to find the best peptide. (3) Compared to existing programs, MSNovo predicts peptides as well as sequence tags with a higher accuracy, which is important for those applications that search protein databases using the de novo sequencing results. More specifically, we show that MSNovo outperforms other programs on various ESI ion trap data. We also show that for high-resolution data the performance of MSNovo improves significantly. Supporting Information, executable files and data sets can be found at http://msms.usc.edu/supplementary/msnovo.
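The abstract's idea of a mass array searched by dynamic programming can be illustrated with a minimal sketch. This is not the MSNovo implementation: the five-residue alphabet, the integer (nominal) masses, the per-mass score array `node_score`, and the function name `best_peptide` are all assumptions made for illustration, and the additive score stands in for the paper's probabilistic scoring function, which is not described in this record.

```python
from typing import Dict, List, Optional, Tuple

# Assumed toy alphabet with nominal integer residue masses (illustrative only).
RESIDUE_MASS: Dict[str, int] = {"G": 57, "A": 71, "S": 87, "P": 97, "V": 99}


def best_peptide(total_mass: int, node_score: List[float]) -> Optional[Tuple[float, str]]:
    """Return (score, sequence) of the highest-scoring residue chain whose
    masses sum to total_mass, or None if no chain reaches that mass.

    node_score[m] is a hypothetical score for observing a prefix of mass m;
    it must have length total_mass + 1.
    """
    neg_inf = float("-inf")
    # best[m]: best score of any residue chain with prefix mass exactly m.
    best = [neg_inf] * (total_mass + 1)
    # back[m]: last residue on the best chain ending at mass m.
    back: List[Optional[str]] = [None] * (total_mass + 1)
    best[0] = 0.0

    for m in range(1, total_mass + 1):
        for residue, weight in RESIDUE_MASS.items():
            prev = m - weight
            if prev >= 0 and best[prev] > neg_inf:
                candidate = best[prev] + node_score[m]
                if candidate > best[m]:
                    best[m] = candidate
                    back[m] = residue

    if best[total_mass] == neg_inf:
        return None

    # Trace back from the full mass to mass 0 to recover the sequence.
    residues: List[str] = []
    m = total_mass
    while m > 0:
        residue = back[m]
        residues.append(residue)
        m -= RESIDUE_MASS[residue]
    return best[total_mass], "".join(reversed(residues))


# Toy spectrum: evidence at prefix masses 57 (G) and 128 (G + A).
scores = [0.0] * 216
scores[57] = 1.0
scores[128] = 1.0
result = best_peptide(215, scores)
```

Because every index of the array corresponds to one candidate prefix mass, every residue chain of the target mass is implicitly represented, and the table is filled in time proportional to the mass range times the alphabet size. In the toy call above, the chain with prefix masses 57 and 128 collects both scored positions.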
Pages: 4870-4878
Number of pages: 9