pNovo: De novo Peptide Sequencing and Identification Using HCD Spectra

Citations: 138
Authors
Chi, Hao [1 ,2 ]
Sun, Rui-Xiang [1 ]
Yang, Bing [3 ]
Song, Chun-Qing [3 ]
Wang, Le-Heng [1 ]
Liu, Chao [1 ,2 ]
Fu, Yan [1 ]
Yuan, Zuo-Fei [1 ,2 ]
Wang, Hai-Peng [1 ,2 ]
He, Si-Min [1 ]
Dong, Meng-Qiu [3 ]
Affiliations
[1] Chinese Acad Sci, Inst Comp Technol, Key Lab Intelligent Informat Proc, Beijing 100190, Peoples R China
[2] Chinese Acad Sci, Grad Univ, Beijing 100049, Peoples R China
[3] Natl Inst Biol Sci, Beijing 102206, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
tandem mass spectrometry; HCD; de novo sequencing; pNovo; TANDEM MASS-SPECTROMETRY; COLLISION-INDUCED DISSOCIATION; PROTEIN IDENTIFICATION; AUTOMATED INTERPRETATION; SOFTWARE AID; SEARCH; DATABASE; ALGORITHM; MODEL; PERFORMANCE;
DOI
10.1021/pr100182k
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
De novo peptide sequencing has improved remarkably in the past decade as a result of better instruments and computational algorithms. However, de novo sequencing can correctly interpret only ~30% of high- and medium-quality spectra generated by collision-induced dissociation (CID), which is much less than database search. This is mainly due to incomplete fragmentation and overlap of different ion series in CID spectra. In this study, we show that higher-energy collisional dissociation (HCD) is of great help to de novo sequencing because it produces high mass accuracy tandem mass spectrometry (MS/MS) spectra without the low-mass cutoff associated with CID in ion trap instruments. Besides, abundant internal and immonium ions in the HCD spectra can help differentiate similar peptide sequences. Taking advantage of these characteristics, we developed an algorithm called pNovo for efficient de novo sequencing of peptides from HCD spectra. pNovo gave correct identifications to 80% or more of the HCD spectra identified by database search. The number of correct full-length peptides sequenced by pNovo is comparable with that obtained by database search. A distinct advantage of de novo sequencing is that deamidated peptides and peptides with amino acid mutations can be identified efficiently without extra cost in computation. In summary, implementation of the HCD characteristics makes pNovo an excellent tool for de novo peptide sequencing from HCD spectra.
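The abstract's point about mass accuracy can be illustrated with a small sketch. This is not pNovo's algorithm, only the generic idea underlying de novo sequencing: the gap between consecutive peaks of one fragment-ion series is matched against amino acid residue masses. The function names, the example peak list, and both tolerance values are hypothetical choices for illustration; the residue masses are standard monoisotopic values.

```python
# Standard monoisotopic residue masses in Da (I and L are isobaric).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}

# Deamidation shifts a residue by about +0.98402 Da, so deamidated N
# has essentially the same residue mass as D.
DEAMIDATION = 0.98402


def candidate_residues(delta, tol):
    """Return residues whose mass lies within tol (Da) of a peak-to-peak gap."""
    return sorted(aa for aa, m in RESIDUE_MASS.items() if abs(m - delta) <= tol)


def read_ladder(peaks, tol):
    """Map each gap between consecutive same-series peaks to residue candidates."""
    return [candidate_residues(b - a, tol) for a, b in zip(peaks, peaks[1:])]


# Hypothetical ion ladder: an arbitrary starting m/z followed by K, G, Q.
peaks = [200.0]
for aa in "KGQ":
    peaks.append(peaks[-1] + RESIDUE_MASS[aa])

# Assumed tolerances: 0.5 Da stands in for low-resolution data,
# 0.01 Da for high-accuracy data.
low_res = read_ladder(peaks, tol=0.5)
high_res = read_ladder(peaks, tol=0.01)
print(low_res)
print(high_res)
```

With the wide tolerance, K and Q (which differ by about 0.036 Da) cannot be told apart, whereas the narrow tolerance resolves them. I and L remain indistinguishable by mass alone at any tolerance.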
Pages: 2713-2724
Page count: 12