De novo peptide identification via tandem mass spectrometry and integer linear optimization

Cited by: 36
Authors
DiMaggio, Peter A., Jr. [1]
Floudas, Christodoulos A. [1]
Affiliations
[1] Princeton Univ, Dept Chem Engn, Princeton, NJ 08544 USA
DOI
10.1021/ac0618425
Chinese Library Classification (CLC) number
O65 [Analytical chemistry]
Subject classification codes
070302; 081704
Abstract
A novel methodology for the automated de novo identification of peptides via integer linear optimization (also referred to as integer linear programming or ILP) and tandem mass spectrometry is presented in this article. The various features of the mathematical model are presented and examples are used to illustrate the key concepts of the proposed approach. A variety of challenging peptide identification problems, accompanied by a comparative study with five state-of-the-art methods, are examined to illustrate the proposed method's ability to address (a) residue-dependent fragmentation properties that result in missing ion peaks and (b) the variability of resolution in different mass analyzers. A preprocessing algorithm is utilized to identify important m/z values in the tandem mass spectrum. Missing peaks, due to residue-dependent fragmentation characteristics, are dealt with using a two-stage algorithmic framework. A cross-correlation approach is used to resolve missing amino acid assignments and to select the most probable peptide by comparing the theoretical spectra of the candidate sequences that were generated from the ILP sequencing stages with the experimental tandem mass spectrum. The novel, proposed de novo method, denoted as PILOT, is compared to existing popular methods such as Lutefisk, PEAKS, PepNovo, EigenMS, and NovoHMM for a set of spectra resulting from QTOF and ion trap instruments.
Pages: 1433-1446
Page count: 14