Adaptive Discriminant Function Analysis and Reranking of MS/MS Database Search Results for Improved Peptide Identification in Shotgun Proteomics

Cited by: 35
Authors
Ding, Ying [1,2]
Choi, Hyungwon [1,2]
Nesvizhskii, Alexey I. [1,3]
Affiliations
[1] Univ Michigan, Dept Pathol, Ann Arbor, MI 48109 USA
[2] Univ Michigan, Dept Biostat, Ann Arbor, MI 48109 USA
[3] Univ Michigan, Ctr Computat Biol & Med, Ann Arbor, MI 48109 USA
Keywords
tandem mass spectrometry; database searching; peptide identification; statistical modeling; adaptive discriminant analysis; mass accuracy; decoy sequences
DOI
10.1021/pr800484x
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Robust statistical validation of peptide identifications obtained by tandem mass spectrometry and sequence database searching is an important task in shotgun proteomics. PeptideProphet is a commonly used computational tool that computes confidence measures for peptide identifications. In this paper, we investigate several limitations of the PeptideProphet modeling approach, including the use of fixed coefficients in computing the discriminant search score and selection of the top scoring peptide assignment per spectrum only. To address these limitations, we describe an adaptive method in which a new discriminant function is learned from the data in an iterative fashion. We extend the modeling framework to go beyond the top scoring peptide assignment per spectrum. We also investigate the effect of clustering the spectra according to their spectrum quality score followed by cluster-specific mixture modeling. The analysis is carried out using data acquired from a mixture of purified proteins on four different types of mass spectrometers, as well as using a complex human serum data set. A special emphasis is placed on the analysis of data generated on high mass accuracy instruments.
Pages: 4878-4889
Page count: 12