New data base-independent, sequence tag-based scoring of peptide MS/MS data validates Mowse scores, recovers below threshold data, singles out modified peptides, and assesses the quality of MS/MS techniques

Cited by: 76
Authors
Savitski, MM [1]
Nielsen, ML [1]
Zubarev, RA [1]
Affiliations
[1] Uppsala Univ, Lab Biol & Med Mass Spectrometry, S-75123 Uppsala, Sweden
DOI
10.1074/mcp.T500009-MCP200
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
The Mascot score (M-score) is one of the conventional validity measures in data base identification of peptides and proteins by MS/MS data. Although tremendously useful, M-score has a number of limitations. For the same MS/MS data, M-score may change if the protein data base is expanded. A low M-score does not necessarily mean a poor match; it may instead reflect poor MS/MS quality. In addition, M-score does not fully utilize the advantage of combined use of the complementary fragmentation techniques collisionally activated dissociation (CAD) and electron capture dissociation (ECD). To address these issues, a new data base-independent scoring method (S-score) was designed that is based on the maximum length of the peptide sequence tag provided by the combined CAD and ECD data. The quality of MS/MS spectra assessed by S-score allows poor data (39% of all MS/MS spectra) to be filtered out before the data base search, speeding up the data analysis and eliminating a major source of false positive identifications. Spectra with below-threshold M-scores (poor matches) but high S-scores are validated. Spectra with zero M-score (no data base match) but high S-score are classified as belonging to modified sequences. As an extension of S-score, an extremely reliable sequence tag was developed based on complementary fragments simultaneously appearing in CAD and ECD spectra. Comparison of this tag with the data base-derived sequence gives the most reliable peptide identification validation to date. The combined use of M- and S-scoring provides positive sequence identification from >25% of all MS/MS data, a 40% improvement over traditional M-scoring performed on the same Fourier transform MS instrumentation. The number of proteins reliably identified from Escherichia coli cell lysate thereby increased by 29% compared with the traditional M-score approach. Finally, S-scoring provides a quantitative measure of the quality of fragmentation techniques, such as the minimum abundance of the precursor ion, the MS/MS of which gives the threshold S-score value of 2.
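The decision rules stated in the abstract can be sketched in a few lines of Python. This is only an illustration of the triage logic, not the authors' implementation: the function name `triage`, the `Verdict` labels, and the `m_threshold` parameter are hypothetical, and the use of `>=` at the threshold is an assumption. The default S-score threshold of 2 is the value the abstract mentions; how the S-score itself is computed from the CAD and ECD spectra is not covered here.

```python
from enum import Enum


class Verdict(Enum):
    # Labels are illustrative; they paraphrase the outcomes named in the abstract.
    POOR_SPECTRUM = "filtered out before the data base search"
    IDENTIFIED = "match accepted on M-score alone"
    VALIDATED = "below-threshold match validated by a high S-score"
    MODIFIED = "no data base match; candidate modified sequence"


def triage(s_score, m_score, m_threshold, s_threshold=2):
    """Classify one spectrum from its S-score and M-score.

    s_threshold defaults to 2, the threshold value mentioned in the abstract.
    m_threshold is the Mascot significance threshold for the search and must
    be supplied by the caller (assumption: it is known per search).
    """
    # Poor-quality spectra are discarded regardless of any M-score.
    if s_score < s_threshold:
        return Verdict.POOR_SPECTRUM
    # Good spectrum with an M-score at or above threshold: ordinary identification.
    if m_score >= m_threshold:
        return Verdict.IDENTIFIED
    # Good spectrum, nonzero but below-threshold M-score: validated match.
    if m_score > 0:
        return Verdict.VALIDATED
    # Good spectrum, zero M-score: likely a modified sequence.
    return Verdict.MODIFIED
```

For example, with a hypothetical M-score threshold of 30, `triage(5, 12.0, 30.0)` returns `Verdict.VALIDATED`, while `triage(5, 0.0, 30.0)` returns `Verdict.MODIFIED`.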
Pages: 1180-1188
Number of pages: 9