An improved method for the construction of decoy peptide MS/MS spectra suitable for the accurate estimation of false discovery rates

Cited by: 21
Authors
Ahrne, Erik [1]
Ohta, Yuki
Nikitin, Frederic
Scherl, Alexander [2]
Lisacek, Frederique
Mueller, Markus
Affiliations
[1] Swiss Inst Bioinformat, Proteome Informat Grp, CMU, CH-1211 Geneva, Switzerland
[2] Ctr Med Univ Geneva, Biomed Prote Res Grp, Geneva, Switzerland
Keywords
Bioinformatics; Decoy database; Electron transfer dissociation; False discovery rate; MS/MS; Spectrum library; LARGE-SCALE PROTEOMICS; PROTEIN IDENTIFICATION; LIBRARY SEARCH; MASS
DOI
10.1002/pmic.201000665
Chinese Library Classification
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
The relevance of libraries of annotated MS/MS spectra is growing with the amount of proteomic data generated in high-throughput experiments. These reference libraries provide a fast and accurate way to identify newly acquired MS/MS spectra. In the context of multiple hypothesis testing, the number of false-positive identifications expected in the final result list is controlled by calculating the false discovery rate (FDR). In a classical sequence search, where experimental MS/MS spectra are compared with theoretical peptide spectra calculated from a sequence database, the FDR is estimated by searching randomized or decoy sequence databases. Despite ongoing discussion on exactly how the FDR should be calculated, this method is widely accepted in the proteomics community. Recently, similar approaches to controlling the FDR of spectrum library searches have been discussed. In this paper, we present a detailed analysis of the similarity between spectra of distinct peptides, which lays the basis for our own solution for decoy library creation (DeLiberator). It differs from previously published approaches in some key points, mainly in implementing new methods that prevent decoy spectra from being too similar to the original library spectra while keeping important features of real MS/MS spectra. Using different proteomic data sets and library creation methods, we evaluate our approach and compare it with alternative methods.
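As an illustration of the decoy-based FDR estimate mentioned in the abstract, the following is a minimal sketch, not the method of the paper. It assumes that each spectrum match carries a score and a flag saying whether it hit a decoy entry, that the decoy library is about the same size as the target library, and that the FDR at a score threshold is approximated by the ratio of decoy hits to target hits. The function name `estimate_fdr` and the input layout are hypothetical, and other estimators are in use, as the abstract notes.

```python
from typing import List, Tuple


def estimate_fdr(matches: List[Tuple[float, bool]], threshold: float) -> float:
    """Approximate the FDR among target matches scoring at or above threshold.

    matches: (score, is_decoy) pairs; this layout is an assumption made
    for the sketch, not a format defined by the paper.
    """
    # Count accepted target and decoy matches at this threshold.
    targets = sum(1 for score, is_decoy in matches
                  if score >= threshold and not is_decoy)
    decoys = sum(1 for score, is_decoy in matches
                 if score >= threshold and is_decoy)
    if targets == 0:
        # No accepted target matches, so there is nothing to be wrong about.
        return 0.0
    # Assumes decoy hits occur about as often as false target hits.
    return min(1.0, decoys / targets)


# Hypothetical scores, for illustration only.
example_matches = [(0.9, False), (0.8, False), (0.7, True),
                   (0.6, False), (0.5, True)]
fdr_at_06 = estimate_fdr(example_matches, 0.6)
```

With the hypothetical scores above, one decoy and three target matches pass the threshold of 0.6, so the estimate is one third. The estimate is only as good as the decoys: if decoy spectra are too similar to real library spectra, or lack the features of real MS/MS spectra, the decoy count no longer reflects the number of false target matches.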
Pages: 4085-4095
Page count: 11