Analysis of the Resolution Limitations of Peptide Identification Algorithms

Cited by: 46
Authors
Colaert, Niklaas [1, 2]
Degroeve, Sven [1, 2]
Helsens, Kenny [1, 2]
Martens, Lennart [1, 2]
Affiliations
[1] Univ Ghent VIB, Dept Med Prot Res, Ghent, Belgium
[2] Univ Ghent, Dept Biochem, B-9000 Ghent, Belgium
Keywords
proteomics; bioinformatics; mass spectrometry; tandem mass-spectra; open-source library; protein identification; spectrometry data; decoy databases; standard; parse; rates
DOI
10.1021/pr200913a
Chinese Library Classification number
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
Proteome identification using peptide-centric proteomics techniques is routine. One of the most powerful and popular methods for identifying peptides from MS/MS spectra is protein database matching using search engines. Significance thresholding through false discovery rate (FDR) estimation by target/decoy searches is used to retain predominantly confident assignments of MS/MS spectra to peptides. However, shortcomings have become apparent when such decoy searches are used to estimate the FDR. To study these shortcomings, we introduce a novel kind of decoy database that contains isobaric mutated versions of the peptides identified in the original search. Because the entrapment sequences are generated in a supervised way, we call this a directed decoy database. Since the peptides in our directed decoy database are specifically designed to closely resemble the forward identifications, the limitations of existing search algorithms in making correct calls in such strongly confusing situations can be analyzed. Interestingly, for the vast majority of confidently identified peptides, a directed decoy peptide-to-spectrum match can be found whose match score is better than or equal to the forward match score, highlighting an important issue in the interpretation of peptide identifications in present-day high-throughput proteomics.
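The idea of an isobaric mutated peptide can be illustrated with a minimal Python sketch. The abstract does not state which mutation rules the authors used, so the two rules below (swapping adjacent residues and exchanging isoleucine with leucine) are assumptions chosen only because they preserve the precursor mass. The function names `peptide_mass` and `isobaric_variants` are hypothetical and do not come from the paper or from any search engine.

```python
# Monoisotopic residue masses in Da for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # one water is added for the free N- and C-termini


def peptide_mass(seq):
    """Return the monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in seq) + WATER


def isobaric_variants(seq):
    """Return sorted sequences that differ from seq but share its mass.

    Assumed rules (not taken from the paper):
      1. swap two adjacent, non-identical residues;
      2. exchange I for L or L for I at a single position.
    """
    variants = set()
    # Rule 1: a permutation keeps the composition, hence the mass.
    for i in range(len(seq) - 1):
        if seq[i] != seq[i + 1]:
            variants.add(seq[:i] + seq[i + 1] + seq[i] + seq[i + 2:])
    # Rule 2: isoleucine and leucine have identical masses.
    exchange = {"I": "L", "L": "I"}
    for i, aa in enumerate(seq):
        if aa in exchange:
            variants.add(seq[:i] + exchange[aa] + seq[i + 1:])
    variants.discard(seq)
    return sorted(variants)


if __name__ == "__main__":
    target = "PEPTIDE"
    for variant in isobaric_variants(target):
        print(variant, round(peptide_mass(variant), 5))
```

Every variant has the same precursor mass as the forward peptide, so a search engine has to rely on fragment ion evidence alone to tell them apart, which is the confusing situation the abstract describes.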
Pages: 5555-5561
Page count: 7