Prediction of error associated with false-positive rate determination for peptide identification in large-scale proteomics experiments using a combined reverse and forward peptide sequence database strategy

Cited by: 60
Authors
Huttlin, Edward L.
Hegeman, Adrian D.
Harms, Amy C.
Sussman, Michael R.
Affiliations
[1] Univ Wisconsin, Ctr Biotechnol, Madison, WI 53706 USA
[2] Univ Wisconsin, Dept Biochem, Madison, WI 53706 USA
Keywords
peptide identification; false-positive rate; false discovery rate; proteomics; data analysis; mass spectrometry; reversed database; decoy database
DOI
10.1021/pr0603194
Chinese Library Classification
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
In recent years, a variety of approaches have been developed using decoy databases to empirically assess the error associated with peptide identifications from large-scale proteomics experiments. We have developed an approach for calculating the expected uncertainty associated with false-positive rate determination using concatenated reverse and forward protein sequence databases. After explaining the theoretical basis of our model, we compare predicted error with the results of experiments characterizing a series of mixtures containing known proteins. In general, results from characterization of known proteins show good agreement with our predictions. Finally, we consider how these approaches may be applied to more complicated data sets, as when peptides are separated by charge state prior to false-positive determination.
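The abstract does not give the model itself, so the following is only a minimal sketch of the general idea, not the authors' published method. It assumes the common concatenated-database convention that a false match is equally likely to hit a forward or a reverse sequence, so the number of reverse hits is roughly binomial with p = 0.5 and the false-positive count is estimated as twice the reverse count. The function name `estimate_fpr` and its parameters are hypothetical.

```python
import math


def estimate_fpr(n_forward: int, n_reverse: int) -> tuple[float, float]:
    """Estimate a false-positive rate and its approximate standard error.

    Assumption (not taken from the paper): each false match lands on a
    forward or a reverse sequence with equal probability, so the reverse
    count R is approximately Binomial(F, 0.5) for F true false positives.
    """
    if n_forward < 0 or n_reverse < 0:
        raise ValueError("counts must be non-negative")
    total = n_forward + n_reverse
    if total == 0:
        raise ValueError("at least one identification is required")

    # Estimated number of false positives: twice the reverse hits.
    false_positives = 2 * n_reverse
    fpr = false_positives / total

    # Var(2R) = 4 * F * 0.25 = F; plugging in F ~ 2R gives sd ~ sqrt(2R).
    std_error = math.sqrt(false_positives) / total
    return fpr, std_error


if __name__ == "__main__":
    fpr, err = estimate_fpr(n_forward=950, n_reverse=50)
    print(f"FPR estimate: {fpr:.3f} +/- {err:.3f}")
```

Under these assumptions the relative uncertainty shrinks as the number of reverse hits grows, which is why small subsets of the data (for example, peptides split by charge state) would be expected to carry larger error.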
Pages: 392-398
Page count: 7