Intensity-based protein identification by machine learning from a library of tandem mass spectra

Cited by: 243
Authors
Elias, JE
Gibbons, FD
King, OD
Roth, FP
Gygi, SP
Affiliations
[1] Harvard Univ, Sch Med, Dept Biol Chem & Mol Pharmacol, Boston, MA 02115 USA
[2] Harvard Univ, Sch Med, Dept Cell Biol, Boston, MA 02115 USA
[3] Harvard Univ, Sch Med, Taplin Biol Mass Spectrometry Facil, Boston, MA 02115 USA
DOI
10.1038/nbt930
Chinese Library Classification (CLC) number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology]
Subject classification codes
071005; 0836; 090102; 100705
Abstract
Tandem mass spectrometry (MS/MS) has emerged as a cornerstone of proteomics owing in part to robust spectral interpretation algorithms (1-6). Widely used algorithms do not fully exploit the intensity patterns present in mass spectra. Here, we demonstrate that intensity pattern modeling improves peptide and protein identification from MS/MS spectra. We modeled fragment ion intensities using a machine-learning approach that estimates the likelihood of observed intensities given peptide and fragment attributes. From 1,000,000 spectra, we chose 27,000 with high-quality, nonredundant matches as training data. Using the same 27,000 spectra, intensity was similarly modeled with mismatched peptides. We used these two probabilistic models to compute the relative likelihood of an observed spectrum given that a candidate peptide is matched or mismatched. We used a 'decoy' proteome approach to estimate incorrect match frequency (7), and demonstrated that an intensity-based method reduces peptide identification error by 50-96% without any loss in sensitivity.
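As an illustration only, the two ideas named in the abstract (a relative likelihood under "matched" versus "mismatched" models, and a decoy-based estimate of incorrect matches) can be sketched in Python. This is not the authors' implementation: the function names, the per-fragment probability inputs, and the assumption that fragment intensities can be treated as independent are all hypothetical simplifications made for the example.

```python
import math


def log_likelihood_ratio(p_matched, p_mismatched):
    """Score a candidate peptide against a spectrum.

    p_matched[i] and p_mismatched[i] are hypothetical model outputs: the
    probability of the observed intensity of fragment i under the
    "matched" and "mismatched" models. Fragments are assumed independent
    (a simplification for this sketch), so the log ratios are summed.
    """
    return sum(
        math.log(pm) - math.log(pu)
        for pm, pu in zip(p_matched, p_mismatched)
    )


def estimated_error_rate(target_scores, decoy_scores, threshold):
    """Estimate the fraction of incorrect target matches at a threshold.

    Assumes decoy hits scoring at or above the threshold occur about as
    often as incorrect target hits, so their count stands in for the
    number of incorrect target matches.
    """
    n_target = sum(s >= threshold for s in target_scores)
    n_decoy = sum(s >= threshold for s in decoy_scores)
    return n_decoy / n_target if n_target else 0.0
```

A positive score means the observed intensities are more probable under the matched model than under the mismatched one; raising the threshold trades accepted matches against the estimated error rate.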
Pages: 214-219
Page count: 6
Related papers
24 entries in total
  • [1] Mass spectrometry in proteomics
    Aebersold, R
    Goodlett, DR
    [J]. CHEMICAL REVIEWS, 2001, 101 (02) : 269 - 295
  • [2] Mass spectrometry-based proteomics
    Aebersold, R
    Mann, M
    [J]. NATURE, 2003, 422 (6928) : 198 - 207
  • [3] Cleavage N-terminal to proline: Analysis of a database of peptide tandem mass spectra
    Breci, LA
    Tabb, DL
    Yates, JR
    Wysocki, VH
    [J]. ANALYTICAL CHEMISTRY, 2003, 75 (09) : 1963 - 1971
  • [4] TM Finder: A prediction program for transmembrane protein segments using a combination of hydrophobicity and nonpolar phase helicity scales
    Deber, CM
    Wang, C
    Liu, LP
    Prior, AS
    Agrawal, S
    Muskat, BL
    Cuticchia, AJ
    [J]. PROTEIN SCIENCE, 2001, 10 (01) : 212 - 219
  • [5] An approach to correlate tandem mass-spectral data of peptides with amino-acid-sequences in a protein database
    Eng, JK
    McCormack, AL
    Yates, JR
    [J]. JOURNAL OF THE AMERICAN SOCIETY FOR MASS SPECTROMETRY, 1994, 5 (11) : 976 - 989
  • [6] A proteomic view of the Plasmodium falciparum life cycle
    Florens, L
    Washburn, MP
    Raine, JD
    Anthony, RM
    Grainger, M
    Haynes, JD
    Moch, JK
    Muster, N
    Sacci, JB
    Tabb, DL
    Witney, AA
    Wolters, D
    Wu, YM
    Gardner, MJ
    Holder, AA
    Sinden, RE
    Yates, JR
    Carucci, DJ
    [J]. NATURE, 2002, 419 (6906) : 520 - 526
  • [7] Gay S, 2002, PROTEOMICS, V2, P1374, DOI 10.1002/1615-9861(200210)2:10<1374::AID-PROT1374>3.0.CO;2-D
  • [8] Harrison AG, 1997, MASS SPECTROM REV, V16, P201, DOI 10.1002/(SICI)1098-2787(1997)16:4<201::AID-MAS3>3.0.CO;2-L