Modeling peptide fragmentation with dynamic Bayesian networks for peptide identification

Cited by: 42
Authors
Klammer, Aaron A. [1]
Reynolds, Sheila M. [2]
Bilmes, Jeff A. [2, 3]
MacCoss, Michael J. [1]
Noble, William Stafford [1, 3]
Affiliations
[1] Univ Washington, Dept Genome Sci, Seattle, WA 98195 USA
[2] Univ Washington, Dept Elect Engn, Seattle, WA 98195 USA
[3] Univ Washington, Dept Comp Sci & Engn, Seattle, WA 98195 USA
DOI
10.1093/bioinformatics/btn189
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: Tandem mass spectrometry (MS/MS) is an indispensable technology for identification of proteins from complex mixtures. Proteins are digested to peptides that are then identified by their fragmentation patterns in the mass spectrometer. Thus, at its core, MS/MS protein identification relies on the relative predictability of peptide fragmentation. Unfortunately, peptide fragmentation is complex and not fully understood, and what is understood is not always exploited by peptide identification algorithms.

Results: We use a hybrid dynamic Bayesian network (DBN)/support vector machine (SVM) approach to address these two problems. We train a set of DBNs on high-confidence peptide-spectrum matches. These DBNs, known collectively as Riptide, comprise a probabilistic model of peptide fragmentation chemistry. Examination of the distributions learned by Riptide allows identification of new trends, such as prevalent a-ion fragmentation at peptide cleavage sites C-term to hydrophobic residues. In addition, Riptide can be used to produce likelihood scores that indicate whether a given peptide-spectrum match is correct. A vector of such scores is evaluated by an SVM, which produces a final score to be used in peptide identification. Using Riptide in this way yields improved discrimination when compared to other state-of-the-art MS/MS identification algorithms, increasing the number of positive identifications by as much as 12% at a 1% false discovery rate.
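
The abstract reports results as the number of identifications accepted at a 1% false discovery rate. The sketch below shows one common way such a count can be computed from scored peptide-spectrum matches. It assumes a target-decoy search in which FDR is estimated as decoys divided by targets above a score threshold; the abstract does not state how the authors estimated FDR, and the function name `count_at_fdr` and the `(score, is_decoy)` input format are hypothetical. Tied scores are not treated specially.

```python
from typing import List, Tuple


def count_at_fdr(psms: List[Tuple[float, bool]], fdr: float = 0.01) -> int:
    """Count target PSMs accepted at an estimated FDR of at most `fdr`.

    psms: (score, is_decoy) pairs, where a higher score means a more
          confident match (e.g. a final SVM score).
    Assumption: FDR above a threshold is estimated as decoys / targets.
    """
    # Walk down the list from the most to the least confident match.
    ranked = sorted(psms, key=lambda p: p[0], reverse=True)
    targets = 0
    decoys = 0
    best = 0
    for _score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # Remember the deepest cutoff whose estimated FDR is still acceptable.
        if targets > 0 and decoys / targets <= fdr:
            best = targets
    return best


def relative_gain(new_count: int, baseline_count: int) -> float:
    """Fractional increase in identifications over a baseline method."""
    return (new_count - baseline_count) / baseline_count
```

Comparing two scoring methods then amounts to calling `count_at_fdr` on each method's scores at the same threshold and passing the two counts to `relative_gain`.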
Pages: I348-I356
Page count: 9