Adaptive Discriminant Function Analysis and Reranking of MS/MS Database Search Results for Improved Peptide Identification in Shotgun Proteomics
Cited by: 35
Authors:
Ding, Ying [1,2]
Choi, Hyungwon [1,2]
Nesvizhskii, Alexey I. [1,3]
Affiliations:
[1] Univ Michigan, Dept Pathol, Ann Arbor, MI 48109 USA
[2] Univ Michigan, Dept Biostat, Ann Arbor, MI 48109 USA
[3] Univ Michigan, Ctr Computat Biol & Med, Ann Arbor, MI 48109 USA
Keywords:
tandem mass spectrometry;
database searching;
peptide identification;
statistical modeling;
adaptive discriminant analysis;
mass accuracy;
decoy sequences;
DOI:
10.1021/pr800484x
Chinese Library Classification (CLC):
Q5 [Biochemistry];
Discipline classification codes:
071010;
081704;
Abstract:
Robust statistical validation of peptide identifications obtained by tandem mass spectrometry and sequence database searching is an important task in shotgun proteomics. PeptideProphet is a commonly used computational tool that computes confidence measures for peptide identifications. In this paper, we investigate several limitations of the PeptideProphet modeling approach, including the use of fixed coefficients in computing the discriminant search score and selection of the top scoring peptide assignment per spectrum only. To address these limitations, we describe an adaptive method in which a new discriminant function is learned from the data in an iterative fashion. We extend the modeling framework to go beyond the top scoring peptide assignment per spectrum. We also investigate the effect of clustering the spectra according to their spectrum quality score followed by cluster-specific mixture modeling. The analysis is carried out using data acquired from a mixture of purified proteins on four different types of mass spectrometers, as well as using a complex human serum data set. A special emphasis is placed on the analysis of data generated on high mass accuracy instruments.
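The abstract says only that a new discriminant function is learned from the data iteratively, in place of fixed coefficients. The sketch below is one illustrative way such a loop could look and is not the authors' procedure: it assumes decoy matches serve as negative examples, target matches passing a decoy-estimated FDR cutoff serve as provisional positives, and Fisher's linear discriminant is refit on each pass. The function names, the 5% cutoff, the two synthetic score features, and the data themselves are all hypothetical.

```python
import numpy as np

def accepted_targets(score, is_decoy, fdr=0.05):
    """Mask of target matches passing a decoy-estimated FDR cutoff (assumed rule)."""
    order = np.argsort(-score)                      # best score first
    n_decoy = np.cumsum(is_decoy[order])
    n_target = np.cumsum(~is_decoy[order])
    est_fdr = n_decoy / np.maximum(n_target, 1)
    passing = np.nonzero(est_fdr <= fdr)[0]
    if passing.size == 0:
        return np.zeros_like(is_decoy)
    cutoff = score[order[passing[-1]]]              # largest set meeting the FDR
    return (~is_decoy) & (score >= cutoff)

def fisher_lda(X_pos, X_neg):
    """Fisher linear discriminant: w proportional to Sw^-1 (mu_pos - mu_neg)."""
    diff = X_pos.mean(axis=0) - X_neg.mean(axis=0)
    Sw = np.atleast_2d(np.cov(X_pos, rowvar=False) + np.cov(X_neg, rowvar=False))
    Sw = Sw + 1e-6 * np.eye(X_pos.shape[1])         # small ridge for stability
    w = np.linalg.solve(Sw, diff)
    return w / np.linalg.norm(w)

def adaptive_discriminant(X, is_decoy, w0, n_iter=10, fdr=0.05):
    """Refit the discriminant weights until they stop changing (hypothetical loop)."""
    w = np.asarray(w0, dtype=float)
    w = w / np.linalg.norm(w)
    for _ in range(n_iter):
        pos = accepted_targets(X @ w, is_decoy, fdr)
        if pos.sum() < 2 or is_decoy.sum() < 2:
            break                                   # too few examples to refit
        w_new = fisher_lda(X[pos], X[is_decoy])
        converged = np.allclose(w_new, w, atol=1e-6)
        w = w_new
        if converged:
            break
    return w

# Synthetic data (an assumption): two score features, both informative.
rng = np.random.default_rng(0)
correct = rng.normal([2.0, 2.0], 1.0, size=(500, 2))    # correct target matches
incorrect = rng.normal(0.0, 1.0, size=(500, 2))         # incorrect target matches
decoy = rng.normal(0.0, 1.0, size=(500, 2))             # decoy matches
X = np.vstack([correct, incorrect, decoy])
is_decoy = np.r_[np.zeros(1000, bool), np.ones(500, bool)]

w_fixed = np.array([1.0, 0.0])                          # fixed weights: feature 1 only
w_adapt = adaptive_discriminant(X, is_decoy, w_fixed)
n_fixed = int(accepted_targets(X @ w_fixed, is_decoy).sum())
n_adapt = int(accepted_targets(X @ w_adapt, is_decoy).sum())
print(n_fixed, n_adapt)
```

In this toy setting the fixed weights ignore the second feature, so refitting the weights on the data should admit more target matches at the same estimated FDR. Real search scores are neither Gaussian nor independent, so the comparison is only a qualitative illustration.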