MS2Grouper: Group assessment and synthetic replacement of duplicate proteomic tandem mass spectra

Citations: 44
Authors
Tabb, DL
Thompson, MR
Khalsa-Moyers, G
VerBerkmoes, NC
McDonald, WH
Affiliations
[1] Oak Ridge Natl Lab, Div Chem Sci, Oak Ridge, TN 37831 USA
[2] Univ Tennessee, Oak Ridge Natl Lab, Grad Sch, Oak Ridge, TN 37830 USA
Funding
U.S. Department of Energy
DOI
10.1016/j.jasms.2005.04.010
Chinese Library Classification
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
Shotgun proteomics experiments require the collection of thousands of tandem mass spectra, and these data sets will continue to grow as new instruments that scan at even higher rates become available. Such data contain substantial redundancy, because spectra from a particular peptide are often acquired many times during a single LC-MS/MS experiment. In this article, we present MS2Grouper, an algorithm that detects spectral duplication, assesses groups of related spectra, and replaces these groups with synthetic representative spectra. Errors in detecting spectral similarity are corrected using a paraclique criterion: spectra are assessed as a group only if they are part of a clique of at least three completely interrelated spectra, or if they are subsequently added to such a clique by being similar to all but one of its members. A greedy algorithm constructs a representative spectrum for each group by iteratively removing the tallest peaks from the spectral collection and matching them to peaks in the other spectra. This strategy is shown to reduce spectral counts by up to 20% in LC-MS/MS datasets from protein standard mixtures and proteomes, reducing database search times without a concomitant reduction in identified peptides.
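The two steps named in the abstract, paraclique grouping and greedy construction of a representative spectrum, can be illustrated with a minimal Python sketch. This is not the published MS2Grouper implementation. The function names (`largest_clique`, `paracliques`, `representative_spectrum`), the m/z tolerance `tol`, the intensity-weighted averaging of matched peaks, and the choice to seed each group from the largest remaining clique are all assumptions made for illustration. The abstract specifies only the clique size of three, the "all but one" admission rule, and the tallest-peak-first order.

```python
def largest_clique(adj, nodes):
    """Return one largest clique among `nodes` (basic Bron-Kerbosch, no pivoting)."""
    best = set()

    def expand(r, p, x):
        nonlocal best
        if not p and not x:          # r is a maximal clique
            if len(r) > len(best):
                best = set(r)
            return
        for v in sorted(p):          # sorted() gives a deterministic visiting order
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(nodes), set())
    return best


def paracliques(adj, min_clique=3):
    """Group spectra: seed with a clique of >= min_clique mutually similar spectra,
    then admit any spectrum similar to all but at most one member of that clique.
    `adj` maps each spectrum id to the set of ids it was judged similar to."""
    remaining = set(adj)
    groups = []
    while True:
        core = largest_clique(adj, remaining)
        if len(core) < min_clique:
            break
        group = set(core)
        for v in sorted(remaining - core):
            if len(core - adj[v]) <= 1:   # v misses at most one clique member
                group.add(v)
        groups.append(group)
        remaining -= group
    return groups


def representative_spectrum(spectra, tol=0.5):
    """Greedy merge: take the tallest unused peak, match the closest unused peak
    within `tol` m/z units in every other spectrum, and emit one merged peak.
    Each spectrum is a list of (mz, intensity) pairs with positive intensities."""
    pool = sorted(
        ((inten, mz, i) for i, s in enumerate(spectra) for mz, inten in s),
        reverse=True,
    )
    used = set()
    merged = []
    for inten, mz, i in pool:
        if (i, mz) in used:
            continue
        used.add((i, mz))
        members = [(mz, inten)]
        for j, s in enumerate(spectra):
            if j == i:
                continue
            candidates = [
                (abs(m - mz), m, h)
                for m, h in s
                if (j, m) not in used and abs(m - mz) <= tol
            ]
            if candidates:
                _, m, h = min(candidates)
                used.add((j, m))
                members.append((m, h))
        total = sum(h for _, h in members)
        # Assumed merge rule: intensity-weighted m/z, intensity averaged over the group.
        merged.append((sum(m * h for m, h in members) / total, total / len(spectra)))
    return sorted(merged)
```

For example, with a similarity graph in which spectra 0, 1, and 2 are mutually similar and spectrum 3 is similar to 0 and 1 only, `paracliques` returns a single group containing all four, while a spectrum similar to only one clique member is left ungrouped.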
Pages: 1250-1261
Page count: 12