Implementation and application of a versatile clustering tool for tandem mass spectrometry data

Cited by: 23
Authors
Flikka, Kristian
Meukens, Jeroen
Helsens, Kenny
Vandekerckhove, Joel
Eidhammer, Ingvar
Gevaert, Kris
Martens, Lennart
Affiliations
[1] Univ Bergen, Bergen Ctr Computat Sci, Computat Biol Unit, N-5008 Bergen, Norway
[2] Univ Bergen, Proteom Unit, Bergen, Norway
[3] Univ Bergen, Dept Informat, N-5008 Bergen, Norway
[4] VIB, Dept Med Prot Res, Ghent, Belgium
[5] Univ Ghent, Dept Biochem, Ghent, Belgium
Keywords
Bioinformatics; mass spectrometry; spectrum clustering;
DOI
10.1002/pmic.200700160
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification codes
071010 ; 081704 ;
Abstract
High-throughput proteomics experiments typically generate large amounts of peptide fragmentation mass spectra during a single experiment. There is often a substantial amount of redundant fragmentation of the same precursors among these spectra, which is usually considered a nuisance. Here we discuss the potential of clustering and merging redundant spectra to turn this redundancy into a useful property of the dataset. To this end, we have created the first general-purpose, freely available open-source software application for clustering and merging MS/MS spectra. The application also introduces a novel approach to calculating the similarity of fragmentation mass spectra that takes into account the increased precision of modern mass spectrometers, and we suggest a simple but effective improvement to single-linkage clustering. The application and the novel algorithms are applied to several real-life proteomic datasets and the results are discussed. An analysis of the influence of the different algorithms available and their parameters is given, as well as a number of important applications of the overall approach.
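The abstract names two building blocks, a spectrum similarity measure and single-linkage clustering, without giving their details. As a rough illustration only, the sketch below clusters spectra with plain single-linkage clustering over a binned cosine similarity. It is not the paper's novel similarity measure or its improved single-linkage variant; the function names, the 0.5 m/z bin width, and the 0.8 threshold are all assumptions chosen for the example.

```python
from math import sqrt


def binned(peaks, bin_width=0.5):
    """Sum peak intensities into fixed-width m/z bins (bin width is an assumption)."""
    vec = {}
    for mz, intensity in peaks:
        key = int(mz / bin_width)
        vec[key] = vec.get(key, 0.0) + intensity
    return vec


def cosine(a, b):
    """Cosine similarity between two sparse binned spectra."""
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def single_linkage(spectra, threshold=0.8, bin_width=0.5):
    """Group spectra so that any pair with similarity >= threshold shares a cluster.

    Cutting single-linkage clustering at a fixed threshold is equivalent to
    finding connected components, done here with a small union-find.
    """
    vectors = [binned(s, bin_width) for s in spectra]
    parent = list(range(len(vectors)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i in range(len(vectors)):
        for j in range(i + 1, len(vectors)):
            if cosine(vectors[i], vectors[j]) >= threshold:
                parent[find(j)] = find(i)

    clusters = {}
    for i in range(len(vectors)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


# Hypothetical spectra as (m/z, intensity) pairs: the first two are near-duplicates.
spectra = [
    [(100.1, 10.0), (200.2, 20.0), (300.3, 5.0)],
    [(100.2, 11.0), (200.3, 19.0), (300.4, 6.0)],
    [(150.0, 10.0), (250.0, 20.0)],
]
print(single_linkage(spectra))  # [[0, 1], [2]]
```

A practical tool would also restrict comparisons to spectra with similar precursor masses and would merge each cluster into a consensus spectrum; both steps are left out here to keep the sketch short.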
Pages: 3245-3258
Number of pages: 14