Improving large-scale proteomics by clustering of mass spectrometry data

Cited by: 151
Authors
Beer, I [1]
Barnea, E
Ziv, T
Admon, A
Affiliations
[1] IBM Corp, Haifa Res Lab, IL-31905 Haifa, Israel
[2] Technion Israel Inst Technol, Dept Biol, Smoler Prote Ctr, IL-32000 Haifa, Israel
Keywords
clustering; peptides; tandem mass spectrometry
DOI
10.1002/pmic.200300652
Chinese Library Classification number
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
Tandem mass spectrometry (MS/MS), coupled with liquid chromatography (LC), is a powerful tool for the analysis and comparison of complex protein and peptide mixtures. However, the extremely large amounts of data that result from the process are very complex and difficult to analyze. We show how the clustering of similar spectra from multiple LC-MS/MS runs can help in data management and improve the analysis of complex peptide mixtures. The major effect of spectrum clustering is the reduction of the huge amounts of data to a manageable size. As a result, analysis time is shorter and more data can be stored for further analysis. Furthermore, spectrum quality improvement allows the identification of more peptides with greater confidence, the comparison of complex peptide mixtures is facilitated, and the entire proteomics project is presented in concise form. Pep-Miner is an advanced software tool that implements these clustering-based applications. It proved useful in several comparative proteomics projects involving lung cancer cells and various other cell types. In one of these projects, Pep-Miner reduced 517 000 spectra to 20 900 clusters and identified 2518 peptides derived from 830 proteins. Clustering and identification lasted less than two hours on an IBM Thinkpad T23 computer (laptop). Pep-Miner's unique properties make it a very useful tool for large-scale shotgun proteomics projects.
Pages: 950-960
Page count: 11