Dynamic spectrum quality assessment and iterative computational analysis of shotgun proteomic data - Toward more efficient identification of post-translational modifications, sequence polymorphisms, and novel peptides

Cited by: 151
Authors
Nesvizhskii, AI
Roos, FF
Grossmann, J
Vogelzang, M
Eddes, JS
Gruissem, W
Baginsky, S
Aebersold, R
Affiliations
[1] Inst Syst Biol, Seattle, WA 98103 USA
[2] Swiss Fed Inst Technol, CH-8092 Zurich, Switzerland
DOI
10.1074/mcp.M500319-MCP200
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
In mass spectrometry-based proteomics, frequently hundreds of thousands of MS/MS spectra are collected in a single experiment. Of these, a relatively small fraction is confidently assigned to peptide sequences, whereas the majority of the spectra are not further analyzed. Spectra are not assigned to peptides for diverse reasons. These include deficiencies of the scoring schemes implemented in the database search tools, sequence variations (e.g. single nucleotide polymorphisms) or omissions in the database searched, post-translational or chemical modifications of the peptide analyzed, or the observation of sequences that are not anticipated from the genomic sequence (e.g. splice forms, somatic rearrangement, and processed proteins). To increase the amount of information that can be extracted from proteomic MS/MS datasets we developed a robust method that detects high quality spectra within the fraction of spectra unassigned by conventional sequence database searching and computes a quality score for each spectrum. We also demonstrate that iterative search strategies applied to such detected unassigned high quality spectra significantly increase the number of spectra that can be assigned from datasets and that biologically interesting new insights can be gained from existing data.
Pages: 652-670
Number of pages: 19