The use of proteotypic peptide libraries for protein identification

Cited by: 124
Authors
Craig, R
Cortens, JP
Beavis, RC
Affiliations
[1] Beavis Informat Ltd, Winnipeg, MB R3B 1G7, Canada
[2] Univ Manitoba, Manitoba Ctr Proteom, Winnipeg, MB R3T 2N2, Canada
DOI
10.1002/rcm.1992
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Discipline classification codes
071010 ; 081704 ;
Abstract
This paper describes an algorithm to apply proteotypic peptide sequence libraries to protein identifications performed using tandem mass spectrometry (MS/MS). Proteotypic peptides are those peptides in a protein sequence that are most likely to be confidently observed by current MS-based proteomics methods. Libraries of proteotypic peptide sequences were compiled from the Global Proteome Machine Database for Homo sapiens and Saccharomyces cerevisiae model species proteomes. These libraries were used to scan through collections of tandem mass spectra to discover which proteins were represented by the data sets, followed by detailed analysis of the spectra with the full protein sequences corresponding to the discovered proteotypic peptides. This algorithm (Proteotypic Peptide Profiling, or P3) resulted in sequence-to-spectrum matches comparable to those obtained by conventional protein identification algorithms using only full protein sequences, with a 20-fold reduction in the time required to perform the identification calculations. The proteotypic peptide libraries, the open source code for the implementation of the search algorithm and a website for using the software have been made freely available. Approximately 4% of the residues in the H. sapiens proteome were required in the proteotypic peptide library to successfully identify proteins. Copyright (c) 2005 John Wiley & Sons, Ltd.
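The two-stage search described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`p3_search`, `match_by_mass`, `tryptic_peptides`), the toy sequences, and the 0.01 Da tolerance are all hypothetical, and matching on precursor mass alone is a simplifying stand-in for the fragment-ion scoring that a real MS/MS search engine performs. The sketch shows only the control flow: a small proteotypic library is scanned first to discover proteins, and only the discovered proteins are then digested and searched in full.

```python
from typing import Dict, Iterable, List, Tuple

# Monoisotopic amino acid residue masses in Da (standard values).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # mass of H2O added to the residue sum


def peptide_mass(seq: str) -> float:
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in seq) + WATER


def tryptic_peptides(protein: str, min_len: int = 5) -> List[str]:
    """Cleave after K or R unless followed by P; no missed cleavages."""
    peptides, start = [], 0
    for i, aa in enumerate(protein):
        at_end = i == len(protein) - 1
        if at_end or (aa in "KR" and protein[i + 1] != "P"):
            if i + 1 - start >= min_len:
                peptides.append(protein[start:i + 1])
            start = i + 1
    return peptides


def match_by_mass(spectra: Dict[str, float], peptides: Iterable[str],
                  tol: float) -> List[Tuple[str, str]]:
    """Pair spectrum ids with peptides whose mass lies within tol Da.

    Assumption: precursor mass only; real engines score fragment ions.
    """
    peptides = list(peptides)
    return [(sid, pep) for sid, mass in spectra.items()
            for pep in peptides if abs(peptide_mass(pep) - mass) <= tol]


def p3_search(spectra: Dict[str, float], library: Dict[str, str],
              proteome: Dict[str, str],
              tol: float = 0.01) -> List[Tuple[str, str, str]]:
    """Hypothetical two-stage search: library scan, then full sequences."""
    # Stage 1: scan only the proteotypic library to discover proteins.
    discovered = {library[pep] for _, pep in match_by_mass(spectra, library, tol)}
    # Stage 2: detailed search restricted to the discovered proteins.
    results = []
    for pid in sorted(discovered):
        for sid, pep in match_by_mass(spectra, tryptic_peptides(proteome[pid]), tol):
            results.append((sid, pid, pep))
    return sorted(results)


# Toy data (invented sequences, for illustration only).
proteome = {"A": "MAGTEELKSLDFWYNRGHVVPEAK", "B": "TTPLLEQWKCCDHHMNR"}
library = {"MAGTEELK": "A", "TTPLLEQWK": "B"}  # proteotypic peptide -> protein
spectra = {"s1": peptide_mass("MAGTEELK"), "s2": peptide_mass("SLDFWYNR")}
print(p3_search(spectra, library, proteome))
# [('s1', 'A', 'MAGTEELK'), ('s2', 'A', 'SLDFWYNR')]
```

In this toy run, spectrum `s2` corresponds to a peptide that is absent from the library, yet it is still assigned in stage 2 because the library hit for `s1` brought protein A into the detailed search. Protein B is never digested, which illustrates where a reduction in computation could come from.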
Pages: 1844-1850
Page count: 7