MacroSEQUEST: Efficient Candidate-Centric Searching and High-Resolution Correlation Analysis for Large-Scale Proteomics Data Sets

Cited by: 19
Authors
Faherty, Brendan K. [1]
Gerber, Scott A. [1,2]
Affiliations
[1] Dartmouth Med Sch, Dept Genet, Lebanon, NH 03756 USA
[2] Norris Cotton Canc Ctr, Lebanon, NH 03756 USA
Funding
U.S. National Institutes of Health;
Keywords
PEPTIDE IDENTIFICATION; PROTEIN IDENTIFICATION; SEQUENCE DATABASES; MASS ACCURACY; TANDEM; ALGORITHM; SPEED;
DOI
10.1021/ac100783x
Chinese Library Classification number
O65 [Analytical Chemistry];
Discipline classification codes
070302; 081704;
Abstract
Modern mass spectrometers are now capable of producing tens of thousands of tandem mass (MS/MS) spectra per hour of operation, resulting in an ever-increasing burden on the computational tools required to translate these raw MS/MS spectra into peptide sequences. In the present work, we describe our efforts to improve the performance of one of the earliest and most commonly used algorithms, SEQUEST, through a wholesale redesign of its processing architecture. We call this new program MacroSEQUEST, which exhibits a dramatic improvement in processing speed by transiently indexing the array of MS/MS spectra prior to searching FASTA databases. We demonstrate the performance of MacroSEQUEST relative to a suite of other programs commonly encountered in proteomics research. We also extend the capability of SEQUEST by implementing a parameter in MacroSEQUEST that allows for scalable sparse arrays of experimental and theoretical spectra to be implemented for high-resolution correlation analysis and demonstrate the advantages of high-resolution MS/MS searching to the sensitivity of large-scale proteomics data sets.
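The candidate-centric idea in the abstract (index the spectra once, then stream database peptides past that index) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the 0.5 Da tolerance, and the reduced residue-mass table are assumptions chosen for illustration, and the scoring step (the SEQUEST cross-correlation) is omitted entirely.

```python
from bisect import bisect_left, bisect_right

# Monoisotopic masses in Da; only a handful of residues, for illustration.
WATER = 18.01056
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "L": 113.08406, "K": 128.09496, "R": 156.10111,
}


def peptide_mass(seq):
    """Neutral monoisotopic mass of a peptide sequence."""
    return sum(RESIDUE_MASS[aa] for aa in seq) + WATER


def build_spectrum_index(precursor_masses):
    """Sort (mass, spectrum_id) pairs once, before the database is read."""
    return sorted((m, i) for i, m in enumerate(precursor_masses))


def candidate_centric_search(index, peptides, tol=0.5):
    """For each candidate peptide, find every spectrum whose precursor
    mass lies within +/- tol Da (hypothetical default), via binary search."""
    masses = [m for m, _ in index]
    matches = {}
    for pep in peptides:
        m = peptide_mass(pep)
        lo = bisect_left(masses, m - tol)
        hi = bisect_right(masses, m + tol)
        for _, spectrum_id in index[lo:hi]:
            # A real search engine would score the peptide against
            # the spectrum here; this sketch only records the pairing.
            matches.setdefault(spectrum_id, []).append(pep)
    return matches
```

In this arrangement the database is traversed once regardless of how many spectra are loaded, and each candidate peptide costs two binary searches over the sorted precursor masses rather than a scan of all spectra.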
Pages: 6821-6829
Number of pages: 9