MacroSEQUEST: Efficient Candidate-Centric Searching and High-Resolution Correlation Analysis for Large-Scale Proteomics Data Sets

Cited by: 19

Authors:

Faherty, Brendan K. ^{[1]}

Gerber, Scott A. ^{[1,2]}

Affiliations:

[1] Dartmouth Med Sch, Dept Genet, Lebanon, NH 03756 USA

[2] Norris Cotton Canc Ctr, Lebanon, NH 03756 USA

Source:

ANALYTICAL CHEMISTRY | 2010 / Vol. 82 / Issue 16

Funding:

National Institutes of Health (USA);

Keywords:

PEPTIDE IDENTIFICATION; PROTEIN IDENTIFICATION; SEQUENCE DATABASES; MASS ACCURACY; TANDEM; ALGORITHM; SPEED;

DOI:

10.1021/ac100783x

Chinese Library Classification number:

O65 [Analytical Chemistry];

Discipline classification codes:

070302 ; 081704 ;

Abstract:

Modern mass spectrometers are now capable of producing tens of thousands of tandem mass (MS/MS) spectra per hour of operation, resulting in an ever-increasing burden on the computational tools required to translate these raw MS/MS spectra into peptide sequences. In the present work, we describe our efforts to improve the performance of one of the earliest and most commonly used algorithms, SEQUEST, through a wholesale redesign of its processing architecture. We call this new program MacroSEQUEST, which exhibits a dramatic improvement in processing speed by transiently indexing the array of MS/MS spectra prior to searching FASTA databases. We demonstrate the performance of MacroSEQUEST relative to a suite of other programs commonly encountered in proteomics research. We also extend the capability of SEQUEST by implementing a parameter in MacroSEQUEST that allows for scalable sparse arrays of experimental and theoretical spectra to be implemented for high-resolution correlation analysis and demonstrate the advantages of high-resolution MS/MS searching to the sensitivity of large-scale proteomics data sets.
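The two ideas named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that "transiently indexing" the spectra amounts to sorting them by precursor mass once, so that each candidate peptide read from the database can be matched to every spectrum within a mass tolerance by binary search. It also assumes the "scalable sparse arrays" can be modeled as dictionaries keyed by bin number, with a bin-width parameter that sets the resolution. All function names, the toy masses, and the plain dot product (standing in for the SEQUEST cross-correlation score) are hypothetical.

```python
from bisect import bisect_left, bisect_right

# Monoisotopic residue masses (Da) for a few amino acids; enough for a toy example.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "L": 113.08406, "K": 128.09496, "R": 156.10111,
}
WATER = 18.01056


def peptide_mass(seq):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in seq) + WATER


def build_spectrum_index(precursors):
    """Sort (neutral_mass, spectrum_id) pairs once, before the database pass."""
    return sorted((mass, sid) for sid, mass in precursors.items())


def spectra_for_candidate(index, masses, cand_mass, tol):
    """Ids of spectra whose precursor mass lies within +/- tol of the candidate."""
    lo = bisect_left(masses, cand_mass - tol)
    hi = bisect_right(masses, cand_mass + tol)
    return [sid for _, sid in index[lo:hi]]


def sparse_bins(peaks, bin_width):
    """Bin (m/z, intensity) peaks into a dict keyed by bin number (a sparse array)."""
    binned = {}
    for mz, intensity in peaks:
        b = int(mz / bin_width)
        binned[b] = binned.get(b, 0.0) + intensity
    return binned


def sparse_dot(a, b):
    """Dot product of two sparse arrays; only occupied bins are visited."""
    if len(a) > len(b):
        a, b = b, a
    return sum(v * b[k] for k, v in a.items() if k in b)


# Candidate-centric pass: one lookup per candidate peptide, not one per spectrum.
precursors = {"scan1": 458.25, "scan2": 600.00, "scan3": 458.26}  # hypothetical values
index = build_spectrum_index(precursors)
masses = [m for m, _ in index]
matches = spectra_for_candidate(index, masses, peptide_mass("GASPK"), tol=0.05)

# The same pair of peaks coincides in 1 Da bins but separates in 0.01 Da bins.
experimental = [(100.004, 2.0)]
theoretical = [(100.306, 3.0)]
low_res = sparse_dot(sparse_bins(experimental, 1.0), sparse_bins(theoretical, 1.0))
high_res = sparse_dot(sparse_bins(experimental, 0.01), sparse_bins(theoretical, 0.01))
```

Under these assumptions, each database entry is digested and its mass computed once, and the sorted index returns every spectrum it could explain. The dictionary representation keeps memory proportional to the number of peaks rather than to the number of bins, which matters when the bin width shrinks.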

Citation

Pages: 6821-6829

Page count: 9

