A fast SEQUEST cross correlation algorithm

Times cited: 176
Authors
Eng, Jimmy K. [1]
Fischer, Bernd [2]
Grossmann, Jonas [3]
MacCoss, Michael J. [1]
Affiliations
[1] Univ Washington, Dept Genome Sci, Seattle, WA 98195 USA
[2] ETH, Inst Computat Sci, Zurich, Switzerland
[3] ETH, Inst Plant Sci, CH-8092 Zurich, Switzerland
Keywords
cross correlation; SEQUEST; tandem mass spectrometry; E-value
DOI
10.1021/pr800420s
Chinese Library Classification
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
The SEQUEST program was the first and remains one of the most widely used tools for assigning a peptide sequence within a database to a tandem mass spectrum. The cross correlation score is the primary score function implemented within SEQUEST and it is this score that makes the tool particularly sensitive. Unfortunately, this score is computationally expensive to calculate, and thus, to make the score manageable, SEQUEST uses a less sensitive but fast preliminary score and restricts the cross correlation to just the top 500 peptides returned by the preliminary score. Classically, the cross correlation score has been calculated using Fast Fourier Transforms (FFT) to generate the full correlation function. We describe an alternate method of calculating the cross correlation score that does not require FFTs and can be computed efficiently in a fraction of the time. The fast calculation allows all candidate peptides to be scored by the cross correlation function, potentially mitigating the need for the preliminary score, and enables an E-value significance calculation based on the cross correlation score distribution calculated on all candidate peptide sequences obtained from a sequence database.
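The abstract does not spell out the FFT-free calculation. The following is a minimal sketch of one way it can work, assuming the usual SEQUEST definition of the score: the correlation at zero offset minus the mean correlation over offsets from -75 to +75 (excluding 0). Under that assumption the background subtraction can be folded into the observed spectrum once, after which each candidate peptide is scored with a single dot product. The function names, the zero-padding at the spectrum edges, and the `max_offset` parameter are illustrative choices, not taken from this record.

```python
def preprocess_spectrum(y, max_offset=75):
    """Subtract from each bin the mean of its neighbours within +/- max_offset.

    Assumption: bins outside the spectrum count as zero, and the mean is
    taken over 2 * max_offset offsets (offset 0 is excluded).
    """
    n = len(y)
    # Prefix sums let every windowed sum be read off in constant time.
    prefix = [0.0]
    for value in y:
        prefix.append(prefix[-1] + value)
    y_prime = []
    for i in range(n):
        lo = max(0, i - max_offset)
        hi = min(n, i + max_offset + 1)
        neighbours = prefix[hi] - prefix[lo] - y[i]
        y_prime.append(y[i] - neighbours / (2 * max_offset))
    return y_prime


def fast_xcorr(x, y_prime):
    """Score one theoretical spectrum x against the preprocessed spectrum."""
    return sum(a * b for a, b in zip(x, y_prime))


def naive_xcorr(x, y, max_offset=75):
    """Reference version: evaluate the correlation at every offset explicitly."""
    n = len(y)

    def corr(tau):
        return sum(x[i] * y[i + tau] for i in range(n) if 0 <= i + tau < n)

    background = sum(
        corr(tau) for tau in range(-max_offset, max_offset + 1) if tau != 0
    ) / (2 * max_offset)
    return corr(0) - background
```

In this sketch `preprocess_spectrum` runs once per observed spectrum, while `fast_xcorr` runs once per candidate peptide, which is what makes scoring every candidate affordable. `naive_xcorr` is included only to check that the two formulations agree.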
Pages: 4598-4602
Page count: 5