General framework for developing and evaluating database scoring algorithms using the TANDEM search engine

Cited by: 171
Authors
MacLean, Brendan
Eng, Jimmy K.
Beavis, Ronald C.
McIntosh, Martin
Affiliations
[1] Fred Hutchinson Canc Res Ctr, Seattle, WA 98104 USA
[2] LabKey Software, LLC, Seattle, WA USA
[3] Beavis Informat Ltd, Winnipeg, MB, Canada
[4] Univ British Columbia, Vancouver, BC V5Z 1M9, Canada
DOI
10.1093/bioinformatics/btl379
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Motivation: Tandem mass spectrometry (MS/MS) identifies protein sequences using database search engines, at the core of which is a score that measures the similarity between peptide MS/MS spectra and a protein sequence database. The TANDEM application was developed as a freely available database search engine for the proteomics research community. To extend TANDEM as a platform for further research on improved database scoring methods, we modified the software so that users can redefine the scoring function, replacing the native TANDEM scoring function while leaving the rest of the core application intact. Redefinition is performed at run time, so multiple scoring functions can be selected and applied from a single search engine binary. We introduce the implementation of the pluggable scoring algorithm and provide implementations of two TANDEM-compatible scoring functions: a previously described scoring function compatible with PeptideProphet, and a very simple scoring function that quantitative researchers may use as a starting point for their own development. This extension builds on the open-source TANDEM project and will facilitate research into, and dissemination of, novel algorithms for matching MS/MS spectra to peptide sequences. The pluggable scoring schema is also compatible with the related search applications P3 and Hunter, which are part of the X! suite of database matching algorithms. The pluggable scores and the X! suite of applications are all written in C++.
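As an illustration of what run-time selection of a scoring function can look like in C++, the following is a minimal sketch and not the TANDEM implementation. Every name in it (`Peak`, `Scorer`, `MatchCountScorer`, `IntensitySumScorer`, `ScorerRegistry`, `defaultRegistry`, and the keys `"match-count"` and `"intensity-sum"`) is hypothetical, and both scores are deliberately simplistic stand-ins rather than the scoring functions described in the paper. It assumes a spectrum is a list of (m/z, intensity) peaks and a candidate peptide is represented by its theoretical fragment m/z values.

```cpp
#include <cassert>
#include <cmath>
#include <functional>
#include <map>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical types for illustration only; these are not TANDEM classes.
struct Peak {
    double mz;
    double intensity;
};

// Abstract scoring interface: the search loop depends only on this.
class Scorer {
public:
    virtual ~Scorer() = default;
    virtual double score(const std::vector<Peak>& spectrum,
                         const std::vector<double>& fragments,
                         double tolerance) const = 0;
};

// Counts theoretical fragments matched by at least one observed peak.
class MatchCountScorer : public Scorer {
public:
    double score(const std::vector<Peak>& spectrum,
                 const std::vector<double>& fragments,
                 double tolerance) const override {
        double matched = 0.0;
        for (double f : fragments) {
            for (const Peak& p : spectrum) {
                if (std::fabs(p.mz - f) <= tolerance) { matched += 1.0; break; }
            }
        }
        return matched;
    }
};

// Sums, per theoretical fragment, the most intense peak within tolerance.
class IntensitySumScorer : public Scorer {
public:
    double score(const std::vector<Peak>& spectrum,
                 const std::vector<double>& fragments,
                 double tolerance) const override {
        double total = 0.0;
        for (double f : fragments) {
            double best = 0.0;
            for (const Peak& p : spectrum) {
                if (std::fabs(p.mz - f) <= tolerance && p.intensity > best) {
                    best = p.intensity;
                }
            }
            total += best;
        }
        return total;
    }
};

// Maps a name (e.g. read from a parameter file) to a factory for a scorer,
// so one binary can offer several scoring functions.
class ScorerRegistry {
public:
    using Factory = std::function<std::unique_ptr<Scorer>()>;

    void add(const std::string& name, Factory factory) {
        assert(static_cast<bool>(factory));  // reject empty factories
        factories_[name] = std::move(factory);
    }

    std::unique_ptr<Scorer> create(const std::string& name) const {
        auto it = factories_.find(name);
        if (it == factories_.end()) {
            throw std::invalid_argument("unknown scoring function: " + name);
        }
        return it->second();
    }

private:
    std::map<std::string, Factory> factories_;
};

// Registers the two example scorers under hypothetical names.
ScorerRegistry defaultRegistry() {
    ScorerRegistry registry;
    registry.add("match-count", [] { return std::make_unique<MatchCountScorer>(); });
    registry.add("intensity-sum", [] { return std::make_unique<IntensitySumScorer>(); });
    return registry;
}
```

In this sketch, a search loop would call `defaultRegistry().create(name)` once, with `name` taken from user input, and then call `score` on each spectrum and candidate pair. Adding a new scoring function means writing one subclass and one `add` call, without touching the rest of the program, which is the design property the abstract attributes to the pluggable scoring schema.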
Pages: 2830-2832
Page count: 3