Efficient analysis and extraction of MS/MS result data from Mascot™ result files (art. no. 290)

Cited by: 12
Authors
Grosse-Coosmann, F [1]
Boehm, AM [1]
Sickmann, A [1]
Affiliations
[1] Univ Wurzburg, Rudolf Virchow Ctr Expt Biomed, Prot Mass Spectrometry & Funct Proteom Grp, D-97078 Wurzburg, Germany
DOI
10.1186/1471-2105-6-290
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Subject classification codes
071010; 081704;
Abstract
Background: Mascot™ is a commonly used protein identification program for MS as well as for tandem MS data. When analyzing huge shotgun proteomics datasets with Mascot™'s native tools, the limits of computing resources are easily reached. Until now, no open-source application has been available that can convert the full content of Mascot™ result files from the original MIME format into a database-compatible tabular format, allowing direct import into database management systems and efficient handling of huge datasets analyzed by Mascot™.
Results: A program called mres2x is presented, which reads Mascot™ result files, analyzes them, and extracts either selected or all information in order to store it in a single file or in multiple files, in formats that are easier to handle downstream of Mascot™. It generates several output formats. The output of mres2x in tab format is specifically designed for direct high-performance import into relational database management systems using the native tools of these systems. Having the data available in a database management system allows complex queries and extensive analysis. In addition, the original peak lists can be extracted in DTA format, suitable for protein identification using the Sequest™ program, and the Mascot™ files can be split while preserving the original data format. During conversion, several consistency checks are performed. mres2x is designed to provide high-throughput processing combined with the possibility of being driven by other computer programs. The source code, including supplementary material and precompiled binaries, is available via http://www.protein-ms.de and http://sourceforge.net/projects/protms/.
Conclusion: The database upload allows the MS/MS results to be regrouped using a database management system and analyzed with complex SQL queries, without the need to run new Mascot™ searches when grouping parameters change.
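The conversion idea in the abstract (MIME-formatted result file to tab-separated rows, then import into a relational database and analysis with SQL) can be illustrated with a minimal Python sketch. It is not the mres2x implementation and does not reproduce its output schema. The sample text, the section names, the `key=value` line layout, the three-column table, and the function names `mime_to_rows`, `rows_to_tsv` and `load_into_sqlite` are all assumptions made for illustration. SQLite stands in for whichever database management system is actually used.

```python
import csv
import io
import sqlite3
from email import message_from_string

# Made-up sample imitating a MIME multipart result file.
# This is NOT real Mascot output; names and values are hypothetical.
SAMPLE = """MIME-Version: 1.0
Content-Type: multipart/mixed; boundary=XYZ

--XYZ
Content-Type: application/x-Mascot; name="parameters"

CHARGE=2+
TOL=0.5
--XYZ
Content-Type: application/x-Mascot; name="peptides"

q1_p1=0,1234.56,0.01,PEPTIDEK,45.2
q2_p1=0,987.65,-0.02,SAMPLER,12.7
--XYZ--
"""


def mime_to_rows(text):
    """Flatten every MIME part into (section, field, value) tuples."""
    msg = message_from_string(text)
    rows = []
    for part in msg.walk():
        if part.is_multipart():
            continue  # skip the container, keep only leaf sections
        section = part.get_param("name")  # assumed: section name is in the header
        for line in part.get_payload().splitlines():
            field, sep, value = line.partition("=")
            if sep:  # ignore lines that are not key=value pairs
                rows.append((section, field, value))
    return rows


def rows_to_tsv(rows):
    """Serialize rows as tab-separated text, one record per line."""
    buf = io.StringIO()
    csv.writer(buf, delimiter="\t", lineterminator="\n").writerows(rows)
    return buf.getvalue()


def load_into_sqlite(tsv_text):
    """Import the tab-separated text into an in-memory SQLite table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE result (section TEXT, field TEXT, value TEXT)")
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    conn.executemany(
        "INSERT INTO result VALUES (?, ?, ?)", (tuple(r) for r in reader)
    )
    return conn


if __name__ == "__main__":
    tsv = rows_to_tsv(mime_to_rows(SAMPLE))
    print(tsv)
    conn = load_into_sqlite(tsv)
    # Regrouping happens in SQL, without re-reading the original file.
    query = "SELECT section, COUNT(*) FROM result GROUP BY section ORDER BY section"
    for row in conn.execute(query):
        print(row)
```

Once the rows are in a table, changing how results are grouped only means changing the SQL query, which is the benefit the conclusion describes. For large datasets, a database's native bulk loader would typically replace the row-by-row insert used in this sketch.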
Pages: 6