A common open representation of mass spectrometry data and its application to proteomics research

Cited by: 575
Authors
Pedrioli, PGA
Eng, JK
Hubley, R
Vogelzang, M
Deutsch, EW
Raught, B
Pratt, B
Nilsson, E
Angeletti, RH
Apweiler, R
Cheung, K
Costello, CE
Hermjakob, H
Huang, S
Julian, RK
Kapp, E
McComb, ME
Oliver, SG
Omenn, G
Paton, NW
Simpson, R
Smith, R
Taylor, CF
Zhu, WM
Aebersold, R
Affiliations
[1] Inst Syst Biol, Seattle, WA 98103 USA
[2] Insilicos LLC, Seattle, WA 98103 USA
[3] Albert Einstein Coll Med, Bronx, NY 10461 USA
[4] EMBL Outstn European Bioinformat Inst, Cambridge, England
[5] Yale Univ, Sch Med, Dept Anesthesiol, Ctr Med Informat, New Haven, CT 06520 USA
[6] Boston Univ, Sch Med, Boston, MA 02118 USA
[7] Lilly Res Labs, Indianapolis, IN 46285 USA
[8] Royal Melbourne Hosp, Ludwig Inst Canc Res, Joint Proteom Lab, Parkville, Vic 3050, Australia
[9] Royal Melbourne Hosp, Walter & Eliza Hall Inst Med Res, Parkville, Vic 3050, Australia
[10] Univ Manchester, Sch Biol Sci, Manchester M13 9PT, Lancs, England
[11] Univ Michigan, Sch Med, Ann Arbor, MI 48109 USA
[12] Univ Manchester, Dept Comp Sci, Manchester M13 9PL, Lancs, England
[13] Pacific NW Natl Lab, Div Biol Sci, Richland, WA 99352 USA
[14] Pacific NW Natl Lab, Environm Mol Sci Lab, Richland, WA 99352 USA
Funding
Biotechnology and Biological Sciences Research Council (UK)
DOI
10.1038/nbt1031
Chinese Library Classification number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology]
Subject classification codes
071005; 0836; 090102; 100705
Abstract
A broad range of mass spectrometers are used in mass spectrometry (MS)-based proteomics research. Each type of instrument possesses a unique design, data system and performance specifications, resulting in strengths and weaknesses for different types of experiments. Unfortunately, the native binary data formats produced by each type of mass spectrometer also differ and are usually proprietary. The diverse, nontransparent nature of the data structure complicates the integration of new instruments into preexisting infrastructure, impedes the analysis, exchange, comparison and publication of results from different experiments and laboratories, and prevents the bioinformatics community from accessing data sets required for software development. Here, we introduce the 'mzXML' format, an open, generic XML (extensible markup language) representation of MS data. We have also developed an accompanying suite of supporting programs. We expect that this format will facilitate data management, interpretation and dissemination in proteomics research.
Pages: 1459-1466
Page count: 8
Related papers
17 in total
[11]  
2-2
[12]   The need for a public proteomics repository [J].
Prince, JT ;
Carlson, MW ;
Wang, R ;
Lu, P ;
Marcotte, EM .
NATURE BIOTECHNOLOGY, 2004, 22 (04) :471-472
[13]   Shotgun collision-induced dissociation of peptides using a time of flight mass analyzer [J].
Purvine, S ;
Eppel, JT ;
Yi, EC ;
Goodlett, DR .
PROTEOMICS, 2003, 3 (06) :847-850
[14]   Cytoscape: A software environment for integrated models of biomolecular interaction networks [J].
Shannon, P ;
Markiel, A ;
Ozier, O ;
Baliga, NS ;
Wang, JT ;
Ramage, D ;
Amin, N ;
Schwikowski, B ;
Ideker, T .
GENOME RESEARCH, 2003, 13 (11) :2498-2504
[15]  
Spellman PT, 2002, GENOME BIOL, V3
[16]  
Zhang N, 2002, PROTEOMICS, V2, P1406, DOI 10.1002/1615-9861(200210)2:10<1406::AID-PROT1406>3.0.CO
[17]  
2-9