Experimental Peptide Identification Repository (EPIR) - An integrated peptide-centric platform for validation and mining of tandem mass spectrometry data

Cited by: 29
Authors
Kristensen, DB [1]
Brond, JC [1]
Nielsen, PA [1]
Andersen, JR [1]
Sorensen, OT [1]
Jorgensen, V [1]
Budin, K [1]
Matthiesen, J [1]
Veno, P [1]
Jespersen, HM [1]
Ahrens, CH [1]
Schandorff, S [1]
Ruhoff, PT [1]
Wisniewski, JR [1]
Bennett, KL [1]
Podtelejnikov, AV [1]
Affiliation
[1] MDS Inc, DK-5230 Odense M, Denmark
DOI
10.1074/mcp.T400004-MCP200
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Subject classification code
071010; 081704
Abstract
LC-MS/MS has become an established technology in proteomic studies, and as the technology has matured the bottleneck has shifted from data generation to data validation and mining. To address this bottleneck we developed the Experimental Peptide Identification Repository (EPIR), an integrated software platform for storage, validation, and mining of LC-MS/MS-derived peptide evidence. EPIR is a cumulative data repository in which precursor ions are linked to the peptide assignments and protein associations returned by a search engine (e.g. Mascot, Sequest, or PepSea). Any number of datasets can be parsed into EPIR and subsequently validated and mined using a set of software modules that overlay the database. These include a peptide validation module, a protein grouping module, a generic module for extracting quantitative data, a comparative module, and additional modules for extracting statistical information. In the present study, the utility of EPIR and its associated software tools is demonstrated on LC-MS/MS data derived from a set of model proteins and from complex protein mixtures derived from MCF-7 breast cancer cells. Emphasis is placed on the key strengths of EPIR, including the ability to validate and mine multiple combined datasets and the presentation of protein-level evidence in concise, nonredundant protein groups that are based on shared peptide evidence.
Pages: 1023-1038
Page count: 16
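
The abstract mentions presenting protein-level evidence as nonredundant protein groups based on shared peptide evidence, but it does not say how EPIR builds those groups. The idea can be illustrated with a minimal sketch that assumes one simple rule: proteins supported by exactly the same set of peptides are placed in one group. The function name `group_proteins`, the input layout, and the sample peptide and protein identifiers are all hypothetical and are not taken from EPIR.

```python
from collections import defaultdict


def group_proteins(peptide_to_proteins):
    """Group proteins that are supported by an identical set of peptides.

    Assumption for illustration only: this is one simple grouping rule,
    not the algorithm used by EPIR's protein grouping module.
    """
    # Invert the mapping: collect the peptide evidence for each protein.
    protein_to_peptides = defaultdict(set)
    for peptide, proteins in peptide_to_proteins.items():
        for protein in proteins:
            protein_to_peptides[protein].add(peptide)

    # Proteins with the same peptide set cannot be told apart by this
    # evidence, so they share one group.
    groups = defaultdict(list)
    for protein, peptides in protein_to_peptides.items():
        groups[frozenset(peptides)].append(protein)

    # Return (group members, shared peptides) pairs in a stable order.
    return sorted(
        (sorted(members), sorted(peptides))
        for peptides, members in groups.items()
    )


# Hypothetical peptide assignments: peptide sequence -> matching proteins.
evidence = {
    "PEPTIDEA": ["P1", "P2"],
    "PEPTIDEB": ["P1", "P2"],
    "PEPTIDEC": ["P3"],
}

for members, peptides in group_proteins(evidence):
    print(members, peptides)
```

Here P1 and P2 fall into one group because they share all of their peptide evidence, while P3 stands alone. A fuller treatment would also have to decide how to handle proteins whose peptide set is a strict subset of another protein's, which this sketch leaves as separate groups.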