pep2pro: a new tool for comprehensive proteome data analysis to reveal information about organ-specific proteomes in Arabidopsis thaliana

Cited by: 60
Authors
Baerenfaller, Katja [1]
Hirsch-Hoffmann, Matthias [1]
Svozil, Julia [1]
Hull, Roger [2]
Russenberger, Doris [1]
Bischof, Sylvain [1]
Lu, Qingtao [3]
Gruissem, Wilhelm [1]
Baginsky, Sacha [1]
Affiliations
[1] ETH, Dept Biol, CH-8092 Zurich, Switzerland
[2] Univ Manchester, Fac Life Sci, Manchester M13 9PL, Lancs, England
[3] Chinese Acad Sci, Inst Bot, Beijing 100093, Peoples R China
Keywords
STATISTICAL-MODEL; TANDEM; GENES; PROTEINS; IDENTIFICATIONS; PEROXISOMES; ANNOTATION; PEPTIDES; GENOMICS; MS/MS;
DOI
10.1039/c0ib00078g
Chinese Library Classification
Q2 [Cell biology];
Subject classification codes
071009; 090102;
Abstract
pep2pro is a comprehensive proteome analysis database specifically suitable for flexible proteome data analysis. The pep2pro database schema offers solutions to the various challenges of developing a proteome data analysis database and because data integrated in pep2pro are in relational format, it enables flexible and detailed data analysis. The information provided here will facilitate building proteome data analysis databases for other organisms or applications. The capacity of the pep2pro database for the integration and analysis of large proteome datasets was demonstrated by creating the pep2pro dataset, which is an organ-specific characterisation of the Arabidopsis thaliana proteome containing 14 522 identified proteins based on 2.6 million peptide spectrum assignments. This dataset provides evidence of protein expression and reveals organ-specific processes. The high coverage and density of the dataset are essential for protein quantification by normalised spectral counting and allowed us to extract information that is usually not accessible in low-coverage datasets. With this quantitative protein information we analysed organ- and organelle-specific sub-proteomes. In addition we matched spectra to regions in the genome that were not predicted to have protein coding capacity and provide PCR validation for selected revised gene models. Furthermore, we analysed the peptide features that distinguish detected from non-detected peptides and found substantial disagreement between predicted and detected proteotypic peptides, suggesting that large-scale proteomics data are essential for efficient selection of proteotypic peptides in targeted proteomics surveys. The pep2pro dataset is available as a resource for plant systems biology at www.pep2pro.ethz.ch.
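The abstract refers to protein quantification by normalised spectral counting without giving the formula. The following is a minimal sketch of one common, length-normalised variant (NSAF-style); it is an assumption made for illustration, not necessarily the normalisation used by the authors. The function name, the protein identifiers, and the example counts and lengths are all hypothetical.

```python
def normalised_spectral_counts(spectral_counts, protein_lengths):
    """Return an NSAF-style relative abundance for each protein.

    Assumption: each protein's spectral count is divided by its length
    (spectra per residue) and then scaled so all values sum to 1.
    """
    # Spectral abundance factor: spectra assigned per amino acid residue.
    saf = {
        protein: count / protein_lengths[protein]
        for protein, count in spectral_counts.items()
    }
    total = sum(saf.values())
    if total == 0:
        raise ValueError("no spectra assigned to any protein")
    # Scale so the values are comparable across samples.
    return {protein: value / total for protein, value in saf.items()}


# Hypothetical input: spectra assigned per protein and protein lengths.
counts = {"protA": 30, "protB": 10}
lengths = {"protA": 300, "protB": 100}
result = normalised_spectral_counts(counts, lengths)
```

Dividing by length compensates for longer proteins yielding more detectable peptides, and scaling by the total makes values from different organs or runs comparable.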
Pages: 225-237
Number of pages: 13