Protein abundance profiling of the Escherichia coli cytosol

Cited by: 374
Authors
Ishihama, Yasushi [2, 3]
Schmidt, Thorsten [1]
Rappsilber, Juri [2, 7]
Mann, Matthias [2, 4]
Hartl, F. Ulrich [5]
Kerner, Michael J. [6]
Frishman, Dmitrij [1, 8]
Affiliations
[1] Tech Univ Munich, Dept Genome Oriented Bioinformat, Wissensch Zentrum Weihenstephan, D-85350 Martinsried, Germany
[2] Univ So Denmark, Ctr Expt Bioinformat, DK-5230 Odense, Denmark
[3] Keio Univ, Inst Adv Biosci, Yamagata 9970017, Japan
[4] Max Planck Inst Biochem, Dept Proteom & Signal Transduct, D-82512 Martinsried, Germany
[5] Max Planck Inst Biochem, Dept Cellular Biochem, D-82152 Martinsried, Germany
[6] Tech Univ Denmark, Ctr Biol Sequence Anal, BioCentrum, DK-1726 Lyngby, Denmark
[7] Univ Edinburgh, Wellcome Trust Ctr Cell Biol, Edinburgh EHP 3JR, Midlothian, Scotland
[8] GSF, Natl Res Ctr Environm Hlth, D-85764 Neuherberg, Germany
DOI
10.1186/1471-2164-9-102
Chinese Library Classification (CLC) number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology];
Discipline classification code
071005; 0836; 090102; 100705;
Abstract
Background: Knowledge about the abundance of molecular components is an important prerequisite for building quantitative predictive models of cellular behavior. Proteins are central components of these models, since they carry out most of the fundamental processes in the cell. Thus far, protein concentrations have been difficult to measure on a large scale, but proteomic technologies have now advanced to a stage where this information becomes readily accessible.

Results: Here, we describe an experimental scheme to maximize the coverage of proteins identified by mass spectrometry of a complex biological sample. Using a combination of LC-MS/MS approaches with protein and peptide fractionation steps we identified 1103 proteins from the cytosolic fraction of the Escherichia coli strain MC4100. A measure of abundance is presented for each of the identified proteins, based on the recently developed emPAI approach which takes into account the number of sequenced peptides per protein. The values of abundance are within a broad range and accurately reflect independently measured copy numbers per cell. As expected, the most abundant proteins were those involved in protein synthesis, most notably ribosomal proteins. Proteins involved in energy metabolism as well as those with binding function were also found in high copy number while proteins annotated with the terms metabolism, transcription, transport, and cellular organization were rare. The barrel-sandwich fold was found to be the structural fold with the highest abundance. Highly abundant proteins are predicted to be less prone to aggregation based on their length, pI values, and occurrence patterns of hydrophobic stretches. We also find that abundant proteins tend to be predominantly essential. Additionally we observe a significant correlation between protein and mRNA abundance in E. coli cells.

Conclusion: Abundance measurements for more than 1000 E. coli proteins presented in this work represent the most complete study of protein abundance in a bacterial cell so far. We show significant associations between the abundance of a protein and its properties and functions in the cell. In this way, we provide both data and novel insights into the role of protein concentration in this model organism.
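The emPAI measure mentioned in the abstract can be illustrated with a minimal sketch. It assumes the definition commonly given for emPAI, namely 10 raised to the ratio of observed to observable peptides, minus 1, with molar percentages obtained by normalizing over all proteins. The abstract does not state the exact peptide-counting rules used in this study, so the function names and the example counts below are hypothetical and serve only as an illustration.

```python
def empai(n_observed: int, n_observable: int) -> float:
    """Return 10 ** (observed / observable) - 1 (assumed emPAI definition)."""
    if n_observable <= 0:
        raise ValueError("n_observable must be positive")
    if n_observed < 0:
        raise ValueError("n_observed must be non-negative")
    return 10 ** (n_observed / n_observable) - 1


def molar_percent(empai_by_protein: dict) -> dict:
    """Normalize emPAI values so that they sum to 100 (mol %)."""
    total = sum(empai_by_protein.values())
    if total <= 0:
        raise ValueError("sum of emPAI values must be positive")
    return {name: 100 * value / total for name, value in empai_by_protein.items()}


# Hypothetical (observed, observable) peptide counts, not data from the study.
example_counts = {"protein_A": (10, 20), "protein_B": (2, 20)}
example_empai = {name: empai(obs, obl) for name, (obs, obl) in example_counts.items()}
example_mol_percent = molar_percent(example_empai)
```

Because the index is exponential in the peptide ratio, a protein with five times as many observed peptides receives far more than five times the estimated abundance, which is what lets the values span a broad range.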
Pages: 17