Analysis of mRNA expression and protein abundance data: an approach for the comparison of the enrichment of features in the cellular population of proteins and transcripts

Cited by: 132
Authors
Greenbaum, D
Jansen, R
Gerstein, M
Affiliations
[1] Yale Univ, Dept Mol Biophys & Biochem, New Haven, CT 06520 USA
[2] Yale Univ, Dept Comp Sci, New Haven, CT 06520 USA
[3] Yale Univ, Dept Genet, New Haven, CT 06520 USA
DOI
10.1093/bioinformatics/18.4.585
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Motivation: Protein abundance is related to mRNA expression through many different cellular processes. Up to now, there have been conflicting results on how correlated the levels of these two quantities are. Given that expression and abundance data are significantly more complex and noisy than the underlying genomic sequence information, it is reasonable to simplify and average them in terms of broad proteomic categories and features (e.g. functions or secondary structures), for understanding their relationship. Furthermore, it will be essential to integrate, within a common framework, the results of many varied experiments by different investigators. This will allow one to survey the characteristics of highly expressed genes and proteins.
Results: To this end, we outline a formalism for merging and scaling many different gene expression and protein abundance data sets into a comprehensive reference set, and we develop an approach for analyzing this in terms of broad categories, such as composition, function, structure and localization. As the various experiments are not always done using the same set of genes, sampling bias becomes a central issue, and our formalism is designed to explicitly show this and correct for it. We apply our formalism to the currently available gene expression and protein abundance data for yeast. Overall, we found substantial agreement between gene expression and protein abundance, in terms of the enrichment of structural and functional categories. This agreement, which was considerably greater than the simple correlation between these quantities for individual genes, reflects the way that, in comparison to the population of genes in the yeast genome, the cellular populations of transcripts and proteins (weighted by their respective abundances, the transcriptome and what we dub the translatome) were both enriched in: (i) the small amino acids Val, Gly, and Ala; (ii) low molecular weight proteins; (iii) helices and sheets relative to coils; (iv) cytoplasmic proteins relative to nuclear ones; and (v) proteins involved in 'protein synthesis,' 'cell structure,' and 'energy production.'
Supplementary information: http://genecensus.org/expression/translatome
Contact: mark.gerstein@yale.edu
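The idea of comparing a category's share in an abundance-weighted population with its share in the genome can be illustrated with a small sketch. This is not the authors' formalism, which is not given in this record; the function names, the relative-difference definition of enrichment, and the toy gene labels and abundance values are all assumptions made for illustration. To sidestep sampling bias in the simplest way, the genome share is computed over the same set of genes that have a measurement.

```python
from typing import Mapping, Set


def weighted_fraction(weights: Mapping[str, float], members: Set[str]) -> float:
    """Share of the total weight carried by genes that belong to `members`."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("total weight must be positive")
    return sum(w for gene, w in weights.items() if gene in members) / total


def enrichment(weights: Mapping[str, float], members: Set[str]) -> float:
    """Relative change of a category's share: weighted population vs. genome.

    The genome share counts every measured gene once (weight 1.0), so both
    shares are taken over the same gene set. This definition is an assumed
    illustration, not the formula from the paper.
    """
    genome_share = weighted_fraction({gene: 1.0 for gene in weights}, members)
    if genome_share == 0:
        raise ValueError("category has no members among the measured genes")
    population_share = weighted_fraction(weights, members)
    return (population_share - genome_share) / genome_share


# Hypothetical abundance values for four made-up genes.
toy_mrna_levels = {"geneA": 8.0, "geneB": 1.0, "geneC": 1.0, "geneD": 2.0}
# Hypothetical category annotation, e.g. "cytoplasmic".
toy_category = {"geneA", "geneD"}

toy_enrichment = enrichment(toy_mrna_levels, toy_category)
```

In this toy case the category covers half of the genes but about 83% of the weighted population, so the relative enrichment is positive. A negative value would indicate depletion. The same function could be applied to protein abundances in place of mRNA levels to compare the two populations category by category.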
Pages: 585-596
Number of pages: 12