A Bayesian framework for the analysis of microarray expression data: regularized t-test and statistical inferences of gene changes

被引:1190
作者
Baldi, P [1 ]
Long, AD
机构
[1] Univ Calif Irvine, Dept Informat & Comp Sci, Irvine, CA 92697 USA
[2] Univ Calif Irvine, Dept Ecol & Evolut Biol, Inst Genom & Bioinformat, Irvine, CA 92697 USA
关键词
D O I
10.1093/bioinformatics/17.6.509
中图分类号
Q5 [生物化学];
学科分类号
071010 ; 081704 ;
摘要
Motivation: DNA microarrays are now capable of providing genome-wide patterns of gene expression across many different conditions. The first level of analysis of these patterns requires determining whether observed differences in expression are significant or not. Current methods are unsatisfactory due to the lack of a systematic framework that can accommodate noise, variability, and low replication often typical of microarray data. Results: We develop a Bayesian probabilistic framework for microarray data analysis. At the simplest level, we model log-expression values by independent normal distributions, parameterized by corresponding means and variances with hierarchical prior distributions. We derive point estimates for both parameters and hyperparameters, and regularized expressions for the variance of each gene by combining the empirical variance with a local background variance associated with neighboring genes. An additional hyperparameter, inversely related to the number of empirical observations, determines the strength of the background variance. Simulations show that these point estimates, combined with a t-test, provide a systematic inference approach that compares favorably with simple t-test or fold methods, and partly compensate for the lack of replication.
引用
收藏
页码:509 / 519
页数:11
相关论文
共 30 条
  • [1] Broad patterns of gene expression revealed by clustering analysis of tumor and normal colon tissues probed by oligonucleotide arrays
    Alon, U
    Barkai, N
    Notterman, DA
    Gish, K
    Ybarra, S
    Mack, D
    Levine, AJ
    [J]. PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 1999, 96 (12) : 6745 - 6750
  • [2] Global gene expression profiling in Escherichia coli K12 -: The effects of integration host factor
    Arfin, SM
    Long, AD
    Ito, ET
    Tolleri, L
    Riehle, MM
    Paegle, ES
    Hatfield, GW
    [J]. JOURNAL OF BIOLOGICAL CHEMISTRY, 2000, 275 (38) : 29672 - 29684
  • [3] Baldi P, 2001, BIOINFORMATICS MACHI
  • [4] BERGER J. O., 2013, Statistical Decision Theory and Bayesian Analysis, DOI [10.1007/978-1-4757-4286-2, DOI 10.1007/978-1-4757-4286-2]
  • [5] Box GE., 2011, BAYESIAN INFERENCE S
  • [6] Exploring the metabolic and genetic control of gene expression on a genomic scale
    DeRisi, JL
    Iyer, VR
    Brown, PO
    [J]. SCIENCE, 1997, 278 (5338) : 680 - 686
  • [7] Cluster analysis and display of genome-wide expression patterns
    Eisen, MB
    Spellman, PT
    Brown, PO
    Botstein, D
    [J]. PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 1998, 95 (25) : 14863 - 14868
  • [8] LIGHT-DIRECTED, SPATIALLY ADDRESSABLE PARALLEL CHEMICAL SYNTHESIS
    FODOR, SPA
    READ, JL
    PIRRUNG, MC
    STRYER, L
    LU, AT
    SOLAS, D
    [J]. SCIENCE, 1991, 251 (4995) : 767 - 773
  • [9] Support vector machine classification and validation of cancer tissue samples using microarray expression data
    Furey, TS
    Cristianini, N
    Duffy, N
    Bednarski, DW
    Schummer, M
    Haussler, D
    [J]. BIOINFORMATICS, 2000, 16 (10) : 906 - 914
  • [10] Molecular classification of cancer: Class discovery and class prediction by gene expression monitoring
    Golub, TR
    Slonim, DK
    Tamayo, P
    Huard, C
    Gaasenbeek, M
    Mesirov, JP
    Coller, H
    Loh, ML
    Downing, JR
    Caligiuri, MA
    Bloomfield, CD
    Lander, ES
    [J]. SCIENCE, 1999, 286 (5439) : 531 - 537