High-Dimensional Sparse Factor Modeling: Applications in Gene Expression Genomics

Cited by: 268
Authors
Carvalho, Carlos M. [1]
Chang, Jeffrey
Lucas, Joseph E. [3]
Nevins, Joseph R. [2, 4]
Wang, Quanli
West, Mike [3]
Affiliations
[1] Univ Chicago, Grad Sch Business, Chicago, IL 60637 USA
[2] Duke Univ, Inst Genome Sci & Policy, Ctr Appl Genom & Technol, Durham, NC 27710 USA
[3] Duke Univ, Dept Stat Sci, Durham, NC 27708 USA
[4] Duke Univ, Med Ctr, Dept Mol Genet & Microbiol, Durham, NC 27706 USA
Funding
U.S. National Science Foundation
Keywords
Biological pathways; Breast cancer genomics; Decomposing gene expression patterns; Dirichlet process factor model; Evolutionary stochastic search; Factor regression; Gene expression analysis; Gene expression profiling; Gene networks; Non-Gaussian multivariate analysis; Sparse factor models; Sparsity priors;
DOI
10.1198/016214508000000869
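Chinese Library Classification (CLC) codes
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics]
Discipline classification codes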
020208 ; 070103 ; 0714 ;
Abstract
We describe studies in molecular profiling and biological pathway analysis that use sparse latent factor and regression models for microarray gene expression data. We discuss breast cancer applications and key aspects of the modeling and computational methodology. Our case studies aim to investigate and characterize heterogeneity of structure related to specific oncogenic pathways, as well as links between aggregate patterns in gene expression profiles and clinical biomarkers. Based on the metaphor of statistically derived "factors" as representing biological "subpathway" structure, we explore the decomposition of fitted sparse factor models into pathway subcomponents and investigate how these components overlay multiple aspects of known biological activity. Our methodology is based on sparsity modeling of multivariate regression, ANOVA, and latent factor models, as well as a class of models that combines all components. Hierarchical sparsity priors address questions of dimension reduction and multiple comparisons, as well as scalability of the methodology. The models include practically relevant non-Gaussian/nonparametric components for latent structure, underlying often quite complex non-Gaussianity in multivariate expression patterns. Model search and fitting are addressed through stochastic simulation and evolutionary stochastic search methods that are exemplified in the oncogenic pathway studies. Supplementary supporting material provides more details of the applications, as well as examples of the use of freely available software tools for implementing the methodology.
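To make the idea of a sparse latent factor model concrete, the following is a minimal illustrative sketch, not the authors' software or their exact model. It simulates expression data in which each gene loads on only a few latent factors, using a simple point-mass ("spike-and-slab" style) mixture for the loadings. The function names, parameter names, and default values are all hypothetical, and the latent factors are drawn from a standard normal for simplicity, whereas the abstract describes non-Gaussian/nonparametric latent structure.

```python
import random


def simulate_sparse_factor_data(n_genes, n_factors, n_samples,
                                inclusion_prob=0.1, slab_sd=1.0,
                                noise_sd=0.5, seed=0):
    """Simulate data from a toy sparse factor model (illustrative assumption).

    Each gene-by-factor loading is exactly zero with probability
    1 - inclusion_prob, and otherwise drawn from a zero-mean Gaussian "slab".
    """
    rng = random.Random(seed)

    # Sparse loadings matrix: n_genes rows, n_factors columns.
    loadings = [
        [rng.gauss(0.0, slab_sd) if rng.random() < inclusion_prob else 0.0
         for _ in range(n_factors)]
        for _ in range(n_genes)
    ]

    # Latent factor scores: one vector of length n_factors per sample.
    # Gaussian here for simplicity; this is a simplifying assumption.
    factors = [
        [rng.gauss(0.0, 1.0) for _ in range(n_factors)]
        for _ in range(n_samples)
    ]

    # Expression for gene g in sample i: loadings[g] . factors[i] + noise.
    data = [
        [sum(loadings[g][k] * factors[i][k] for k in range(n_factors))
         + rng.gauss(0.0, noise_sd)
         for i in range(n_samples)]
        for g in range(n_genes)
    ]
    return loadings, factors, data


def zero_fraction(loadings):
    """Return the fraction of loadings that are exactly zero."""
    total = sum(len(row) for row in loadings)
    zeros = sum(1 for row in loadings for value in row if value == 0.0)
    return zeros / total


if __name__ == "__main__":
    loadings, factors, data = simulate_sparse_factor_data(200, 5, 30)
    print("fraction of zero loadings:", zero_fraction(loadings))
```

In a sketch like this, a small `inclusion_prob` means most genes are unrelated to most factors, which is the sense in which sparsity supports dimension reduction when the number of genes is large.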
Pages: 1438-1456
Number of pages: 19