Cluster analysis of gene expression dynamics

Cited by: 323
Authors
Ramoni, MF
Sebastiani, P
Kohane, IS [1]
Affiliations
[1] Harvard Univ, Sch Med, Childrens Hosp Informat Program, Boston, MA 02115 USA
[2] Univ Massachusetts, Dept Math & Stat, Amherst, MA 01003 USA
DOI
10.1073/pnas.132656399
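Chinese Library Classification (CLC)
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification codes
07; 0710; 09
Abstract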
This article presents a Bayesian method for model-based clustering of gene expression dynamics. The method represents gene-expression dynamics as autoregressive equations and uses an agglomerative procedure to search for the most probable set of clusters given the available data. The main contributions of this approach are the ability to take into account the dynamic nature of gene expression time series during clustering and a principled way to identify the number of distinct clusters. As the number of possible clustering models grows exponentially with the number of observed time series, we have devised a distance-based heuristic search procedure able to render the search process feasible. In this way, the method retains the important visualization capability of traditional distance-based clustering and acquires an independent, principled measure to decide when two series are different enough to belong to different clusters. The reliance of this method on an explicit statistical representation of gene expression dynamics makes it possible to use standard statistical techniques to assess the goodness of fit of the resulting model and validate the underlying assumptions. A set of gene-expression time series, collected to study the response of human fibroblasts to serum, is used to identify the properties of the method.
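The search strategy described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it fits one shared autoregressive model per cluster by least squares and uses a BIC-style score as a rough stand-in for the Bayesian marginal likelihood. The function names (`ar_design`, `cluster_score`, `agglomerate`), the AR order `p`, and the Euclidean distance between mean profiles are all assumptions made for illustration, and the series are assumed to have equal length.

```python
import numpy as np

def ar_design(series, p=1):
    """Return (X, y) for an AR(p) regression with intercept on one series."""
    y = series[p:]
    lags = [series[p - k:len(series) - k] for k in range(1, p + 1)]
    return np.column_stack([np.ones(len(y))] + lags), y

def cluster_score(members, p=1):
    """BIC-style score (higher is better) of one AR(p) model shared by all members.

    Assumption: this approximates, and does not reproduce, a marginal likelihood.
    """
    parts = [ar_design(np.asarray(s, dtype=float), p) for s in members]
    X = np.vstack([x for x, _ in parts])
    y = np.concatenate([t for _, t in parts])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    n = len(y)
    sigma2 = max(resid @ resid / n, 1e-12)   # guard against log(0)
    loglik = -0.5 * n * (np.log(2 * np.pi * sigma2) + 1)
    k = X.shape[1] + 1                        # intercept, p lags, and variance
    return loglik - 0.5 * k * np.log(n)

def agglomerate(series, p=1):
    """Merge clusters greedily, trying the closest pairs first.

    A merge is accepted only if the shared model scores higher than the two
    separate models; the search stops when no candidate merge is accepted.
    """
    series = [np.asarray(s, dtype=float) for s in series]
    clusters = [[i] for i in range(len(series))]
    score = lambda c: cluster_score([series[i] for i in c], p)
    mean = lambda c: np.mean([series[i] for i in c], axis=0)
    while len(clusters) > 1:
        # Distance only orders the candidates; the score decides the merge.
        pairs = sorted(
            (np.linalg.norm(mean(a) - mean(b)), i, j)
            for i, a in enumerate(clusters)
            for j, b in enumerate(clusters) if i < j
        )
        for _, i, j in pairs:
            merged = clusters[i] + clusters[j]
            if score(merged) > score(clusters[i]) + score(clusters[j]):
                clusters = [c for m, c in enumerate(clusters) if m not in (i, j)]
                clusters.append(merged)
                break
        else:
            break  # no merge improved the score: stop
    return clusters
```

Calling `agglomerate(list_of_series)` returns a list of clusters, each a list of indices into the input. The number of clusters is not fixed in advance; it is whatever remains when no merge raises the score, which mirrors the role the abstract assigns to the model-based measure.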
Pages: 9121-9126
Number of pages: 6