Mixture of linear mixed models for clustering gene expression profiles from repeated microarray experiments

Cited by: 67

Authors:

Celeux, G

Martin, O

Lavergne, C

Affiliations:

[1] INRA, Unite Proteom, F-34060 Montpellier, France

[2] Univ Paris Sud, Dept Math, Paris, France

[3] Inst Math & Modelisat Montpellier, Montpellier, France

Source:

STATISTICAL MODELLING | 2005 / Vol. 5 / Issue 03

Keywords:

cluster analysis; gene expression profile; linear model; mixture model; penalized likelihood criteria; random effect;

DOI:

10.1191/1471082X05st096oa

Chinese Library Classification (CLC) number:

O21 [Probability and Mathematical Statistics]; C8 [Statistics];

Discipline classification code:

020208 ; 070103 ; 0714 ;

Abstract:

Data variability can be substantial in microarray data analysis. Thus, when clustering gene expression profiles, it can be judicious to make use of repeated measurements. In this paper, the problem of analysing repeated data in the model-based cluster analysis context is considered. Linear mixed models are chosen to take data variability into account, and mixtures of these models are considered. This leads to a large range of possible models, depending on the assumptions made about both the covariance structure of the observations and the mixture model. Maximum likelihood estimation for this family of models through the EM algorithm is presented. The problem of selecting a particular mixture of linear mixed models is addressed using penalized likelihood criteria. Illustrative Monte Carlo experiments are presented, and an application to the clustering of gene expression profiles is detailed. All these experiments highlight the value of mixtures of linear mixed models for taking data variability into account in a cluster analysis context.


Pages: 243-267

Number of pages: 25

References (29 in total)

[1] NEW LOOK AT STATISTICAL-MODEL IDENTIFICATION [J].