Initializing EM using the properties of its trajectories in Gaussian mixtures

Cited by: 28
Authors
Biernacki, C [1]
Affiliations
[1] Univ Franche Comte, CNRS, UMR 6623, F-25030 Besancon, France
Keywords
maximum likelihood; constraints on parameters; sample moments; random initialization; Monte-Carlo simulations
DOI
10.1023/B:STCO.0000035306.77434.31
Chinese Library Classification number
TP301 [Theory, Methods]
Subject classification code
081202
Abstract
A strategy is proposed to initialize the EM algorithm in the multivariate Gaussian mixture context. It consists of randomly drawing, at a low computational cost in many situations, initial mixture parameters from an appropriate space that includes all possible EM trajectories. This space is defined by two relations, satisfied at every EM iteration, between the first two empirical moments and the mixture parameters. An experimental study on simulated and real data sets shows that this strategy outperforms classical methods, because it explores the local maxima of the likelihood function widely.
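The abstract does not spell out the two moment relations. A natural reading, assumed here rather than taken from the paper, is the standard pair of identities that hold after any M-step with unconstrained component covariances: the weighted component means average to the sample mean, and the weighted within-component plus between-component covariance equals the sample covariance (with divisor n). The sketch below checks this numerically with plain NumPy. The function names `em_step` and `moment_residuals` and the simulated data are illustrative only, and the code does not implement the paper's random drawing strategy.

```python
import numpy as np

def em_step(X, weights, means, covs):
    """One EM iteration for a Gaussian mixture with unconstrained covariances."""
    n, d = X.shape
    K = len(weights)
    log_resp = np.empty((n, K))
    for k in range(K):
        diff = X - means[k]
        _, logdet = np.linalg.slogdet(covs[k])
        # Squared Mahalanobis distance of each row to component k
        maha = np.einsum("ij,ij->i", diff @ np.linalg.inv(covs[k]), diff)
        log_resp[:, k] = np.log(weights[k]) - 0.5 * (d * np.log(2 * np.pi) + logdet + maha)
    # E-step: posterior membership probabilities, normalized stably
    log_resp -= log_resp.max(axis=1, keepdims=True)
    resp = np.exp(log_resp)
    resp /= resp.sum(axis=1, keepdims=True)
    # M-step: weighted proportions, means, and covariances
    nk = resp.sum(axis=0)
    new_weights = nk / n
    new_means = (resp.T @ X) / nk[:, None]
    new_covs = np.empty((K, d, d))
    for k in range(K):
        diff = X - new_means[k]
        new_covs[k] = (resp[:, k, None] * diff).T @ diff / nk[k]
    return new_weights, new_means, new_covs

def moment_residuals(X, weights, means, covs):
    """Largest absolute violation of the first- and second-moment relations."""
    xbar = X.mean(axis=0)
    S = (X - xbar).T @ (X - xbar) / len(X)  # empirical covariance, divisor n
    mean_gap = weights @ means - xbar
    centered = means - xbar
    mix_cov = np.einsum("k,kij->ij", weights, covs)
    mix_cov = mix_cov + (weights[:, None] * centered).T @ centered
    return np.abs(mean_gap).max(), np.abs(mix_cov - S).max()

# Simulated two-component data (an assumption for illustration)
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (150, 2)), rng.normal(4.0, 1.5, (100, 2))])

# Arbitrary starting values, which need not satisfy the relations
weights = np.array([0.3, 0.7])
means = X[rng.choice(len(X), 2, replace=False)]
covs = np.stack([np.eye(2), np.eye(2)])

before = moment_residuals(X, weights, means, covs)
weights, means, covs = em_step(X, weights, means, covs)
after = moment_residuals(X, weights, means, covs)
print("residuals before EM step:", before)
print("residuals after EM step: ", after)
```

After a single iteration both residuals should drop to floating-point noise, which illustrates why drawing starting values inside the set defined by such relations keeps them in the region that EM trajectories actually visit.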
Pages: 267-279
Number of pages: 13