Model-based cluster and discriminant analysis with the MIXMOD software

Cited by: 106
Authors
Biernacki, Christophe [1]
Celeux, Gilles
Govaert, Gerard
Langrognet, Florent
Affiliations
[1] CNRS, UMR 8524, F-59655 Villeneuve d'Ascq, France
[2] Univ Lille 1, F-59655 Villeneuve d'Ascq, France
[3] INRIA Futurs, F-91405 Orsay, France
[4] Univ Technol Compiegne, F-60205 Compiegne, France
[5] CNRS, UMR 6599, F-60205 Compiegne, France
[6] Univ Franche Comte, F-25030 Besancon, France
[7] CNRS, UMR 6623, F-25030 Besancon, France
Keywords
Gaussian models; EM-like algorithms; model selection
DOI
10.1016/j.csda.2005.12.015
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
The Mixture Modeling (MIXMOD) program fits mixture models to a given data set for the purposes of density estimation, clustering or discriminant analysis. A large variety of algorithms to estimate the mixture parameters are proposed (EM, Classification EM, Stochastic EM), and it is possible to combine these to yield different strategies for obtaining a sensible maximum for the likelihood (or complete-data likelihood) function. MIXMOD is currently intended to be used for multivariate Gaussian mixtures, and fourteen different Gaussian models can be distinguished according to different assumptions regarding the component variance matrix eigenvalue decomposition. Moreover, different information criteria for choosing a parsimonious model (the number of mixture components, for instance) are included, their suitability depending on the particular perspective (cluster analysis or discriminant analysis). Written in C++, MIXMOD is interfaced with SCILAB and MATLAB. The program, the statistical documentation and the user guide are available on the internet at the following address: http://www-math.univ-fcomte.fr/mixmod/index.php. (c) 2006 Elsevier B.V. All rights reserved.
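To make the abstract's workflow concrete (fit a Gaussian mixture by EM, then compare candidate numbers of components with an information criterion), here is a minimal sketch in pure Python. It is not MIXMOD code and does not use MIXMOD's API. It assumes univariate data, plain EM only, a quantile-based start, a small variance floor, and BIC in the "lower is better" form -2 log L + p log n with p = 3k - 1 free parameters. The names `fit_em` and `bic` are hypothetical and chosen for illustration. The Classification EM variant would replace the posterior probabilities in the E step with hard assignments, and Stochastic EM would draw the assignments at random from them.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a univariate Gaussian N(mean, var) at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def fit_em(data, k, n_iter=200):
    """Fit a k-component univariate Gaussian mixture by plain EM.

    Returns (weights, means, variances, log_likelihood).
    Initialisation and the variance floor are assumptions of this sketch.
    """
    n = len(data)
    ordered = sorted(data)
    overall_mean = sum(data) / n
    overall_var = sum((x - overall_mean) ** 2 for x in data) / n
    weights = [1.0 / k] * k
    # Start the means at evenly spaced quantiles of the data.
    means = [ordered[int((j + 0.5) * n / k)] for j in range(k)]
    variances = [overall_var] * k

    for _ in range(n_iter):
        # E step: posterior probability of each component for each point.
        resp = []
        for x in data:
            joint = [w * normal_pdf(x, m, v)
                     for w, m, v in zip(weights, means, variances)]
            total = sum(joint) or 1e-300
            resp.append([j / total for j in joint])
        # M step: re-estimate proportions, means and variances.
        for j in range(k):
            nj = max(sum(r[j] for r in resp), 1e-12)
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            variances[j] = max(
                sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj,
                1e-6,  # floor that guards against a collapsing component
            )

    log_lik = sum(
        math.log(max(sum(w * normal_pdf(x, m, v)
                         for w, m, v in zip(weights, means, variances)), 1e-300))
        for x in data
    )
    return weights, means, variances, log_lik


def bic(data, k):
    """BIC = -2 log L + p log n, with p = 3k - 1 free parameters (lower is better)."""
    log_lik = fit_em(data, k)[3]
    n_params = 3 * k - 1  # (k - 1) proportions + k means + k variances
    return -2.0 * log_lik + n_params * math.log(len(data))


# Synthetic two-cluster data, generated only for this illustration.
rng = random.Random(0)
data = ([rng.gauss(0.0, 1.0) for _ in range(150)]
        + [rng.gauss(6.0, 1.0) for _ in range(150)])
scores = {k: bic(data, k) for k in (1, 2, 3)}
best_k = min(scores, key=scores.get)
print(scores, best_k)
```

In the multivariate setting the abstract describes, each component would carry a full variance matrix, and the fourteen models would correspond to different constraints on its eigenvalue decomposition. The univariate sketch above has no counterpart to that choice.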
Pages: 587-600
Number of pages: 14