Mixture models for classification

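Cited by: 13
Author
Celeux, Gilles [1]
Affiliation
[1] Inria Futurs, Orsay, France
Source
Advances in Data Analysis | 2007
DOI
10.1007/978-3-540-70981-7_1
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract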
Finite mixture distributions provide efficient approaches to model-based clustering and classification. The advantages of mixture models for unsupervised classification are reviewed. The article then focuses on the model selection problem. It advocates taking the modeling purpose into account when selecting a model, in both the unsupervised and the supervised classification contexts. This point of view has led to the definition of two penalized likelihood criteria, ICL and BEC, which are presented and discussed. The ICL criterion is an approximation of the integrated completed likelihood and is concerned with model-based cluster analysis. The BEC criterion is an approximation of the integrated conditional likelihood and is concerned with generative models of classification. The behavior of ICL for choosing the number of components in a mixture model, and of BEC for choosing a model that minimizes the expected error rate, is analyzed in contrast with standard model selection criteria.
Pages: 3-14
Number of pages: 12