Alternative EM methods for nonparametric finite mixture models

Cited by: 39
Authors
Pilla, RS
Lindsay, BG
Affiliations
[1] Univ Illinois, Div Epidemiol & Biostat, Chicago, IL 60612 USA
[2] Penn State Univ, Dept Stat, University Pk, PA 16802 USA
Funding
National Institutes of Health (USA); National Science Foundation (USA);
Keywords
augmentation; complete data; EM algorithm; finite mixture distribution; high-dimensional; maximum likelihood; missing data; rate of convergence; nonparametric mixture; zero-elimination;
DOI
10.1093/biomet/88.2.535
Chinese Library Classification number
Q [Biological sciences];
Discipline classification code
07 ; 0710 ; 09 ;
Abstract
This research focuses on a general class of maximum likelihood problems in which it is desired to maximise a nonparametric mixture likelihood with finitely many known component densities over the set of unknown weight parameters. Convergence of the conventional EM algorithm for this problem is extremely slow when the component densities are poorly separated and when the maximum likelihood estimator requires some of the weights to be zero, as the algorithm can never reach such a boundary point. Alternative methods based on the principles of EM are developed using a two-stage approach. First, a new data augmentation scheme provides improved convergence rates in certain parameter directions. Secondly, two 'cyclic versions' of this data augmentation are created by changing the missing data formulation between the EM-steps; these extend the acceleration directions to the whole parameter space, giving another order of magnitude increase in convergence rate. Examples indicate that the new cyclic versions of the data augmentation schemes can converge up to 500 times faster than the conventional EM algorithm for fitting nonparametric finite mixture models.
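For orientation, the conventional EM update that the abstract takes as its baseline can be sketched as follows. This is a minimal illustration of the standard EM iteration for mixture weights with known component densities, not the authors' augmentation or cyclic schemes, which the abstract does not specify. The toy data, the unit-variance normal components, and the function names (`normal_pdf`, `em_step`, `log_likelihood`, `fit_weights`) are all assumptions made for the example.

```python
import math


def normal_pdf(x, mean, sd=1.0):
    """Density of a normal distribution (assumed component family for this toy example)."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def em_step(weights, dens):
    """One conventional EM update of the mixture weights.

    dens[i][j] holds the known component density f_j evaluated at observation x_i.
    """
    n = len(dens)
    new = [0.0] * len(weights)
    for row in dens:
        # Mixture density at this observation under the current weights.
        mix = sum(w * f for w, f in zip(weights, row))
        # E-step: posterior probability that the observation came from component j.
        for j, (w, f) in enumerate(zip(weights, row)):
            new[j] += w * f / mix
    # M-step: each new weight is the average posterior membership probability.
    return [v / n for v in new]


def log_likelihood(weights, dens):
    """Mixture log-likelihood as a function of the weights."""
    return sum(math.log(sum(w * f for w, f in zip(weights, row))) for row in dens)


def fit_weights(dens, n_iter=200):
    """Run a fixed number of EM steps from uniform starting weights."""
    m = len(dens[0])
    weights = [1.0 / m] * m
    for _ in range(n_iter):
        weights = em_step(weights, dens)
    return weights


# Hypothetical toy data: two clusters and three poorly separated candidate components.
data = [-0.3, 0.1, 0.4, -0.2, 0.0, 0.25, 2.9, 3.1, 3.3, 2.8]
means = [0.0, 1.5, 3.0]
dens = [[normal_pdf(x, mu) for mu in means] for x in data]
weights = fit_weights(dens)
```

Because each update multiplies the current weight by a positive factor, a weight that starts positive stays positive in exact arithmetic. This is consistent with the abstract's remark that the conventional algorithm can never reach a boundary point where some weights are exactly zero.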
Pages: 535-550
Number of pages: 16