Singularity and Slow Convergence of the EM algorithm for Gaussian Mixtures

Cited by: 25
Authors
Park, Hyeyoung [1]
Ozeki, Tomoko [2]
Affiliations
[1] Kyungpook Natl Univ, Sch Elect Engn & Comp Sci, Taegu 702701, South Korea
[2] Tokai Univ, Dept Human & Informat Sci, Kanagawa 2591292, Japan
Funding
Japan Society for the Promotion of Science;
Keywords
EM algorithm; Gradient descent learning; Learning dynamics; Singularity; Slow convergence; NEURAL-NETWORK REGRESSION; SOFT COMMITTEE MACHINES; MULTILAYER PERCEPTRONS; DYNAMICS; MODELS;
DOI
10.1007/s11063-009-9094-4
Chinese Library Classification
TP18 [Artificial intelligence theory];
Subject classification codes
081104; 0812; 0835; 1405;
Abstract
Singularities in the parameter spaces of hierarchical learning machines are known to be a main cause of the slow convergence of gradient descent learning. The EM algorithm, another learning algorithm that yields a maximum likelihood estimator, also suffers from slow convergence, which often appears when the overlap between mixture components is large. We analyze the dynamics of the EM algorithm for Gaussian mixtures around singularities and show that there exists a slow manifold caused by the singular structure, which is closely related to the slow convergence of the EM algorithm. We also conduct numerical simulations to confirm the theoretical analysis. Through the simulations, we compare the dynamics of the EM algorithm with that of the gradient descent algorithm, and show that their slow dynamics are caused by the same singular structure and hence exhibit the same behavior around singularities.
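As an informal illustration of the phenomenon described in the abstract, the sketch below runs standard EM for a two-component one-dimensional Gaussian mixture whose components overlap heavily, starting near the singular region where the two components coincide; the data, initialization, and parameter names are illustrative assumptions, not the authors' experimental setup.

# A minimal sketch (not the paper's code): EM for a two-component 1-D Gaussian
# mixture with heavily overlapping components. With this setup the estimated
# means typically separate very slowly, which is the slow-convergence behavior
# the paper attributes to the singular structure of the parameter space.
import numpy as np

rng = np.random.default_rng(0)

# Data from two heavily overlapping components (illustrative assumption).
n = 2000
x = np.concatenate([rng.normal(-0.25, 1.0, n // 2),
                    rng.normal(+0.25, 1.0, n // 2)])

def normal_pdf(x, mu, var):
    # Density of N(mu, var) evaluated at x.
    return np.exp(-0.5 * (x - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)

# Initialize near the singular subspace: nearly equal means, equal variances.
w = np.array([0.5, 0.5])
mu = np.array([-0.05, 0.05])
var = np.array([1.0, 1.0])

for t in range(500):
    # E-step: posterior responsibility of each component for each data point.
    dens = w[None, :] * normal_pdf(x[:, None], mu[None, :], var[None, :])
    resp = dens / dens.sum(axis=1, keepdims=True)

    # M-step: weighted maximum-likelihood updates of weights, means, variances.
    nk = resp.sum(axis=0)
    w = nk / n
    mu = (resp * x[:, None]).sum(axis=0) / nk
    var = (resp * (x[:, None] - mu[None, :]) ** 2).sum(axis=0) / nk

    if t % 100 == 0:
        # Log-likelihood at the parameters used in this E-step.
        loglik = np.log(dens.sum(axis=1)).sum()
        print(f"iter {t:4d}  log-lik {loglik:10.3f}  |mu1-mu2| {abs(mu[0] - mu[1]):.4f}")

Running the sketch with a larger separation between the true means (e.g. -1.0 and +1.0) makes the contrast visible: EM then converges in a handful of iterations, whereas in the overlapping case the mean gap shrinks extremely slowly, consistent with the slow-manifold picture analyzed in the paper.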
Pages: 45-59
Number of pages: 15