Phase transitions in stochastic self-organizing maps

被引:67
作者
Graepel, T
Burger, M
Obermayer, K
机构
[1] Fachbereich Informatik FR 2-1, Technische Universität Berlin, Berlin, 10587
来源
PHYSICAL REVIEW E | 1997年 / 56卷 / 04期
关键词
D O I
10.1103/PhysRevE.56.3876
中图分类号
O35 [流体力学]; O53 [等离子体物理学];
学科分类号
070204 ; 080103 ; 080704 ;
摘要
We describe the development of neighborhood-preserving stochastic maps in terms of a probabilistic clustering problem. Starting from a cost function for central clustering that incorporates distortions from channel noise, we derive a soft topographic vector quantization algorithm (STVQ) which is based on the maximum entropy principle, and which maximizes the corresponding likelihood in an expectation-maximization fashion. Among other algorithms, a probabilistic version of Kohonen's self-organizing map (SOM) is derived from STVQ as a computationally efficient approximation of the E step. The foundation of STVQ in statistical physics motivates a deterministic annealing scheme in the temperature parameter beta, and leads to a robust minimization algorithm of the clustering cost function. In particular, this scheme offers an alternative to the common stepwise shrinking of the neighborhood width in the SOM, and makes it possible to use its neighborhood function solely to encode the desired neighborhood relations between the clusters. The annealing in beta, which corresponds to a stepwise refinement of the resolution of representation in data space, leads to the splitting of an existing cluster representation during the ''cooling'' process. We describe this phase transition in terms of the covariance matrix C of the data and the transition matrix H of the channel noise, and calculate the critical temperatures and modes as functions of the eigenvalues and eigenvectors of C and H. The analysis is extended to the phenomenon of the automatic selection of feature dimensions in dimension-reducing maps, thus leading to a ''batch'' alternative to the Fokker-Planck formalism for on-line learning. The results provide insights into the relation between the width of the neighborhood and the temperature parameter beta: It is shown that the phase transition which leads to the representation of the excess dimensions can be triggered not only by a change in the statistics of the input data but also by an increase of beta, which corresponds to a decrease in noise level. The theoretical results are validated by numerical methods. In particular, a quantity equivalent to the heat capacity in thermodynamics is introduced to visualize the properties of the annealing process.
引用
收藏
页码:3876 / 3890
页数:15
相关论文
共 29 条
[1]   VECTOR QUANTIZATION WITH COMPLEXITY COSTS [J].
BUHMANN, J ;
KUHNEL, H .
IEEE TRANSACTIONS ON INFORMATION THEORY, 1993, 39 (04) :1133-1145
[2]  
BUHMANN J, UNPUB
[3]   MAXIMUM LIKELIHOOD FROM INCOMPLETE DATA VIA EM ALGORITHM [J].
DEMPSTER, AP ;
LAIRD, NM ;
RUBIN, DB .
JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B-METHODOLOGICAL, 1977, 39 (01) :1-38
[4]   A DIMENSION REDUCTION FRAMEWORK FOR UNDERSTANDING CORTICAL MAPS [J].
DURBIN, R ;
MITCHISON, G .
NATURE, 1990, 343 (6259) :644-647
[5]   MODELS OF ORIENTATION AND OCULAR DOMINANCE COLUMNS IN THE VISUAL-CORTEX - A CRITICAL COMPARISON [J].
ERWIN, E ;
OBERMAYER, K ;
SCHULTEN, K .
NEURAL COMPUTATION, 1995, 7 (03) :425-468
[6]   STOCHASTIC RELAXATION, GIBBS DISTRIBUTIONS, AND THE BAYESIAN RESTORATION OF IMAGES [J].
GEMAN, S ;
GEMAN, D .
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 1984, 6 (06) :721-741
[7]  
Heskes T. M., 1993, P IEEE ICNN, VIII, P1219
[8]   INFORMATION THEORY AND STATISTICAL MECHANICS [J].
JAYNES, ET .
PHYSICAL REVIEW, 1957, 106 (04) :620-630
[9]  
KARHUNEN K, 1947, ANN ACAD SCI FENN A1, V37, P3
[10]   ANALYSIS OF A SIMPLE SELF-ORGANIZING PROCESS [J].
KOHONEN, T .
BIOLOGICAL CYBERNETICS, 1982, 44 (02) :135-140