Hierarchical, unsupervised learning with growing via phase transitions

被引:15
作者
Miller, D
Rose, K
机构
[1] Dept. of Elec. and Comp. Engineering, University of California, Santa Barbara
关键词
D O I
10.1162/neco.1996.8.2.425
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
We address unsupervised learning subject to structural constraints, with particular emphasis placed on clustering with an imposed decision tree structure. Most known methods are greedy, optimizing one node of the tree at a time to minimize a local cost. By constrast, we develop a joint optimization method, derived based on information-theoretic principles and closely related to known methods in statistical physics. The approach is inspired by the deterministic annealing algorithm for unstructured data clustering, which was based on maximum entropy inference. The new approach is founded on the principle of minimum cross-entropy, using informative priors to approximate the unstructured clustering solution while imposing the structural constraint. The resulting method incorporates supervised learning principles applied in an unsupervised problem setting. In our approach, the tree ''grows'' by a sequence of bifurcations that occur while optimizing an effective free energy cost at decreasing temperature scales. Thus, estimates of the tree size and structure are naturally obtained at each temperature in the process. Examples demonstrate considerable improvement over known methods.
引用
收藏
页码:425 / 450
页数:26
相关论文
共 42 条
[1]  
[Anonymous], 1963, PRINCIPLES NUMERICAL
[2]   A CLUSTERING TECHNIQUE FOR SUMMARIZING MULTIVARIATE DATA [J].
BALL, GH ;
HALL, DJ .
BEHAVIORAL SCIENCE, 1967, 12 (02) :153-&
[4]  
Breiman LJH, 1980, CLASSIFICATION REGRE
[5]   COMPLEXITY OPTIMIZED DATA CLUSTERING BY COMPETITIVE NEURAL NETWORKS [J].
BUHMANN, J ;
KUHNEL, H .
NEURAL COMPUTATION, 1993, 5 (01) :75-88
[6]  
BUHMANN J, 1994, IEEE T INFORM THEORY, V39, P1133
[7]   SPEECH CODING BASED UPON VECTOR QUANTIZATION [J].
BUZO, A ;
GRAY, AH ;
GRAY, RM ;
MARKEL, JD .
IEEE TRANSACTIONS ON ACOUSTICS SPEECH AND SIGNAL PROCESSING, 1980, 28 (05) :562-574
[8]   OPTIMAL PRUNING WITH APPLICATIONS TO TREE-STRUCTURED SOURCE-CODING AND MODELING [J].
CHOU, PA ;
LOOKABAUGH, T ;
GRAY, RM .
IEEE TRANSACTIONS ON INFORMATION THEORY, 1989, 35 (02) :299-315
[9]  
Cover T. M., 2005, ELEM INF THEORY, DOI 10.1002/047174882X
[10]   MAXIMUM LIKELIHOOD FROM INCOMPLETE DATA VIA EM ALGORITHM [J].
DEMPSTER, AP ;
LAIRD, NM ;
RUBIN, DB .
JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B-METHODOLOGICAL, 1977, 39 (01) :1-38