A Bayesian approach to model selection in hierarchical mixtures-of-experts architectures

Cited by: 43
Authors
Jacobs, RA
Peng, FC
Tanner, MA
Affiliations
[1] UNIV NEBRASKA,LINCOLN,NE
[2] NORTHWESTERN UNIV,EVANSTON,IL 60208
Funding
National Institutes of Health (USA); National Science Foundation (USA);
Keywords
modular architecture; hierarchical architecture; model selection; Bayesian analysis; Gibbs sampling;
DOI
10.1016/S0893-6080(96)00050-0
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
There does not exist a statistical model that shows good performance on all tasks. Consequently, the model selection problem is unavoidable; investigators must decide which model is best at summarizing the data for each task of interest. This article presents an approach to the model selection problem in hierarchical mixtures-of-experts architectures. These architectures combine aspects of generalized linear models with those of finite mixture models in order to perform tasks via a recursive "divide-and-conquer" strategy. Markov chain Monte Carlo methodology is used to estimate the distribution of the architectures' parameters. One part of our approach to model selection attempts to estimate the worth of each component of an architecture so that relatively unused components can be pruned from the architecture's structure. A second part of this approach uses a Bayesian hypothesis testing procedure in order to differentiate inputs that carry useful information from nuisance inputs. Simulation results suggest that the approach presented here adheres to the dictum of Occam's razor; simple architectures that are adequate for summarizing the data are favored over more complex structures. (C) 1997 Elsevier Science Ltd. All Rights Reserved.
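As a rough illustration of the recursive "divide-and-conquer" structure the abstract describes, the sketch below computes the predictive mean of a two-level hierarchical mixture of experts. It is a minimal sketch under stated assumptions, not the authors' implementation: the experts are assumed to be linear, the gates are assumed to be softmax functions of the input, and every name (`hme_predict`, `leaf_weights`, `top_gate`, `sub_gates`, `experts`) is hypothetical. It does not cover the Markov chain Monte Carlo estimation or the Bayesian hypothesis tests.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(w, x):
    """Inner product of two equal-length lists."""
    return sum(wi * xi for wi, xi in zip(w, x))


def leaf_weights(x, top_gate, sub_gates):
    """Gating weight of each expert: top-level gate times lower-level gate.

    top_gate:  one parameter vector per top-level branch (assumed layout).
    sub_gates: for each branch, one parameter vector per expert below it.
    """
    g_top = softmax([dot(v, x) for v in top_gate])
    weights = []
    for i, g_i in enumerate(g_top):
        g_sub = softmax([dot(v, x) for v in sub_gates[i]])
        weights.append([g_i * g_ij for g_ij in g_sub])
    return weights


def hme_predict(x, top_gate, sub_gates, experts):
    """Predictive mean: gate-weighted sum of the linear experts' outputs."""
    weights = leaf_weights(x, top_gate, sub_gates)
    return sum(
        w_ij * dot(experts[i][j], x)
        for i, row in enumerate(weights)
        for j, w_ij in enumerate(row)
    )


if __name__ == "__main__":
    x = [1.0, 2.0]  # first entry acts as a bias term (assumption)
    zero_gate = [[0.0, 0.0], [0.0, 0.0]]  # uninformative gates
    experts = [[[1.0, 0.0], [2.0, 0.0]], [[3.0, 0.0], [4.0, 0.0]]]
    print(hme_predict(x, zero_gate, [zero_gate, zero_gate], experts))
```

The per-expert gating weights returned by `leaf_weights` sum to one for any input. Inspecting such weights over a data set is one plausible way to see which experts are rarely used, although the abstract does not specify how the article measures a component's worth.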
Pages: 231-241
Page count: 11