An information theoretic approach for combining neural network process models

Cited by: 49
Authors:
Sridhar, DV
Bartlett, EB [1]
Seagrave, RC
Affiliations:
[1] Iowa State Univ, Dept Chem Engn, Ames, IA 50011 USA
[2] Iowa State Univ, Dept Elect & Comp Engn, Ames, IA 50011 USA
Keywords:
chemical process; combining models; stacked generalization; information theory; prediction; generalization; subset selection
DOI:
10.1016/S0893-6080(99)00030-1
Chinese Library Classification (CLC):
TP18 [Artificial Intelligence Theory]
Subject Classification Codes:
081104; 0812; 0835; 1405
Abstract
Typically, neural network modelers in chemical engineering focus on identifying and using a single, hopefully optimal, neural network model. Using a single optimal model implicitly assumes that one neural network model can extract all the information available in a given data set and that the other candidate models are redundant. In general, there is no assurance that any individual model has extracted all relevant information from the data set. Recently, Wolpert (Neural Networks, 5(2), 241 (1992)) proposed the idea of stacked generalization to combine multiple models. Sridhar, Seagrave and Bartlett (AIChE J., 42, 2529 (1996)) implemented stacked generalization for neural network models by integrating multiple neural networks into an architecture known as stacked neural networks (SNNs). SNNs consist of a combination of the candidate neural networks and were shown to provide improved modeling of chemical processes. However, in Sridhar's work, SNNs were limited to a linear combination of artificial neural networks. While a linear combination is simple and easy to use, it can exploit only those model outputs that have a high linear correlation to the output; models that are useful in a nonlinear sense are wasted if a linear combination is used. In this work we propose an information theoretic stacking (ITS) algorithm for combining neural network models. The ITS algorithm identifies and combines useful models regardless of the nature of their relationship to the actual output. The power of the ITS algorithm is demonstrated through three examples, including application to a dynamic process modeling problem. The results obtained demonstrate that SNNs developed using the ITS algorithm can achieve substantially improved performance compared to selecting and using a single, hopefully optimal, network or using SNNs based on a linear combination of neural networks. (C) 1999 Elsevier Science Ltd. All rights reserved.
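The abstract describes the ITS idea only at a high level. As a rough illustration (not the authors' algorithm), the Python sketch below assumes the "information theoretic" criterion is a histogram estimate of the mutual information between each candidate network's predictions and the target, and that the selected candidates are then combined by a simple second-level model. All names (mutual_information, select_candidates, fit_combiner) and the selection threshold are hypothetical.

import numpy as np

def mutual_information(x, y, bins=10):
    """Histogram-based estimate of the mutual information I(X; Y) in nats."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)   # marginal of x, shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)   # marginal of y, shape (1, bins)
    nz = pxy > 0
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))

def select_candidates(candidate_preds, target, threshold=0.1):
    """Keep candidate models whose predictions share information with the
    target, whether that relationship is linear or not."""
    return [i for i, p in enumerate(candidate_preds)
            if mutual_information(p, target) > threshold]

def fit_combiner(candidate_preds, target, selected):
    """Least-squares combiner over the selected outputs (a placeholder for
    whatever second-level model the stacking scheme actually uses)."""
    X = np.column_stack([candidate_preds[i] for i in selected]
                        + [np.ones_like(target)])
    coeffs, *_ = np.linalg.lstsq(X, target, rcond=None)
    return coeffs

# Toy illustration with three synthetic "candidate models":
# one linearly related to the target, one nonlinearly related, one pure noise.
rng = np.random.default_rng(0)
y = rng.normal(size=2000)
candidates = [y + 0.1 * rng.normal(size=2000),                  # linear
              np.tanh(3.0 * y) + 0.1 * rng.normal(size=2000),   # nonlinear
              rng.normal(size=2000)]                             # uninformative
selected = select_candidates(candidates, y)
print("selected candidates:", selected)   # expected: [0, 1]
print("combiner weights:", fit_combiner(candidates, y, selected))

In this toy run, the nonlinearly related candidate is retained even though its linear correlation with the target would understate its usefulness, which is the motivation the abstract gives for moving beyond a linear combination of networks.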
Pages: 915-926 (12 pages)
References (28 in total):
[1] Ash, R. B. (1990). Information Theory.
[2] Bhat, N., & McAvoy, T. J. (1990). Use of neural nets for dynamic modeling and control of chemical process systems. Computers & Chemical Engineering, 14(4-5), 573-583.
[3] Cybenko, G. (1989). Mathematics of Control, Signals, and Systems, 2, 303. DOI: 10.1007/BF02551274
[4] De Veaux, R. D., Psichogios, D. C., & Ungar, L. H. (1993). A comparison of two nonparametric estimation schemes: MARS and neural networks. Computers & Chemical Engineering, 17(8), 819-837.
[5] Economou, C. G., Morari, M., & Palsson, B. O. (1986). Internal model control. 5. Extension to nonlinear systems. Industrial & Engineering Chemistry Process Design and Development, 25(2), 403-411.
[6] Efron, B. (1993). An Introduction to the Bootstrap.
[7] Fahlman, S. E. (1990). Advances in Neural Information Processing Systems, p. 524. DOI: 10.1190/1.1821929
[8] Friedman, J. H. (1991). Multivariate adaptive regression splines. Annals of Statistics, 19(1), 1-67.
[9] Fukunaga, K. (1990). Introduction to Statistical Pattern Recognition.
[10] Hansen, L. K., & Salamon, P. (1990). Neural network ensembles. IEEE Transactions on Pattern Analysis and Machine Intelligence, 12(10), 993-1001.