Ensembling neural networks: Many could be better than all

Cited by: 1398
Authors
Zhou, ZH [1]
Wu, JX [1]
Tang, W [1]
Affiliations
[1] Nanjing Univ, Natl Lab Novel Software Technol, Nanjing 210093, Peoples R China
Keywords
neural networks; neural network ensemble; machine learning; selective ensemble; boosting; bagging; genetic algorithm; bias-variance decomposition
DOI
10.1016/S0004-3702(02)00190-X
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
A neural network ensemble is a learning paradigm in which many neural networks are jointly used to solve a problem. In this paper, the relationship between an ensemble and its component neural networks is analyzed in the context of both regression and classification, which reveals that it may be better to ensemble many instead of all of the neural networks at hand. This result is interesting because most current approaches ensemble all the available neural networks for prediction. Then, to show that the appropriate neural networks for composing an ensemble can be effectively selected from a set of available networks, an approach named GASEN is presented. GASEN first trains a number of neural networks. It then assigns random weights to those networks and employs a genetic algorithm to evolve the weights so that they characterize, to some extent, the fitness of the neural networks in constituting an ensemble. Finally, it selects some of the neural networks, based on the evolved weights, to make up the ensemble. A large empirical study shows that, compared with popular ensemble approaches such as Bagging and Boosting, GASEN can generate neural network ensembles with far smaller sizes but stronger generalization ability. Furthermore, to explain the working mechanism of GASEN, a bias-variance decomposition of the error is provided, which shows that the success of GASEN may lie in its ability to significantly reduce both the bias and the variance. (C) 2002 Elsevier Science B.V. All rights reserved.
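Since the abstract walks through GASEN's procedure (train, weight, evolve, select), a minimal sketch may help make it concrete. The Python/NumPy code below is an illustrative reconstruction under stated assumptions, not the authors' implementation: the component networks are taken as already trained and are represented only by their predictions on a held-out validation set; the task is regression, so validation mean squared error serves as the genetic algorithm's fitness signal; and the genetic operators (truncation selection, uniform crossover, Gaussian mutation), population size, and generation count are generic choices. The selection threshold of 1/N reflects the paper's idea of keeping the networks whose evolved weight exceeds what uniform averaging would give them.

import numpy as np

def ensemble_error(weights, preds, targets):
    """Validation MSE of the weighted ensemble (regression setting)."""
    w = weights / weights.sum()            # normalize so weights sum to 1
    combined = preds.T @ w                 # weighted average of member outputs
    return np.mean((combined - targets) ** 2)

def gasen_select(preds, targets, pop_size=50, generations=100,
                 mutation_scale=0.05, seed=None):
    """Evolve member weights with a simple GA, then keep the networks whose
    evolved weight exceeds 1/N (one reading of GASEN's selection threshold)."""
    rng = np.random.default_rng(seed)
    n = preds.shape[0]
    pop = rng.random((pop_size, n))        # random initial weight vectors
    for _ in range(generations):
        fitness = np.array([ensemble_error(ind, preds, targets) for ind in pop])
        elite = pop[np.argsort(fitness)[: pop_size // 2]]   # lower error = fitter
        # Offspring: uniform crossover between random elite parents, then
        # Gaussian mutation, clipped so weights stay positive.
        pairs = elite[rng.integers(0, len(elite), (pop_size - len(elite), 2))]
        mask = rng.random((len(pairs), n)) < 0.5
        children = np.where(mask, pairs[:, 0], pairs[:, 1])
        children = np.clip(children + rng.normal(0.0, mutation_scale, children.shape),
                           1e-9, None)
        pop = np.vstack([elite, children])
    errors = [ensemble_error(ind, preds, targets) for ind in pop]
    best = pop[int(np.argmin(errors))]
    best = best / best.sum()
    return np.flatnonzero(best > 1.0 / n)  # indices of the selected networks

# Example with random stand-ins for the members' validation predictions:
# preds has shape (n_networks, n_samples), targets has shape (n_samples,).
# Here the targets correlate with the first four members, so the GA should
# tend to select from among them.
rng = np.random.default_rng(0)
preds = rng.normal(size=(10, 200))
targets = preds[:4].mean(axis=0) + 0.1 * rng.normal(size=200)
print(gasen_select(preds, targets, seed=0))

In the paper, the selected subset is then combined by simple averaging of the members' outputs for regression, or by voting for classification.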
Pages: 239-263
Page count: 25