Optimal ensemble averaging of neural networks

Cited by: 157
Authors
Naftaly, U [1]
Intrator, N [1]
Horn, D [1]
Affiliations
[1] Tel Aviv University, Raymond & Beverly Sackler Faculty of Exact Sciences, School of Mathematical Sciences, IL-69978 Tel Aviv, Israel
DOI
10.1088/0954-898X/8/3/004
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Based on an observation about the different effects of ensemble averaging on the bias and variance portions of the prediction error, we discuss training methodologies for ensembles of networks. We demonstrate the effect of variance reduction and present a method of extrapolation to the limit of an infinite ensemble. A significant reduction of variance is obtained by averaging just over the initial conditions of the neural networks, without varying architectures or training sets. The minimum of the ensemble prediction error is reached later than that of a single network. In the vicinity of the minimum, the ensemble prediction error appears to be flatter than that of the single network, thus simplifying the optimal stopping decision. The results are demonstrated on sunspots data, where the predictions are among the best obtained, and on the 1993 energy prediction competition data set B.
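The procedure summarized in the abstract can be illustrated in a few lines of code. The Python sketch below is not the authors' implementation; it is a minimal example, assuming scikit-learn's MLPRegressor as the base network and toy regression data standing in for a time series, of training an ensemble whose members differ only in their random initial weights and then averaging their predictions, which reduces the variance component of the error while leaving the bias essentially unchanged.

# Minimal sketch (not the authors' code): ensemble averaging over random
# initial conditions only. Architecture and training data are identical for
# all members; only the weight initialization (random_state) varies.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)

# Toy regression data standing in for a time-series prediction task.
X = rng.uniform(-1.0, 1.0, size=(200, 5))
y = np.sin(X.sum(axis=1)) + 0.1 * rng.standard_normal(200)
X_test = rng.uniform(-1.0, 1.0, size=(50, 5))

def train_member(seed):
    """Train one ensemble member; only the initial weights differ via `seed`."""
    net = MLPRegressor(hidden_layer_sizes=(10,), max_iter=2000, random_state=seed)
    return net.fit(X, y)

ensemble = [train_member(seed) for seed in range(10)]

# Ensemble prediction: the plain average of the members' outputs.
predictions = np.mean([net.predict(X_test) for net in ensemble], axis=0)

In the paper, the same kind of averaging is carried out for ensembles of increasing size and the resulting error is extrapolated to the infinite-ensemble limit; the sketch above only shows the plain average for one fixed ensemble size.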
Pages: 283-296
Page count: 14
Related papers (19 in total)
[1] [Anonymous] (2018). Time Series Prediction.
[2] Elman, J. L. (1990). Finding structure in time. Cognitive Science, 14(2), 179-211.
[3] Elman, J. L., & Zipser, D. (1988). Learning the hidden structure of speech. Journal of the Acoustical Society of America, 83(4), 1615-1626.
[4] Geman, S., Bienenstock, E., & Doursat, R. (1992). Neural networks and the bias/variance dilemma. Neural Computation, 4(1), 1-58.
[5] Hertz, J. (1991). Santa Fe Institute Lecture Notes, Vol. 1.
[6] Hinton, G. E. (1986). Proceedings of the 8th Annual Conference of the Cognitive Science Society, p. 12.
[7] Lincoln, W. P. (1990). Advances in Neural Information Processing Systems, Vol. 2, p. 650.
[8] MacKay, D. (1994). ASHRAE Transactions, 100, p. 1053.
[9] Morris, J. (1977). Journal of the Royal Statistical Society, Series A, 140, p. 437.
[10] Nowlan, S. J., & Hinton, G. E. (1992). Simplifying neural networks by soft weight-sharing. Neural Computation, 4(4), 473-493.