ERROR SURFACES FOR MULTILAYER PERCEPTRONS

Cited by: 34
Authors
HUSH, DR
HORNE, B
SALAS, JM
Affiliation
[1] Department of Electrical Engineering and Computer Engineering, University of New Mexico, Albuquerque, NM
Source
IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS | 1992 / Vol. 22 / No. 5
DOI
10.1109/21.179853
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
The paper explores the characteristics of error surfaces for the multilayer perceptron neural network. These characteristics help explain why learning techniques that use hill climbing methods are so slow in these networks. They also provide insight into techniques that may speed learning. Several important characteristics are revealed. First, the surface has a stair-step appearance with many very flat and very steep regions. In fact, when the number of training samples is small, there is often a one-to-one correspondence between individual training samples and the steps on the surface. As the number of training samples increases, the surface becomes smoother. In addition, the surface has flat regions that extend to infinity in all directions, making it dangerous to apply learning algorithms that perform line searches. The magnitude of gradients on the surface is found to span several orders of magnitude, strongly supporting the need for floating-point representations during learning. The consequences of various weight initialization techniques are also discussed.
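The stair-step shape and the wide range of gradient magnitudes described in the abstract can be illustrated with a toy computation. The sketch below is not the paper's experiment: it assumes a single sigmoid unit rather than a multilayer perceptron, four hypothetical one-dimensional training samples, and a fixed large gain, and it scans the sum-of-squared-error along the unit's threshold. All names (XS, TS, GAIN, error, grad) are illustrative assumptions.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

# Hypothetical toy data (an assumption, not from the paper):
# four 1-D samples, two per class.
XS = [-1.5, -0.5, 0.5, 1.5]
TS = [0.0, 0.0, 1.0, 1.0]
GAIN = 50.0  # assumed large weight magnitude, which makes the unit steep

def error(theta):
    # Sum-of-squared-error of one sigmoid unit with threshold theta.
    return sum((sigmoid(GAIN * (x - theta)) - t) ** 2
               for x, t in zip(XS, TS))

def grad(theta):
    # Analytic derivative of error() with respect to theta.
    g = 0.0
    for x, t in zip(XS, TS):
        y = sigmoid(GAIN * (x - theta))
        g += 2.0 * (y - t) * y * (1.0 - y) * (-GAIN)
    return g

# Error sampled midway between training samples: one plateau per gap,
# so each sample the threshold crosses adds or removes one step.
plateaus = [round(error(th), 6) for th in (-2.0, -1.0, 0.0, 1.0, 2.0)]
print(plateaus)  # [2.0, 1.0, 0.0, 1.0, 2.0]

# Gradient magnitude on a plateau versus exactly on a step edge.
flat = abs(grad(-1.0))
steep = abs(grad(-1.5))
print(flat, steep)
```

In this toy setting the error changes by about one unit each time the threshold crosses a training sample, and the gradient on a plateau is many orders of magnitude smaller than the gradient on a step edge. A fixed-point representation with limited dynamic range would struggle to hold both values at once.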
Pages: 1152-1161
Page count: 10