The cascade-correlation learning: A projection pursuit learning perspective

Cited by: 45
Authors
Hwang, JN [1]
You, SS [1]
Lay, SR [1]
Jou, IC [1]
Affiliation
[1] Ministry of Transportation and Communications, Telecommunication Laboratories, Chungli 320, Taiwan
Source
IEEE TRANSACTIONS ON NEURAL NETWORKS | 1996, Vol. 7, No. 2
Funding
National Aeronautics and Space Administration (NASA); National Science Foundation (NSF)
DOI
10.1109/72.485631
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Cascade-correlation (Cascor) is a popular supervised learning architecture that dynamically grows layers of hidden neurons with fixed nonlinear activations (e.g., sigmoids), so that the network topology (size, depth) can be determined efficiently. Like a cascade-correlation learning network (CCLN), a projection pursuit learning network (PPLN) also grows its hidden neurons dynamically. A CCLN requires cascaded connections from the existing hidden units to each new candidate hidden unit to establish the high-order nonlinearity needed to approximate the residual error; a PPLN instead approximates the high-order nonlinearity with trainable parametric or semiparametric smooth nonlinear activations, trained under the minimum mean squared error criterion. An analysis shows that the maximum correlation training criterion used in a CCLN tends to produce hidden units that saturate, which makes it more suitable for classification tasks than for regression tasks, as evidenced in the simulation results. It is also observed that this critical weakness of the CCLN can potentially carry over to classification tasks, such as the two-spiral benchmark used in the original CCLN paper.
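The saturation effect described in the abstract can be illustrated with a small toy experiment. The sketch below is not the paper's analysis or its simulations. It assumes the usual unnormalized covariance score from the original Cascor formulation (the summed magnitude of the covariance between a candidate unit's output and the residual error), a single tanh candidate unit, and a linear residual. The names `cascor_score`, `best_scaled_mse`, and the chosen gains are hypothetical and exist only for this illustration.

```python
import numpy as np


def cascor_score(v, e):
    """Unnormalized covariance score between candidate outputs and residuals.

    v: candidate unit outputs, shape (P,)
    e: residual errors, shape (P,) or (P, O) for O output units
    Returns sum over outputs of |sum over patterns of (v - mean v)(e - mean e)|.
    """
    v = np.asarray(v, dtype=float)
    e = np.asarray(e, dtype=float)
    if e.ndim == 1:
        e = e[:, None]
    v_centered = v - v.mean()
    e_centered = e - e.mean(axis=0)
    return float(np.abs(v_centered @ e_centered).sum())


def best_scaled_mse(v, e):
    """MSE left after fitting e with a * v, using the least-squares scale a."""
    v = np.asarray(v, dtype=float)
    e = np.asarray(e, dtype=float)
    a = (v @ e) / (v @ v)
    return float(np.mean((e - a * v) ** 2))


# Toy regression setting (an assumption of this sketch): the residual is linear in x.
x = np.linspace(-1.0, 1.0, 201)
residual = x.copy()

# A larger gain pushes tanh(gain * x) toward a saturated, step-like response.
gains = [0.5, 2.0, 8.0]
scores = [cascor_score(np.tanh(g * x), residual) for g in gains]
mses = [best_scaled_mse(np.tanh(g * x), residual) for g in gains]

for g, s, m in zip(gains, scores, mses):
    print(f"gain={g:4.1f}  covariance score={s:8.3f}  best-scaled MSE={m:.5f}")
```

In this toy setting the covariance score keeps rising as the gain grows, so maximizing it favors a saturated unit, while the squared error of the best-scaled fit to the linear residual gets worse. A step-like unit is a reasonable building block for a class boundary but a poor one for a smooth regression target, which is consistent with the contrast the abstract draws between the correlation and mean squared error criteria.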
Pages: 278-289
Number of pages: 12