ON THE COMPLEXITY OF TRAINING NEURAL NETWORKS WITH CONTINUOUS ACTIVATION FUNCTIONS

被引:19
作者
DASGUPTA, B
SIEGELMANN, HT
SONTAG, E
机构
[1] UNIV MINNESOTA,DEPT COMP SCI,MINNEAPOLIS,MN 55455
[2] BAR ILAN UNIV,DEPT COMP SCI,IL-52900 RAMAT GAN,ISRAEL
[3] RUTGERS STATE UNIV,DEPT MATH,NEW BRUNSWICK,NJ 08903
来源
IEEE TRANSACTIONS ON NEURAL NETWORKS | 1995年 / 6卷 / 06期
基金
美国国家科学基金会;
关键词
D O I
10.1109/72.471360
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
We deal with computational issues of loading a fixed-architecture neural network with a set of positive and negative examples. This is the first result on the hardness of loading a simple three-node architecture which does not consist of the binary-threshold neurons, but rather utilizes a particular continuous activation function, commonly used in the neural-network literature. We observe that the loading problem is polynomial-time if the input dimension is constant. Otherwise, however, any possible learning algorithm based on particular fixed architectures faces severe computational barriers. Similar theorems have already been proved by Megiddo and by Blum and Rivest, to the case of binary-threshold networks only, Our theoretical results lend further suggestion to the use of incremental (architecture-changing) techniques for training networks rather than fixed architectures. Furthermore, they imply hardness of learnability in the probably approximately correct sense as well.
引用
收藏
页码:1490 / 1504
页数:15
相关论文
共 27 条
[1]  
BARRON AR, 1991, 4TH P ANN WORKSH COM, P243
[2]   A MULTILAYER NEURAL NETWORK WITH PIECEWISE-LINEAR STRUCTURE AND BACK-PROPAGATION LEARNING [J].
BATRUNI, R .
IEEE TRANSACTIONS ON NEURAL NETWORKS, 1991, 2 (03) :395-403
[3]   What Size Net Gives Valid Generalization? [J].
Baum, Eric B. ;
Haussler, David .
NEURAL COMPUTATION, 1989, 1 (01) :151-160
[4]  
BLUM A, 1992, NEURAL NETWORKS, V1, P117
[5]  
BROWN JR, 1988, 2ND P S FRONT MASS P, P43
[6]  
BRUCK J, 1990, J COMPLEXITY, V9, P129
[7]  
DARKEN C, 1993, 6TH P ANN ACM C COMP, P303
[8]  
DASGUPTA B, 1993, ADV NEURAL INFORMATI, V5, P615
[9]  
Garey M.R., 1979, COMPUTERS INTRACTABI, V174
[10]  
GILL J, 1977, SIAM J COMPUT, V7, P675