NEURAL NET ALGORITHMS THAT LEARN IN POLYNOMIAL-TIME FROM EXAMPLES AND QUERIES

Cited by: 82
Author
BAUM, EB
Affiliation
[1] NEC Research Institute, Princeton, NJ 08540
Source
IEEE TRANSACTIONS ON NEURAL NETWORKS | 1991, Vol. 2, No. 1
DOI
10.1109/72.80287
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Back propagation and many other neural net algorithms use a database of labeled examples to train neural networks. Here examples are pairs (x, t(x)), where x is an input vector and t(x) is the value of the target function for input x. We propose an algorithm that trains networks using examples and queries. In a query [1] the algorithm supplies a y and is told t(y) by an oracle. Queries appear to be available in practice for most problems of interest, e.g., by appeal to a human expert. Our algorithm is proved to PAC learn, in polynomial time, the class of target functions defined by layered depth-2 threshold nets having n inputs connected to k hidden threshold units connected to one or more output units, provided k ≤ 4. While target functions and input distributions can be described for which the algorithm will fail for larger k, it appears likely to work well "in practice." Tests of a variant of the algorithm have consistently and rapidly learned random nets of this type with n = 200 and k = 200 for examples chosen from a uniform distribution. Alternatively, the algorithm can learn the parity function in n = 200 dimensions to an accuracy of 99.8% in 30 min of SGI workstation CPU time. The algorithm is quite efficient, training a net no larger than the target net, using only O(nk) queries and approximately O(kε⁻¹) examples (where ε is the PAC error parameter), and using time, arguably near optimal, expected to be O(kn³ + nk²ε⁻¹ + k²ε⁻³). For comparison, note that an algorithm training a net of this size using only examples would need Ω(knε⁻¹) examples, and performing even one forward pass on each of those examples would take time Ω(k²n²ε⁻¹). The algorithm can also be proved to learn intersections of k half-spaces in Rⁿ in time polynomial in both n and k. A variant of the algorithm can learn arbitrary-depth layered threshold networks with n inputs and k units in the first hidden layer in time polynomial in the larger of n and k but exponential in the smaller of the two.
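To make the examples-plus-queries setting concrete, the following minimal Python sketch defines the target class the abstract names (a layered depth-2 threshold net) together with the two information sources the learner is given: random labeled examples drawn from an input distribution, and a membership-query oracle that returns t(y) for any y the learner supplies. All sizes, names, and distributions here are illustrative assumptions; the sketch shows the interfaces, not Baum's learning algorithm itself.

import numpy as np

rng = np.random.default_rng(0)

class DepthTwoThresholdNet:
    """Target class from the abstract: n inputs -> k hidden threshold
    units -> one output threshold unit. Weights chosen at random here
    purely for illustration."""

    def __init__(self, n, k):
        self.W = rng.normal(size=(k, n))   # hidden-layer weights
        self.b = rng.normal(size=k)        # hidden-layer thresholds
        self.v = rng.normal(size=k)        # output-unit weights
        self.c = rng.normal()              # output-unit threshold

    def __call__(self, x):
        h = (self.W @ x + self.b > 0).astype(float)  # hidden threshold units
        return float(self.v @ h + self.c > 0)        # output threshold unit

n, k = 20, 4              # the paper's PAC guarantee is proved for k <= 4
target = DepthTwoThresholdNet(n, k)

def draw_example():
    """A labeled example (x, t(x)) from an assumed input distribution D."""
    x = rng.normal(size=n)
    return x, target(x)

def query(y):
    """A membership query: the learner picks y, the oracle returns t(y)."""
    return target(y)

x, label = draw_example()
assert label == query(x)   # the oracle agrees with the example's label

The point of the query oracle is that the learner can place inputs deliberately rather than waiting for informative random examples; the paper's approach exploits this to search out the hidden units' separating hyperplanes, which is what keeps the query count near O(nk).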
Pages: 5-19
Page count: 15