NEURAL NET ALGORITHMS THAT LEARN IN POLYNOMIAL-TIME FROM EXAMPLES AND QUERIES

Cited by: 82
Author
BAUM, EB
Affiliation
[1] NEC Research Institute, Princeton, NJ 08540
Source
IEEE TRANSACTIONS ON NEURAL NETWORKS | 1991, Vol. 2, No. 1
DOI
10.1109/72.80287
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Back propagation and many other neural net algorithms use a database of labeled examples to train neural networks. Here examples are pairs (x, t(x)), where x is an input vector and t(x) is the value of the target function for input x. We propose an algorithm that trains networks using examples and queries. In a query [1] the algorithm supplies a y and is told t(y) by an oracle. Queries appear to be available in practice for most problems of interest, e.g., by appeal to a human expert. Our algorithm is proved to PAC learn, in polynomial time, the class of target functions defined by layered depth-2 threshold nets having n inputs connected to k hidden threshold units connected to one or more output units, provided k ≤ 4. While target functions and input distributions can be described for which the algorithm will fail for larger k, it appears likely to work well "in practice." Tests of a variant of the algorithm have consistently and rapidly learned random nets of this type with n = 200 and k = 200 for examples chosen from a uniform distribution. Alternatively, the algorithm can learn the parity function in n = 200 dimensions to an accuracy of 99.8% in 30 min of SGI workstation CPU time. The algorithm is quite efficient, training a net no larger than the target net, using only O(nk) queries and approximately O(kε⁻¹) examples (where ε is the PAC error parameter), and using time, arguably near optimal, expected to be O(kn³ + nk²ε⁻¹ + k²ε⁻³). For comparison, note that an algorithm training a net of this size using only examples would need Ω(knε⁻¹) examples, and performing even one forward pass on each of those examples would take time Ω(k²n²ε⁻¹). The algorithm can also be proved to learn intersections of k half-spaces in Rⁿ in time polynomial in both n and k. A variant of the algorithm can learn arbitrary-depth layered threshold networks with n inputs and k units in the first hidden layer in time polynomial in the larger of n and k but exponential in the smaller of the two.
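To make the examples-plus-queries setting concrete, the following minimal Python sketch defines the target class the abstract names (a layered depth-2 threshold net) together with the two information sources the learner is given: random labeled examples drawn from an input distribution, and a membership-query oracle that returns t(y) for any y the learner supplies. All sizes, names, and distributions here are illustrative assumptions; the sketch shows the interfaces, not Baum's learning algorithm itself.

import numpy as np

rng = np.random.default_rng(0)

class DepthTwoThresholdNet:
    """Target class from the abstract: n inputs -> k hidden threshold
    units -> one output threshold unit. Weights chosen at random here
    purely for illustration."""

    def __init__(self, n, k):
        self.W = rng.normal(size=(k, n))   # hidden-layer weights
        self.b = rng.normal(size=k)        # hidden-layer thresholds
        self.v = rng.normal(size=k)        # output-unit weights
        self.c = rng.normal()              # output-unit threshold

    def __call__(self, x):
        h = (self.W @ x + self.b > 0).astype(float)  # hidden threshold units
        return float(self.v @ h + self.c > 0)        # output threshold unit

n, k = 20, 4              # the paper's PAC guarantee is proved for k <= 4
target = DepthTwoThresholdNet(n, k)

def draw_example():
    """A labeled example (x, t(x)) from an assumed input distribution D."""
    x = rng.normal(size=n)
    return x, target(x)

def query(y):
    """A membership query: the learner picks y, the oracle returns t(y)."""
    return target(y)

x, label = draw_example()
assert label == query(x)   # the oracle agrees with the example's label

The point of the query oracle is that the learner can place inputs deliberately rather than waiting for informative random examples; the paper's approach exploits this to search out the hidden units' separating hyperplanes, which is what keeps the query count near O(nk).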
Pages: 5-19
Page count: 15