QUERY CONSTRUCTION, ENTROPY, AND GENERALIZATION IN NEURAL-NETWORK MODELS

Cited: 18
Author
SOLLICH, P
Affiliation
[1] Department of Physics, University of Edinburgh, Kings Buildings, Mayfield Road, Edinburgh EH9 3JZ
Source
PHYSICAL REVIEW E | 1994 / Vol. 49 / No. 05
DOI
10.1103/PhysRevE.49.4637
Chinese Library Classification
O35 [Fluid Mechanics]; O53 [Plasma Physics]
Discipline Codes
070204; 080103; 080704
Abstract
We study query construction algorithms, which aim at improving the generalization ability of systems that learn from examples by choosing optimal, nonredundant training sets. We set up a general probabilistic framework for deriving such algorithms from the requirement of optimizing a suitable objective function; specifically, we consider the objective functions entropy (or information gain) and generalization error. For two learning scenarios, the high-low game and the linear perceptron, we evaluate the generalization performance obtained by applying the corresponding query construction algorithms and compare it to training on random examples. We find qualitative differences between the two scenarios due to the different structure of the underlying rules (nonlinear and "noninvertible" versus linear); in particular, for the linear perceptron, random examples lead to the same generalization ability as a sequence of queries in the limit of an infinite number of examples. We also investigate learning algorithms which are ill matched to the learning environment and find that, in this case, minimum-entropy queries can in fact yield a lower generalization ability than random examples. Finally, we study the efficiency of single queries and its dependence on the learning history, i.e., on whether the previous training examples were generated randomly or by querying, and the difference between globally and locally optimal query construction.
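To make the entropy-based query construction concrete, the following is a minimal Python sketch, not taken from the paper, for a Bayesian "linear perceptron" with a Gaussian weight prior and Gaussian output noise. Under these assumptions the posterior over weights is Gaussian with precision matrix A, querying an input x lowers the posterior entropy by (1/2) ln(1 + x^T A^{-1} x / sigma^2), and a minimum-entropy query therefore maximizes x^T A^{-1} x over inputs of fixed norm. All names and parameters here (w_true, run, the candidate-pool size) are illustrative assumptions; the paper constructs queries analytically rather than by searching a finite pool.

import numpy as np

rng = np.random.default_rng(0)

# Illustrative setup (assumed, not from the paper): teacher y = w_true . x + noise,
# unit-precision Gaussian weight prior, noise variance sigma2. For a Gaussian
# posterior with precision A, the entropy decrease from adding input x is
# 0.5 * ln(1 + x^T A^{-1} x / sigma2), so the minimum-entropy query maximizes
# x^T A^{-1} x over a candidate pool of inputs constrained to |x|^2 = d.

d, sigma2, n_queries, pool_size = 10, 0.1, 40, 200
w_true = rng.standard_normal(d)

def precision(X):
    # Posterior precision of Bayesian linear regression: A = I + X^T X / sigma2.
    return np.eye(d) + (X.T @ X) / sigma2

def gen_error(X, y):
    # Posterior mean w_hat = A^{-1} X^T y / sigma2; error = |w_hat - w_true|^2 / d.
    w_hat = np.linalg.solve(precision(X), X.T @ y / sigma2)
    return np.sum((w_hat - w_true) ** 2) / d

def run(strategy):
    X, y, errors = np.zeros((0, d)), np.zeros(0), []
    for _ in range(n_queries):
        pool = rng.standard_normal((pool_size, d))
        pool *= np.sqrt(d) / np.linalg.norm(pool, axis=1, keepdims=True)  # |x|^2 = d
        if strategy == "entropy":
            # Pick the candidate with the largest entropy gain x^T A^{-1} x.
            A_inv = np.linalg.inv(precision(X))
            x = pool[np.argmax(np.einsum("ij,jk,ik->i", pool, A_inv, pool))]
        else:  # random examples: take an arbitrary pool member
            x = pool[0]
        y = np.append(y, x @ w_true + np.sqrt(sigma2) * rng.standard_normal())
        X = np.vstack([X, x])
        errors.append(gen_error(X, y))
    return errors

for s in ("entropy", "random"):
    print(s, " ".join(f"{e:.3f}" for e in run(s)[9::10]))

On this toy problem one would expect the entropy-driven queries to reduce the generalization error faster at small training-set sizes, with the gap closing as examples accumulate, consistent with the abstract's statement that for the linear perceptron random examples match queries in the infinite-example limit.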
Pages: 4637-4651
Number of pages: 15