DECISION THEORETIC GENERALIZATIONS OF THE PAC MODEL FOR NEURAL NET AND OTHER LEARNING APPLICATIONS

Cited by: 470
Authors:
HAUSSLER, D
Institution:
[1] Baskin Center for Computer Engineering and Information Sciences, University of California, Santa Cruz
DOI:
10.1016/0890-5401(92)90010-D
Chinese Library Classification:
TP301 [Theory, Methods];
Subject Classification Code:
081202
Abstract
We describe a generalization of the PAC learning model that is based on statistical decision theory. In this model the learner receives randomly drawn examples, each example consisting of an instance x ∈ X and an outcome y ∈ Y, and tries to find a decision rule h: X → A, where h ∈ H, that specifies the appropriate action a ∈ A to take for each instance x in order to minimize the expectation of a loss l(y, a). Here X, Y, and A are arbitrary sets, l is a real-valued function, and examples are generated according to an arbitrary joint distribution on X × Y. Special cases include the problem of learning a function from X into Y, the problem of learning the conditional probability distribution on Y given X (regression), and the problem of learning a distribution on X (density estimation). We give theorems on the uniform convergence of empirical loss estimates to true expected loss rates for certain decision rule spaces H, and show how this implies learnability with bounded sample size, disregarding computational complexity. As an application, we give distribution-independent upper bounds on the sample size needed for learning with feedforward neural networks. Our theorems use a generalized notion of VC dimension that applies to classes of real-valued functions, adapted from Vapnik and Pollard's work, and a notion of capacity and metric dimension for classes of functions that map into a bounded metric space. © 1992.
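The setup in the abstract — randomly drawn examples (x, y), a class H of decision rules h: X → A, and a loss l(y, a) whose expectation the learner tries to minimize via its empirical average on the sample — can be illustrated with a minimal sketch. This is not the paper's method, just a toy instance under illustrative assumptions: X, Y, and A are real numbers, l is squared error, and H is a small finite class of constant decision rules.

```python
import random

def loss(y, a):
    """Loss l(y, a): squared error between outcome y and action a."""
    return (y - a) ** 2

def empirical_loss(h, sample):
    """Average loss of decision rule h over a sample of (x, y) examples."""
    return sum(loss(y, h(x)) for x, y in sample) / len(sample)

def erm(hypotheses, sample):
    """Pick the rule in H minimizing empirical loss on the sample."""
    return min(hypotheses, key=lambda h: empirical_loss(h, sample))

# Toy hypothesis class H: constant decision rules h_c(x) = c.
H = [lambda x, c=c: c for c in [0.0, 0.5, 1.0]]

# Examples drawn from a joint distribution on X x Y (here y is near 1.0,
# independent of x, so the best constant rule is h(x) = 1.0).
random.seed(0)
sample = [(random.random(), 1.0 + random.gauss(0, 0.1)) for _ in range(50)]

best = erm(H, sample)
print(best(0.0))  # prints 1.0: the constant closest to the outcomes' mean
```

The uniform-convergence theorems in the paper bound how large the sample must be before the empirical loss of every h in H is close to its true expected loss, which is what justifies returning the empirical minimizer.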
Pages: 78-150 (73 pages)