Strong minimax lower bounds for learning

Cited: 21
Authors
Antos, A
Lugosi, G
Affiliations
[1] Tech Univ Budapest, Fac Elect Engn, Dept Math & Comp Sci, H-1521 Budapest, Hungary
[2] Pompeu Fabra Univ, Dept Econ, Barcelona 08005, Spain
Keywords
Vapnik-Chervonenkis dimension; PAC learning; lower bounds;
DOI
10.1023/A:1007454427662
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104; 0812; 0835; 1405;
Abstract
Minimax lower bounds for concept learning state, for example, that for each sample size n and learning rule g(n), there exists a distribution of the observation X and a concept C to be learnt such that the expected error of g(n) is at least a constant times V/n, where V is the VC dimension of the concept class. However, these bounds say nothing about the rate of decrease of the error for a fixed distribution-concept pair. In this paper we investigate minimax lower bounds in such a (stronger) sense. We show that for several natural k-parameter concept classes, including the class of linear halfspaces, the class of balls, the class of polyhedra with a certain number of faces, and a class of neural networks, for any sequence of learning rules {g(n)} there exists a fixed distribution of X and a fixed concept C such that the expected error is larger than a constant times k/n for infinitely many n. We also obtain such strong minimax lower bounds for the tail distribution of the probability of error, which extend the corresponding minimax lower bounds.
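The quantity V in the V/n bound above is the VC dimension of the concept class: the size of the largest point set on which the class realizes every possible labeling ("shatters" it). As a small illustration of this notion (not part of the paper itself), the following hedged sketch brute-forces the shattering check for the class of closed intervals on the real line, whose VC dimension is 2; the function name `shatters` is a hypothetical helper introduced here for clarity.

```python
from itertools import combinations

def shatters(points):
    """Brute-force check: do closed intervals [a, b] on the line shatter
    `points`, i.e. realize every subset as {x : a <= x <= b}?"""
    pts = sorted(set(points))
    for r in range(1, len(pts) + 1):
        for subset in combinations(pts, r):
            # The tightest candidate interval for this subset:
            a, b = min(subset), max(subset)
            realized = {x for x in pts if a <= x <= b}
            if realized != set(subset):
                return False  # some other point is trapped inside [a, b]
    # The empty subset is always realizable by an interval missing all points.
    return True

# Any 2 points are shattered by intervals, but no 3 points are,
# so the VC dimension of intervals is 2.
print(shatters([0.0, 1.0]))       # two points: shattered
print(shatters([0.0, 1.0, 2.0]))  # three points: {0, 2} is not realizable
```

For richer classes the same idea applies with a larger hypothesis space; e.g. halfspaces in R^d have VC dimension d + 1, which is the k that appears in the paper's k/n bound for that class.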
Pages: 31-56
Number of pages: 26