HOW TIGHT ARE THE VAPNIK-CHERVONENKIS BOUNDS

被引:45
作者
COHN, D
TESAURO, G
机构
[1] UNIV WASHINGTON,DEPT COMP SCI & ENGN,SEATTLE,WA 98195
[2] IBM CORP,THOMAS J WATSON RES CTR,YORKTOWN HTS,NY 10598
关键词
D O I
10.1162/neco.1992.4.2.249
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
We describe a series of numerical experiments that measure the average generalization capability of neural networks trained on a variety of simple functions. These experiments are designed to test the relationship between average generalization performance and the worstcase bounds obtained from formal learning theory using the Vapnik-Chervonenkis (VC) dimension (Blumer et al. 1989; Haussler et al. 1990). Recent statistical learning theories (Tishby et al. 1989; Schwartz et al. 1990) suggest that surpassing these bounds might be possible if the spectrum of possible generalizations has a "gap" near perfect performance. We indeed find that, in some cases, the average generalization is significantly better than the VC bound: the approach to perfect performance is exponential in the number of examples m, rather than the 1/m result of the bound. However, in these cases, we have not found evidence of the gap predicted by the above statistical theories. In other cases, we do find the 1/m behavior of the VC bound, and in these cases, the numerical prefactor is closely related to the prefactor contained in the bound.
引用
收藏
页码:249 / 269
页数:21
相关论文
共 15 条
[1]  
AHMAD S, 1988, UILUENG881759 U ILL
[2]  
AHMAD S, 1988, 1988 P CONN MOD SUMM, P3
[3]  
Baum E. B., 1990, Neural Networks. EURASIP Workshop 1990 Proceedings, P2
[4]   What Size Net Gives Valid Generalization? [J].
Baum, Eric B. ;
Haussler, David .
NEURAL COMPUTATION, 1989, 1 (01) :151-160
[5]   LEARNABILITY AND THE VAPNIK-CHERVONENKIS DIMENSION [J].
BLUMER, A ;
EHRENFEUCHT, A ;
HAUSSLER, D ;
WARMUTH, MK .
JOURNAL OF THE ACM, 1989, 36 (04) :929-965
[6]  
EHRENFEUCHT A, 1988, 1988 P WORKSH COMP L
[7]  
HAUSSLER D, 1990, UCSCCRL9054 U CAL SA
[8]  
HAUSSLER D, 1991, 4TH P ANN WORKSH COM
[9]  
Ruck D W, 1990, IEEE Trans Neural Netw, V1, P296, DOI 10.1109/72.80266
[10]  
RUMELHART DE, 1986, PARALLEL DISTRIBUTED, V1, P381