Efficient noise-tolerant learning from statistical queries

Cited by: 360
Author
Kearns, M [1]
Affiliation
[1] AT&T Bell Labs, Res, Florham Pk, NJ 07932 USA
Keywords
computational learning theory; machine learning
DOI
10.1145/293347.293351
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
In this paper, we study the problem of learning in the presence of classification noise in the probabilistic learning model of Valiant and its variants. In order to identify the class of "robust" learning algorithms in the most general way, we formalize a new but related model of learning from statistical queries. Intuitively, in this model, a learning algorithm is forbidden to examine individual examples of the unknown target function, but is given access to an oracle providing estimates of probabilities over the sample space of random examples. One of our main results shows that any class of functions learnable from statistical queries is in fact learnable with classification noise in Valiant's model, with a noise rate approaching the information-theoretic barrier of 1/2. We then demonstrate the generality of the statistical query model, showing that practically every class learnable in Valiant's model and its variants can also be learned in the new model (and thus can be learned in the presence of noise). A notable exception to this statement is the class of parity functions, which we prove is not learnable from statistical queries, and for which no noise-tolerant algorithm is known.
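The abstract's central idea, that a statistical estimate can be recovered from noisy labels, can be illustrated with a minimal sketch. It is not the construction from the paper; it covers only the simple special case of a correlation query E[g(x) f(x)] with labels in {-1, +1}, where flipping each label independently with probability eta scales the expectation by (1 - 2*eta). The uniform input distribution, the target conjunction, the known noise rate `eta`, and all function names here are assumptions made for illustration.

```python
import random

def to_sign(bit):
    # Map a boolean/0-1 value to a label in {-1, +1}.
    return 1 if bit else -1

def noisy_example(target, n, eta, rng):
    # Draw x uniformly from {0,1}^n (assumed distribution) and
    # flip the true label independently with probability eta.
    x = tuple(rng.choice((0, 1)) for _ in range(n))
    label = target(x)
    if rng.random() < eta:
        label = -label
    return x, label

def estimate_correlation(g, target, n, eta, m, rng):
    # Empirical mean of g(x) * noisy_label over m examples.
    # Since E[g(x) * noisy_label] = (1 - 2*eta) * E[g(x) * f(x)],
    # dividing by (1 - 2*eta) undoes the bias (eta assumed known, eta < 1/2).
    total = 0.0
    for _ in range(m):
        x, label = noisy_example(target, n, eta, rng)
        total += g(x) * label
    return (total / m) / (1.0 - 2.0 * eta)

rng = random.Random(0)
target = lambda x: to_sign(x[0] and x[1])  # hypothetical target: x0 AND x1
g = lambda x: to_sign(x[0])                # query function: sign of x0
estimate = estimate_correlation(g, target, n=5, eta=0.2, m=50000, rng=rng)
print(estimate)  # should be close to the noise-free correlation 0.5
```

As eta approaches 1/2 the correction factor 1 / (1 - 2*eta) grows without bound, so more examples are needed for the same accuracy, which is consistent with 1/2 being the information-theoretic barrier mentioned in the abstract.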
Pages: 983-1006
Page count: 24