On the difficulty of approximately maximizing agreements

Cited by: 64
Authors
Ben-David, S
Eiron, N
Long, PM
Institutions
[1] Genome Inst Singapore, Singapore 117604, Singapore
[2] Technion Israel Inst Technol, Dept Comp Sci, IL-32000 Haifa, Israel
[3] IBM Corp, Almaden Res Ctr, San Jose, CA 95120 USA
Keywords
machine learning; computational learning theory; neural networks; inapproximability; hardness; half-spaces; axis-aligned hyper-rectangles; balls; monomials
DOI
10.1016/S0022-0000(03)00038-2
Chinese Library Classification
TP3 [Computing Technology, Computer Technology]
Subject Classification Code
0812
Abstract
We address the computational complexity of learning in the agnostic framework. For a variety of common concept classes we prove that, unless P = NP, there is no polynomial-time approximation scheme for finding a member of the class that approximately maximizes the agreement with a given training sample. In particular, our results apply to the classes of monomials, axis-aligned hyper-rectangles, closed balls, and monotone monomials. For each of these classes, we prove the NP-hardness of approximating maximal agreement to within some fixed constant (independent of the sample size and of the dimensionality of the sample space). For the class of half-spaces, we prove that, for any ε > 0, it is NP-hard to approximately maximize agreements to within a factor of (418/415 - ε), improving on the best previously known constant for this problem, and using a simpler proof. An interesting feature of our proofs is that, for each of the classes we discuss, we find patterns of training examples that, while being hard for approximating agreement within that concept class, allow efficient agreement maximization within other concept classes. These results bring up a new aspect of the model selection problem: they imply that the choice of hypothesis class for agnostic learning, from among those considered in this paper, can drastically affect the computational complexity of the learning process. (C) 2003 Elsevier Science (USA). All rights reserved.
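As a concrete illustration of the objective these hardness results concern, here is a minimal sketch (in Python, with a hypothetical toy sample chosen only for illustration) of scoring a single half-space hypothesis against labeled data. Evaluating a fixed hypothesis takes linear time; the hardness lies entirely in searching for a hypothesis that (approximately) maximizes this score.

import numpy as np

def agreement_rate(w, b, X, y):
    """Fraction of the labeled sample on which the half-space
    sign(w . x + b) agrees with the labels y in {-1, +1}."""
    preds = np.sign(X @ w + b)
    preds[preds == 0] = 1  # tie-break points lying exactly on the hyperplane
    return float(np.mean(preds == y))

# Hypothetical toy sample in R^2; in the agnostic setting the labels
# need not be consistent with any half-space.
X = np.array([[1.0, 2.0], [2.0, 1.0], [-1.0, -1.0], [0.5, -2.0]])
y = np.array([1, 1, -1, 1])

# Scoring one candidate is easy; the paper shows that finding a
# (418/415 - eps)-approximate maximizer over all half-spaces is NP-hard.
print(agreement_rate(np.array([1.0, 1.0]), 0.0, X, y))  # -> 0.75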
Pages: 496-514
Number of pages: 19