Reducing multiclass to binary: A unifying approach for margin classifiers

被引:849
作者
Allwein, EL
Schapire, RE
Singer, Y
机构
[1] SW Res Inst, San Antonio, TX 78228 USA
[2] AT&T Labs Res, Shannon Lab, Florham Pk, NJ 07932 USA
[3] Hebrew Univ Jerusalem, Sch Engn & Comp Sci, IL-91904 Jerusalem, Israel
关键词
D O I
10.1162/15324430152733133
中图分类号
TP [自动化技术、计算机技术];
学科分类号
0812 ;
摘要
We present a unifying framework for studying the solution of multiclass categorization problems by reducing them to multiple binary problems that are then solved using a margin-based binary learning algorithm. The proposed framework unifies some of the most popular approaches in which each class is compared against all others, or in which all pairs of classes are compared to each other, or in which output codes with error-correcting properties are used. We propose a general method for combining the classifiers generated on the binary problems, and we prove a general empirical multiclass loss bound given the empirical loss of the individual binary learning algorithms. The scheme and the corresponding bounds apply to many popular classification learning algorithms including support-vector machines, AdaBoost, regression, logistic regression and decision-tree algorithms. We also give a multiclass generalization error analysis for general output codes with AdaBoost as the binary learner. Experimental results with SVM and AdaBoost show that our scheme provides a viable alternative to the most commonly used multiclass algorithms.
引用
收藏
页码:113 / 141
页数:29
相关论文
共 28 条
  • [1] [Anonymous], 1998, UCI REPOSITORY MACHI
  • [2] The sample complexity of pattern classification with neural networks: The size of the weights is more important than the size of the network
    Bartlett, PL
    [J]. IEEE TRANSACTIONS ON INFORMATION THEORY, 1998, 44 (02) : 525 - 536
  • [3] BREIMAN L, 1997, 504 U CAL STAT DEP
  • [4] Breiman L., 1984, BIOMETRICS, DOI DOI 10.2307/2530946
  • [5] Breiman Leo, 1997, Arcing the edge
  • [6] COLLINS M, 2000, P 13 ANN C COMP LEAR
  • [7] CORTES C, 1995, MACH LEARN, V20, P273, DOI 10.1023/A:1022627411411
  • [8] CRAMMER K, 2000, P 13 ANN C COMP LEAR
  • [9] Csiszar I., 1984, STATISTICS DECISIO S, V1, P205
  • [10] Inducing features of random fields
    DellaPietra, S
    DellaPietra, V
    Lafferty, J
    [J]. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 1997, 19 (04) : 380 - 393