Cutting-plane training of structural SVMs

被引:37
作者
Joachims, Thorsten [1 ]
Finley, Thomas [1 ]
Yu, Chun-Nam John [1 ]
机构
[1] Cornell Univ, Dept Comp Sci, Ithaca, NY 14853 USA
关键词
Structural SVMs; Support vector machines; Structured output prediction; Training algorithms; SUPPORT VECTOR MACHINES;
D O I
10.1007/s10994-009-5108-8
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
Discriminative training approaches like structural SVMs have shown much promise for building highly complex and accurate models in areas like natural language processing, protein structure prediction, and information retrieval. However, current training algorithms are computationally expensive or intractable on large datasets. To overcome this bottleneck, this paper explores how cutting-plane methods can provide fast training not only for classification SVMs, but also for structural SVMs. We show that for an equivalent "1-slack" reformulation of the linear SVM training problem, our cutting-plane method has time complexity linear in the number of training examples. In particular, the number of iterations does not depend on the number of training examples, and it is linear in the desired precision and the regularization parameter. Furthermore, we present an extensive empirical evaluation of the method applied to binary classification, multi-class classification, HMM sequence tagging, and CFG parsing. The experiments show that the cutting-plane algorithm is broadly applicable and fast in practice. On large datasets, it is typically several orders of magnitude faster than conventional training methods derived from decomposition methods like SVM-light, or conventional cutting-plane methods. Implementations of our methods are available at http://www.joachims.org.
引用
收藏
页码:27 / 59
页数:33
相关论文
共 43 条
  • [1] Altun Y., 2003, P INT C MACHINE LEAR, P3
  • [2] Anguelov D, 2005, PROC CVPR IEEE, P169
  • [3] [Anonymous], 2001, Journal of Machine Learning Research
  • [4] [Anonymous], NEW DEV PARSING TECH
  • [5] [Anonymous], 2001, P 18 INT C MACH LEAR, DOI DOI 10.5555/645530.655813
  • [6] [Anonymous], 2000, P 17 INT C MACHINE L
  • [7] [Anonymous], 2007, ICML 07
  • [8] BARTLETT P, 2004, ADV NEURAL INFORM PR, P305
  • [9] Caruana R., 2004, ACM SIGKDD Explorations Newsletter, V6, P95, DOI [DOI 10.1145/1046456.1046470, 10.1145/1046456.1046470]
  • [10] LIBSVM: A Library for Support Vector Machines
    Chang, Chih-Chung
    Lin, Chih-Jen
    [J]. ACM TRANSACTIONS ON INTELLIGENT SYSTEMS AND TECHNOLOGY, 2011, 2 (03)