Massive data discrimination via linear support vector machines

被引:126
作者
Bradley, PS
Mangasarian, OL
机构
[1] Microsoft Corp, Res, Redmond, WA 98052 USA
[2] Univ Wisconsin, Dept Comp Sci, Madison, WI 53706 USA
基金
美国国家科学基金会;
关键词
support vector machines; linear programming chunking;
D O I
10.1080/10556780008805771
中图分类号
TP31 [计算机软件];
学科分类号
081202 ; 0835 ;
摘要
A linear support vector machine formulation is used to generate a fast, finitely-terminating linear-programming algorithm for discriminating between two massive sets in n-dimensional space, where the number of points can be orders of magnitude larger than n. The algorithm creates a succession of sufficiently small linear programs that separate I:hunks of the data at a time. The key idea is that a small number of support vectors, corresponding to linear programming constraints with positive dual variables, are carried over between the successive small linear programs, each of which containing a chunk of the data. We prove that this procedure is monotonic and terminates in a finite number of steps at an exact solution that leads to an optimal separating plane for the entire dataset. Numerical results on fully dense publicly available datasets, numbering 20,000 to 1 million points in 32-dimensional space, confirm the theoretical results and demonstrate the ability to handle very large problems.
引用
收藏
页码:1 / 10
页数:10
相关论文
共 22 条
  • [11] Luenberger D.G., 1984, LINEAR NONLINEAR PRO
  • [12] BREAST-CANCER DIAGNOSIS AND PROGNOSIS VIA LINEAR-PROGRAMMING
    MANGASARIAN, OL
    STREET, WN
    WOLBERG, WH
    [J]. OPERATIONS RESEARCH, 1995, 43 (04) : 570 - 577
  • [13] Arbitrary-norm separating plane
    Mangasarian, OL
    [J]. OPERATIONS RESEARCH LETTERS, 1999, 24 (1-2) : 15 - 23
  • [14] LINEAR AND NONLINEAR SEPARATION OF PATTERNS BY LINEAR PROGRAMMING
    MANGASARIAN, OL
    [J]. OPERATIONS RESEARCH, 1965, 13 (03) : 444 - +
  • [15] Mangasarian OL., 1994, NONLINEAR PROGRAMMIN, DOI [DOI 10.1137/1.9781611971255, 10.1137/1.9781611971255]
  • [16] MELLI G, 1997, SYNTHETIC CLASSIFICA
  • [17] MURTAGH BA, 1992, MINOS 5 0 USERS GUID
  • [18] Murty KG., 1983, LINEAR PROGRAMMING
  • [19] Training support vector machines: an application to face detection
    Osuna, E
    Freund, R
    Girosi, F
    [J]. 1997 IEEE COMPUTER SOCIETY CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, PROCEEDINGS, 1997, : 130 - 136
  • [20] An improved training algorithm for support vector machines
    Osuna, E
    Freund, R
    Girosi, F
    [J]. NEURAL NETWORKS FOR SIGNAL PROCESSING VII, 1997, : 276 - 285