Application of distributed SVM architectures in classifying forest data cover types

Cited by: 24
Authors
Trebar, Mira [1]
Steele, Nigel [2]
Affiliations
[1] Univ Ljubljana, Fac Comp & Informat Sci, Ljubljana 1000, Slovenia
[2] Coventry Univ, Dept Math Sci, Coventry CV1 5FB, W Midlands, England
Keywords
support vector machine; classification; distributed architecture; imbalanced data; training subsets
DOI
10.1016/j.compag.2008.02.001
Chinese Library Classification number
S [Agricultural Sciences]
Discipline classification code
09
Abstract
In many real-world applications, classifying large data sets is difficult because they are often imbalanced, and the small classes are usually the more interesting ones. This study analyzes and evaluates a large data set of forest cover types, a multi-class classification problem with seven imbalanced classes that is used as resource inventory information. The data set was transformed into seven new data sets, and a support vector machine (SVM) was employed to solve binary classification problems on balanced and imbalanced data sets of various sizes. Two approaches were combined and their results presented: distributed SVM architectures, which reduce the complexity of the quadratic optimization problem for very large data sets, and two sampling approaches for the classification of imbalanced data sets. The experimental results show that distributed SVM architectures improve accuracy on larger data sets in comparison with a single SVM classifier and can improve the correct classification of the minority class. © 2008 Elsevier B.V. All rights reserved.
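The data preparation described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the toy class counts, the function names, the choice of random undersampling as the sampling approach, and the number of subsets are all assumptions, since the abstract does not specify them. It shows a seven-class label vector turned into seven one-versus-rest binary problems, one of them balanced by undersampling, and the training indices dealt into disjoint subsets that could each feed a separate SVM.

```python
import random
from collections import Counter


def to_binary_datasets(labels, classes):
    """For each class c, build a +1/-1 label vector (c versus the rest)."""
    return {c: [1 if y == c else -1 for y in labels] for c in classes}


def undersample_majority(indices, binary_labels, rng):
    """Randomly drop majority-class examples until both classes are equal in size.

    Random undersampling is an assumed stand-in for the paper's sampling approaches.
    """
    pos = [i for i in indices if binary_labels[i] == 1]
    neg = [i for i in indices if binary_labels[i] == -1]
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    kept = rng.sample(majority, len(minority))
    return sorted(minority + kept)


def split_into_subsets(indices, n_subsets, rng):
    """Shuffle the indices and deal them into n_subsets disjoint training subsets."""
    shuffled = indices[:]
    rng.shuffle(shuffled)
    return [shuffled[k::n_subsets] for k in range(n_subsets)]


rng = random.Random(0)

# Hypothetical, deliberately imbalanced class counts for seven cover types.
counts = {1: 40, 2: 50, 3: 6, 4: 2, 5: 4, 6: 4, 7: 4}
labels = [c for c, n in counts.items() for _ in range(n)]
all_idx = list(range(len(labels)))

# Seven binary (one-versus-rest) problems derived from the multi-class labels.
binary = to_binary_datasets(labels, sorted(counts))

# Balanced version of the "class 3 versus rest" problem.
balanced = undersample_majority(all_idx, binary[3], rng)

# Disjoint subsets; in a distributed architecture each would train its own SVM.
subsets = split_into_subsets(all_idx, 4, rng)

print(Counter(binary[3][i] for i in balanced))
print([len(s) for s in subsets])
```

Each subset yields a smaller quadratic optimization problem than the full data set, which is the motivation the abstract gives for the distributed architecture; how the individual classifiers are combined is not stated in the abstract and is left out here.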
Pages: 119-130
Number of pages: 12