Sparse Kernel-Based Ensemble Learning With Fully Optimized Kernel Parameters for Hyperspectral Classification Problems

Cited by: 25
Authors
Gurram, Prudhvi [1]
Kwon, Heesung [1]
Affiliations
[1] USA, Res Lab, Adelphi, MD 20783 USA
Source
IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING | 2013 / Vol. 51 / No. 02
Keywords
Chemical plume detection; ensemble learning; kernel parameter optimization; sparse kernel learning; support vector machine (SVM); SUPPORT VECTOR MACHINES;
DOI
10.1109/TGRS.2012.2203603
Chinese Library Classification number
P3 [Geophysics]; P59 [Geochemistry];
Discipline classification code
0708 ; 070902 ;
Abstract
Recently, the authors introduced a kernel-based ensemble learning technique for hyperspectral detection/classification problems to provide robust classification of hyperspectral data with a relatively high level of noise and background clutter. The kernel-based ensemble technique first randomly selects spectral feature subspaces from the input data. Each individual classifier, which is a support vector machine (SVM), then independently conducts its own learning within its corresponding spectral feature subspace and hence constitutes a weak classifier. The decisions from these weak classifiers are equally or adaptively combined to generate the final ensemble decision. However, in such ensemble learning, little attempt has previously been made to jointly optimize the weak classifiers and the aggregation process that combines the subdecisions. The main goal of this paper is to achieve an optimal sparse combination of the subdecisions by jointly optimizing the separating hyperplane, obtained by optimally combining the kernel matrices of the SVM classifiers, and the corresponding weights of the subdecisions required for the aggregation process. Sparsity is induced by applying an l1-norm constraint on the weighting coefficients. Consequently, the weights of most of the subclassifiers become zero after the optimization, and only the few subclassifiers with nonzero weights contribute to the final ensemble decision. Moreover, this paper presents an algorithm that determines the optimal full-diagonal bandwidth parameters of the Gaussian kernels of the individual SVMs by minimizing the radius-margin bound. The optimized full-diagonal bandwidth Gaussian kernels are used by the sparse SVM ensemble to perform binary classification. The performance of the proposed technique with optimized kernel parameters is compared, on various data sets, with that of the same technique using a single bandwidth parameter obtained by cross-validation. On average, the proposed sparse kernel-based ensemble learning algorithm with optimized full-diagonal bandwidth parameters shows an improvement of 20% over the existing ensemble learning techniques.
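The ingredients named in the abstract (random spectral subspaces, a Gaussian kernel with one bandwidth per band, and a sparse weighted combination of kernel matrices) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy data, the bandwidth values, and the fixed weights are all hypothetical, and the kernel form exp(-sum_d (x_d - y_d)^2 / (2 sigma_d^2)) is an assumed parameterization of the "full-diagonal bandwidth" kernel. In the paper the weights and bandwidths are optimized; here they are simply supplied by hand to show how zero weights drop subclassifiers from the combination.

```python
import math
import random


def gaussian_kernel_diag(x, y, sigma):
    """Gaussian kernel with one bandwidth per feature (assumed full-diagonal form)."""
    return math.exp(-sum((a - b) ** 2 / (2.0 * s ** 2)
                         for a, b, s in zip(x, y, sigma)))


def random_subspaces(n_features, n_subspaces, subspace_size, seed=0):
    """Draw random band-index subsets, one per weak classifier."""
    rng = random.Random(seed)
    return [sorted(rng.sample(range(n_features), subspace_size))
            for _ in range(n_subspaces)]


def subspace_kernel_matrix(X, bands, sigma):
    """Kernel matrix computed only on the selected bands."""
    Z = [[row[j] for j in bands] for row in X]
    s = [sigma[j] for j in bands]
    return [[gaussian_kernel_diag(a, b, s) for b in Z] for a in Z]


def combine_kernels(kernels, weights):
    """Weighted sum of kernel matrices.

    Weights must be non-negative and sum to 1 (an l1-norm constraint on
    non-negative weights); kernels with zero weight are skipped entirely.
    """
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    n = len(kernels[0])
    K = [[0.0] * n for _ in range(n)]
    for Km, w in zip(kernels, weights):
        if w == 0.0:
            continue  # this subclassifier does not contribute
        for i in range(n):
            for j in range(n):
                K[i][j] += w * Km[i][j]
    return K


# Hypothetical toy data: 3 samples with 4 spectral bands each.
X = [[0.1, 0.9, 0.3, 0.5],
     [0.2, 0.8, 0.4, 0.6],
     [0.9, 0.1, 0.7, 0.2]]
sigma = [1.0, 0.5, 2.0, 1.5]   # one bandwidth per band (illustrative values)
subspaces = random_subspaces(n_features=4, n_subspaces=3, subspace_size=2)
kernels = [subspace_kernel_matrix(X, bands, sigma) for bands in subspaces]
weights = [0.7, 0.0, 0.3]      # sparse weights, fixed by hand for illustration
K = combine_kernels(kernels, weights)
```

The combined matrix `K` could then be passed to any SVM solver that accepts a precomputed kernel. The joint optimization of the weights and the radius-margin minimization of the bandwidths described in the abstract are outside the scope of this sketch.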
Pages: 787-802
Page count: 16