Non-Parametric Kernel Learning with robust pairwise constraints

被引:15
作者
Chen, Changyou [1 ]
Zhang, Junping [2 ,3 ]
He, Xuefang [4 ]
Zhou, Zhi-Hua [5 ]
机构
[1] Australian Natl Univ, Res Sch Informat Sci & Engn, Canberra, ACT, Australia
[2] Fudan Univ, Shanghai Key Lab Intelligent Informat Proc, Shanghai 200433, Peoples R China
[3] Fudan Univ, Sch Comp Sci, Shanghai 200433, Peoples R China
[4] Beijing Univ Aeronaut & Astronaut, Sch Software & Informat Engn, Beihai, Peoples R China
[5] Nanjing Univ, Natl Key Lab Novel Software Technol, Nanjing 210008, Jiangsu, Peoples R China
关键词
Kernel learning; Semi-definitive programming; Graph embedding; Pairwise constraint; Semi-supervised learning; ALGORITHMS;
D O I
10.1007/s13042-011-0048-6
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
For existing kernel learning based semi-supervised clustering algorithms, it is generally difficult to scale well with large scale datasets and robust pairwise constraints. In this paper, we propose a new Non-Parametric Kernel Learning (NPKL) framework to deal with these problems. We generalize the graph embedding framework into kernel learning, by reforming it as a semi-definitive programming (SDP) problem, smoothing and avoiding over-smoothing the functional Hilbert space with Laplacian regularization. We propose two algorithms to solve this problem. One is a straightforward algorithm using SDP to solve the original kernel learning problem, dented as TRAnsductive Graph Embedding Kernel (TRAGEK) learning; the other is to relax the SDP problem and solve it with a constrained gradient descent algorithm. To accelerate the learning speed, we further divide the data into groups and used the sub-kernels of these groups to approximate the whole kernel matrix. This algorithm is denoted as Efficient Non-PArametric Kernel Learning (ENPAKL). The advantages of the proposed NPKL framework are (1) supervised information in the form of pairwise constraints can be easily incorporated; (2) it is robust to the number of pairwise constraints, i.e., the number of constraints does not affect the running time too much; (3) ENPAKL is efficient to some extent compared to some related kernel learning algorithms since it is a constraint gradient descent based algorithm. Experiments for clustering based on the learned kernels show that the proposed framework scales well with the size of datasets and the number of pairwise constraints. Further experiments for image segmentation indicate the potential advantages of the proposed algorithms over the traditional k-means and N-cut clustering algorithms for image segmentation in term of segmentation accuracy.
引用
收藏
页码:83 / 96
页数:14
相关论文
共 38 条
[1]   Newton's method on Riemannian manifolds and a geometric model for the human spine [J].
Adler, RL ;
Dedieu, JP ;
Margulies, JY ;
Martens, M ;
Shub, M .
IMA JOURNAL OF NUMERICAL ANALYSIS, 2002, 22 (03) :359-390
[2]  
[Anonymous], 2004, ICML
[3]  
[Anonymous], 2005, Proceedings of the 22nd International Conference on Machine Learning
[4]  
[Anonymous], 2006, BOOK REV IEEE T NEUR
[5]  
[Anonymous], 1996, MATRIX COMPUTATION
[6]  
[Anonymous], 2003, Proceedings of the Seventeenth International Joint Conference on Artificial Intelligence
[7]  
[Anonymous], 2004, P 10 ACM SIGKDD INT, DOI DOI 10.1145/1014052.1014062
[8]  
[Anonymous], 2007, Uci machine learning repository
[9]  
[Anonymous], 2006, Advances in Neural Information Processing Systems
[10]  
[Anonymous], 2008, P IEEE COMP SOC C CO