Efficient distance metric learning by adaptive sampling and mini-batch stochastic gradient descent (SGD)

Cited by: 88
Authors
Qian, Qi [1]
Jin, Rong [1]
Yi, Jinfeng [1]
Zhang, Lijun [1]
Zhu, Shenghuo [2]
Affiliations
[1] Michigan State Univ, Dept Comp Sci & Engn, E Lansing, MI 48824 USA
[2] NEC Labs Amer, Cupertino, CA 95014 USA
Funding
National Science Foundation (USA);
Keywords
CLASSIFICATION;
DOI
10.1007/s10994-014-5456-x
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Distance metric learning (DML) is an important task that has found applications in many domains. The high computational cost of DML arises from the large number of variables to be determined and from the constraint that a distance metric must be a positive semi-definite (PSD) matrix. Although stochastic gradient descent (SGD) has been successfully applied to improve the efficiency of DML, it can still be computationally expensive: to ensure that the solution is a PSD matrix, it has to project the updated distance metric onto the PSD cone at every iteration, an expensive operation. We address this challenge by developing two strategies within SGD, namely mini-batch and adaptive sampling, that effectively reduce the number of updates (i.e., projections onto the PSD cone) in SGD. We also develop hybrid approaches that combine the strength of adaptive sampling with that of mini-batch online learning techniques to further improve the computational efficiency of SGD for DML. We prove theoretical guarantees for both the adaptive sampling and the mini-batch based approaches for DML, and we conduct an extensive empirical study to verify the effectiveness of the proposed algorithms.
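As a rough illustration of the mini-batch idea described in the abstract, the sketch below performs one projection onto the PSD cone per mini-batch instead of one per training example. It is not the authors' algorithm: the triplet hinge loss with margin 1, the fixed step size, and the names `project_psd` and `minibatch_sgd_dml` are assumptions made for this example, and adaptive sampling is not shown.

```python
import numpy as np


def project_psd(M):
    """Project a symmetric matrix onto the PSD cone by clipping negative eigenvalues."""
    M = (M + M.T) / 2.0                      # symmetrize against round-off
    eigvals, eigvecs = np.linalg.eigh(M)     # eigendecomposition of a symmetric matrix
    eigvals = np.clip(eigvals, 0.0, None)    # drop the negative part of the spectrum
    return (eigvecs * eigvals) @ eigvecs.T   # V diag(w) V^T


def minibatch_sgd_dml(X, triplets, batch_size=10, lr=0.1, epochs=5, seed=0):
    """Mini-batch SGD for a metric M (illustrative sketch, assumed loss).

    Each triplet (i, j, k) asks that x_i be closer to x_j than to x_k.
    Assumed loss: max(0, 1 + d_M(x_i, x_j) - d_M(x_i, x_k)),
    where d_M(u, v) = (u - v)^T M (u - v).
    Returns the learned metric and the number of PSD projections performed.
    """
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    M = np.eye(d)
    n_projections = 0
    for _ in range(epochs):
        order = rng.permutation(len(triplets))
        for start in range(0, len(order), batch_size):
            batch = triplets[order[start:start + batch_size]]
            grad = np.zeros((d, d))
            for i, j, k in batch:
                a = X[i] - X[j]
                b = X[i] - X[k]
                # Only triplets that violate the margin contribute a gradient.
                if 1.0 + a @ M @ a - b @ M @ b > 0:
                    grad += np.outer(a, a) - np.outer(b, b)
            # One gradient step and ONE projection for the whole mini-batch.
            M = project_psd(M - lr * grad / len(batch))
            n_projections += 1
    return M, n_projections
```

With 40 triplets, a batch size of 10, and 5 epochs, this sketch performs 20 projections, whereas a per-example SGD loop over the same data would perform 200; each projection costs a full eigendecomposition of a d x d matrix.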
Pages: 353-372
Page count: 20