Location Discriminative Vocabulary Coding for Mobile Landmark Search

Cited by: 115
Authors
Ji, Rongrong [1 ,2 ]
Duan, Ling-Yu [1 ]
Chen, Jie [1 ]
Yao, Hongxun [2 ]
Yuan, Junsong [3 ]
Rui, Yong [4 ]
Gao, Wen [1 ]
Affiliations
[1] Peking Univ, Inst Digital Media, Beijing 100871, Peoples R China
[2] Harbin Inst Technol, Visual Intelligence Lab, Harbin 150006, Peoples R China
[3] Nanyang Technol Univ, Sch Elect & Elect Engn, Singapore, Singapore
[4] Microsoft China Res & Dev Grp, Beijing, Peoples R China
Keywords
Mobile landmark search; Compact visual descriptor; Vocabulary compression; Two-way coding; Descriptor adaptation; System applications; IMAGE; COMPRESSION;
DOI
10.1007/s11263-011-0472-9
Chinese Library Classification (CLC)
TP18 [Theory of Artificial Intelligence];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
With the popularization of mobile devices, recent years have witnessed an emerging potential for mobile landmark search. In this scenario, the user experience heavily depends on the efficiency of query transmission over a wireless link. As sending a query photo is time-consuming, recent works have proposed to extract compact visual descriptors directly on the mobile end toward low-bit-rate transmission. Typically, these descriptors are extracted based solely on the visual content of a query, and the location cues from the mobile end are rarely exploited. In this paper, we present a Location Discriminative Vocabulary Coding (LDVC) scheme, which achieves extremely low-bit-rate query transmission, discriminative landmark description, and scalable descriptor delivery in a unified framework. Our first contribution is a compact and location discriminative visual landmark descriptor, which is learnt offline in two steps: First, we adopt spectral clustering to segment a city map into distinct geographical regions, where visual and geographical similarities are fused to optimize the partition of city-scale geo-tagged photos. Second, we propose to learn LDVC in each region with two schemes: (1) a Ranking Sensitive PCA and (2) a Ranking Sensitive Vocabulary Boosting. Both schemes embed location cues to learn a compact descriptor that replaces the original high-dimensional signature while minimizing the retrieval ranking loss. Our second contribution is a location-aware online vocabulary adaptation: we store a single vocabulary on the mobile end, which is efficiently adapted to region-specific LDVC coding once a mobile device enters a given region. The learnt LDVC landmark descriptor is extremely compact (typically 10-50 bits with arithmetic coding) and outperforms state-of-the-art descriptors. We implemented the framework in a real-world mobile landmark search prototype, which is validated on a million-scale landmark database covering typical areas, e.g., Beijing, New York City, Lhasa, Singapore, and Florence.
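To make the offline region-partition step concrete, below is a minimal Python sketch of spectral clustering over a fused visual-plus-geographical affinity of geo-tagged photos, in the spirit of the first step described above. The specific parameter choices (RBF bandwidths, the fusion weight alpha, and the number of regions) are illustrative assumptions, not values taken from the paper.

```python
# Sketch of the offline region partition: spectral clustering of geo-tagged photos
# using an affinity that fuses visual and geographical similarity.
# Bandwidths, fusion weight, and region count are illustrative assumptions.
import numpy as np
from sklearn.cluster import SpectralClustering
from sklearn.metrics.pairwise import rbf_kernel

def partition_city(visual_feats, geo_coords, n_regions=20, alpha=0.5,
                   gamma_vis=0.1, gamma_geo=10.0):
    """Cluster geo-tagged photos into geographical regions.

    visual_feats: (N, D) array of per-photo visual signatures (e.g. BoW histograms).
    geo_coords:   (N, 2) array of (latitude, longitude) per photo.
    alpha:        weight balancing visual vs. geographical similarity.
    """
    # Pairwise similarities from RBF kernels on each modality.
    sim_vis = rbf_kernel(visual_feats, gamma=gamma_vis)
    sim_geo = rbf_kernel(geo_coords, gamma=gamma_geo)

    # Fuse the two similarities into a single affinity matrix.
    affinity = alpha * sim_vis + (1.0 - alpha) * sim_geo

    # Spectral clustering on the precomputed affinity yields region labels.
    sc = SpectralClustering(n_clusters=n_regions, affinity='precomputed',
                            assign_labels='kmeans', random_state=0)
    return sc.fit_predict(affinity)
```

Each resulting region would then serve as the unit in which a region-specific compact vocabulary (LDVC) is learnt and later adapted online when a device enters that region.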
Pages: 290-314
Number of pages: 25