IntentSearch: Capturing User Intention for One-Click Internet Image Search

Cited by: 47

Authors:

Tang, Xiaoou [1]

Liu, Ke [1]

Cui, Jingyu [2]

Wen, Fang [3]

Wang, Xiaogang [4]

Affiliations:

[1] Chinese Univ Hong Kong, Dept Informat Engn, Shatin, Hong Kong, Peoples R China

[2] Stanford Univ, Dept Elect Engn, Stanford, CA 94305 USA

[3] Microsoft Res Asia, Beijing 100080, Peoples R China

[4] Chinese Univ Hong Kong, Dept Elect Engn, Shatin, Hong Kong, Peoples R China

Source:

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE | 2012, Vol. 34, No. 7

Keywords:

Image search; intention; image reranking; adaptive similarity; keyword expansion; RELEVANCE FEEDBACK; RETRIEVAL;

DOI:

10.1109/TPAMI.2011.242

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Web-scale image search engines (e.g., Google image search, Bing image search) mostly rely on surrounding text features. It is difficult for them to interpret users' search intention only by query keywords and this leads to ambiguous and noisy search results which are far from satisfactory. It is important to use visual information in order to solve the ambiguity in text-based image retrieval. In this paper, we propose a novel Internet image search approach. It only requires the user to click on one query image with minimum effort and images from a pool retrieved by text-based search are reranked based on both visual and textual content. Our key contribution is to capture the users' search intention from this one-click query image in four steps. 1) The query image is categorized into one of the predefined adaptive weight categories which reflect users' search intention at a coarse level. Inside each category, a specific weight schema is used to combine visual features adaptive to this kind of image to better rerank the text-based search result. 2) Based on the visual content of the query image selected by the user and through image clustering, query keywords are expanded to capture user intention. 3) Expanded keywords are used to enlarge the image pool to contain more relevant images. 4) Expanded keywords are also used to expand the query image to multiple positive visual examples from which new query specific visual and textual similarity metrics are learned to further improve content-based image reranking. All these steps are automatic, without extra effort from the user. This is critically important for any commercial web-based image search engine, where the user interface has to be extremely simple. Besides this key contribution, a set of visual features which are both effective and efficient in Internet image search are designed. Experimental evaluation shows that our approach significantly improves the precision of top-ranked images and also the user experience.
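Step 1 of the abstract (a category-specific weight schema that combines visual features to rerank the text-based pool) can be illustrated with a minimal sketch. This is not the authors' implementation: the category names, feature names, weight values, and the use of an L1 distance are all assumptions made for illustration, and the record does not state how the paper defines any of them.

```python
from typing import Dict, List

Features = Dict[str, List[float]]

# Hypothetical weight schemas, one per intention category.
# Names and values are illustrative assumptions, not taken from the paper.
CATEGORY_WEIGHTS: Dict[str, Dict[str, float]] = {
    "portrait": {"face": 0.6, "color": 0.2, "texture": 0.2},
    "scenery": {"face": 0.0, "color": 0.7, "texture": 0.3},
}


def combined_distance(query: Features, candidate: Features,
                      weights: Dict[str, float]) -> float:
    """Weighted sum of per-feature L1 distances (L1 is an assumed choice)."""
    total = 0.0
    for name, weight in weights.items():
        dist = sum(abs(a - b) for a, b in zip(query[name], candidate[name]))
        total += weight * dist
    return total


def rerank(query: Features, category: str, pool: List[dict]) -> List[dict]:
    """Sort the text-retrieved pool by distance to the clicked query image,
    using the weight schema of the query image's category."""
    weights = CATEGORY_WEIGHTS[category]
    return sorted(
        pool,
        key=lambda item: combined_distance(query, item["features"], weights),
    )


# Toy data: image "a" matches the query on face, image "b" on color.
query = {"face": [0.0], "color": [1.0, 0.0], "texture": [0.5]}
pool = [
    {"id": "a", "features": {"face": [0.0], "color": [0.0, 1.0], "texture": [0.5]}},
    {"id": "b", "features": {"face": [1.0], "color": [1.0, 0.0], "texture": [0.5]}},
]

portrait_order = [item["id"] for item in rerank(query, "portrait", pool)]
scenery_order = [item["id"] for item in rerank(query, "scenery", pool)]
```

In this toy setup the same pool is ordered differently depending on the category assigned to the query image, which is the effect the abstract attributes to the adaptive weight schema.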

Citation

Pages: 1342-1353

Page count: 12
