Total recall: Automatic query expansion with a generative feature model for object retrieval

Cited by: 1509

Authors:

Chum, Ondrej [1]

Philbin, James [1]

Sivic, Josef [1]

Isard, Michael [2]

Zisserman, Andrew [1]

Affiliations:

[1] Univ Oxford, Dept Engn Sci, Visual Geometry Grp, Oxford OX1 2JD, England

[2] Microsoft Res, Mountain View, CA USA

Source:

2007 IEEE 11TH INTERNATIONAL CONFERENCE ON COMPUTER VISION, VOLS 1-6 | 2007

Funding:

Engineering and Physical Sciences Research Council (EPSRC), UK

Keywords:

DOI:

10.1109/cvpr.2007.383172

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Given a query image of an object, our objective is to retrieve all instances of that object in a large (1M+) image database. We adopt the bag-of-visual-words architecture which has proven successful in achieving high precision at low recall. Unfortunately, feature detection and quantization are noisy processes and this can result in variation in the particular visual words that appear in different images of the same object, leading to missed results. In the text retrieval literature a standard method for improving performance is query expansion. A number of the highly ranked documents from the original query are reissued as a new query. In this way, additional relevant terms can be added to the query. This is a form of blind relevance feedback and it can fail if 'outlier' (false positive) documents are included in the reissued query. In this paper we bring query expansion into the visual domain via two novel contributions. Firstly, strong spatial constraints between the query image and each result allow us to accurately verify each return, suppressing the false positives which typically ruin text-based query expansion. Secondly, the verified images can be used to learn a latent feature model to enable the controlled construction of expanded queries. We illustrate these ideas on the 5000 annotated image Oxford building database together with more than 1M Flickr images. We show that the precision is substantially boosted, achieving total recall in many cases.
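The query expansion loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes images are sparse bag-of-visual-words histograms, ranks them by cosine similarity, and replaces both the spatial verification step and the latent feature model with simple stand-ins (a caller-supplied `is_verified` predicate and a plain average of the verified histograms). All function names, image ids, and word counts below are hypothetical.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse bag-of-visual-words histograms."""
    dot = sum(weight * b.get(word, 0.0) for word, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank(query, database):
    """Return image ids sorted by decreasing similarity to the query."""
    return sorted(database, key=lambda image_id: cosine(query, database[image_id]),
                  reverse=True)


def expand_query(query, database, is_verified, top_k=3):
    """Average the query with the verified images among the top_k results.

    is_verified is an assumed stand-in for spatial verification: it should
    return True only for results confirmed to show the query object.
    """
    verified = [i for i in rank(query, database)[:top_k] if is_verified(i)]
    total = Counter(query)
    for image_id in verified:
        total.update(database[image_id])  # adds visual word counts
    n = 1 + len(verified)
    return {word: count / n for word, count in total.items()}


# Toy data: visual word id -> count. All ids and counts are made up.
database = {
    "a": {1: 2, 2: 1, 3: 1},  # same object, shares words with the query
    "b": {3: 2, 4: 2},        # same object, but no words in common with the query
    "c": {1: 1, 9: 5},        # different object sharing one word (false positive)
    "d": {7: 1, 8: 1},        # unrelated image
}
query = {1: 1, 2: 1}

first_pass = rank(query, database)
# Pretend that only image "a" passes spatial verification.
expanded = expand_query(query, database, is_verified=lambda image_id: image_id == "a")
second_pass = rank(expanded, database)
```

In this toy setup image "b" scores zero against the original query because it shares no visual words with it, but the expanded query inherits word 3 from the verified image "a", so "b" moves ahead of the false positive "c" on the second pass. Rejecting "c" before averaging is what keeps its unrelated words out of the reissued query.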

Citation

Pages: 496 / +

Page count: 2
