Measuring the Objectness of Image Windows

Cited by: 853
Authors
Alexe, Bogdan [1]
Deselaers, Thomas [2]
Ferrari, Vittorio [3]
Affiliations
[1] ETH, Comp Vis Lab, CH-8092 Zurich, Switzerland
[2] Google Switzerland, CH-8002 Zurich, Switzerland
[3] Univ Edinburgh, IPAB Inst, Edinburgh, Midlothian, Scotland
Keywords
Objectness measure; object detection; object recognition
DOI
10.1109/TPAMI.2012.28
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We present a generic objectness measure, quantifying how likely it is for an image window to contain an object of any class. We explicitly train it to distinguish objects with a well-defined boundary in space, such as cows and telephones, from amorphous background elements, such as grass and road. The measure combines in a Bayesian framework several image cues measuring characteristics of objects, such as appearing different from their surroundings and having a closed boundary. These include an innovative cue to measure the closed boundary characteristic. In experiments on the challenging PASCAL VOC 07 dataset, we show this new cue to outperform a state-of-the-art saliency measure, and the combined objectness measure to perform better than any cue alone. We also compare to interest point operators, a HOG detector, and three recent works aiming at automatic object segmentation. Finally, we present two applications of objectness. In the first, we sample a small number of windows according to their objectness probability and give an algorithm to employ them as location priors for modern class-specific object detectors. As we show experimentally, this greatly reduces the number of windows evaluated by the expensive class-specific model. In the second application, we use objectness as a complementary score in addition to the class-specific model, which leads to fewer false positives. As shown in several recent papers, objectness can act as a valuable focus-of-attention mechanism in many other applications operating on image windows, including weakly supervised learning of object categories, unsupervised pixelwise segmentation, and object tracking in video. Computing objectness is very efficient and takes only about 4 sec. per image.
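The two ideas the abstract describes, combining cues in a Bayesian framework and sampling windows according to their objectness probability, can be illustrated with a minimal sketch. It is not the authors' implementation. The function names, the equal class prior, the conditional-independence (naive Bayes) simplification, and sampling with replacement are all assumptions made for illustration.

```python
import random


def naive_bayes_objectness(lik_obj, lik_bg, prior_obj=0.5):
    """Posterior p(object | cues) for one window.

    lik_obj[i] and lik_bg[i] are the likelihoods of cue i given
    "object" and "background". Treating the cues as conditionally
    independent is an assumption of this sketch.
    """
    p_obj = prior_obj
    p_bg = 1.0 - prior_obj
    for l_obj, l_bg in zip(lik_obj, lik_bg):
        p_obj *= l_obj
        p_bg *= l_bg
    return p_obj / (p_obj + p_bg)


def sample_windows(windows, scores, k, seed=0):
    """Draw k windows with probability proportional to their scores.

    Sampling is with replacement (an assumption), so a high-scoring
    window may be drawn more than once.
    """
    rng = random.Random(seed)
    return rng.choices(windows, weights=scores, k=k)


# Hypothetical windows as (x_min, y_min, x_max, y_max) with made-up likelihoods.
windows = [(0, 0, 50, 50), (10, 10, 200, 180), (5, 60, 40, 90)]
scores = [
    naive_bayes_objectness([0.2, 0.3], [0.8, 0.7]),
    naive_bayes_objectness([0.9, 0.8], [0.1, 0.2]),
    naive_bayes_objectness([0.5, 0.5], [0.5, 0.5]),
]
proposals = sample_windows(windows, scores, k=2)
```

In a detection pipeline, the expensive class-specific model would then be evaluated only on `proposals` rather than on every window in the image.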
Pages: 2189-2202
Page count: 14