Evaluating Color Descriptors for Object and Scene Recognition

Cited by: 917
Authors
van de Sande, Koen E. A. [1]
Gevers, Theo [1]
Snoek, Cees G. M. [1]
Affiliations
[1] Univ Amsterdam, Inst Informat, NL-1098 XG Amsterdam, Netherlands
Keywords
Image/video retrieval; evaluation/methodology; color; invariants; pattern recognition; classification
DOI
10.1109/TPAMI.2009.154
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Image category recognition is important to access visual information on the level of objects and scene types. So far, intensity-based descriptors have been widely used for feature extraction at salient points. To increase illumination invariance and discriminative power, color descriptors have been proposed. Because many different descriptors exist, a structured overview is required of color invariant descriptors in the context of image category recognition. Therefore, this paper studies the invariance properties and the distinctiveness of color descriptors (software to compute the color descriptors from this paper is available from http://www.colordescriptors.com) in a structured way. The analytical invariance properties of color descriptors are explored, using a taxonomy based on invariance properties with respect to photometric transformations, and tested experimentally using a data set with known illumination conditions. In addition, the distinctiveness of color descriptors is assessed experimentally using two benchmarks, one from the image domain and one from the video domain. From the theoretical and experimental results, it can be derived that invariance to light intensity changes and light color changes affects category recognition. The results further reveal that, for light intensity shifts, the usefulness of invariance is category-specific. Overall, when choosing a single descriptor and no prior knowledge about the data set and object and scene categories is available, the OpponentSIFT is recommended. Furthermore, a combined set of color descriptors outperforms intensity-based SIFT and improves category recognition by 8 percent on the PASCAL VOC 2007 and by 7 percent on the Mediamill Challenge.
Pages: 1582-1596
Number of pages: 15