Learning to detect objects in images via a sparse, part-based representation

Cited by: 484
Authors
Agarwal, S [1]
Awan, A [1]
Roth, D [1]
Affiliations
[1] Univ Illinois, Dept Comp Sci, Urbana, IL 61801 USA
Funding
U.S. National Science Foundation
Keywords
object detection; image representation; machine learning; evaluation/methodology;
DOI
10.1109/TPAMI.2004.108
Chinese Library Classification
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
We study the problem of detecting objects in still, gray-scale images. Our primary focus is the development of a learning-based approach to the problem that makes use of a sparse, part-based representation. A vocabulary of distinctive object parts is automatically constructed from a set of sample images of the object class of interest; images are then represented using parts from this vocabulary, together with spatial relations observed among the parts. Based on this representation, a learning algorithm is used to automatically learn to detect instances of the object class in new images. The approach can be applied to any object with distinguishable parts in a relatively fixed spatial configuration; it is evaluated here on difficult sets of real-world images containing side views of cars, and is seen to successfully detect objects in varying conditions amidst background clutter and mild occlusion. In evaluating object detection approaches, several important methodological issues arise that have not been satisfactorily addressed in previous work. A secondary focus of this paper is to highlight these issues and to develop rigorous evaluation standards for the object detection problem. A critical evaluation of our approach under the proposed standards is presented.
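The representation described in the abstract (parts drawn from a vocabulary, plus spatial relations observed among them) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the distance bins, and the four-way direction scheme are all assumptions chosen for illustration, and the part detections are taken as given rather than computed from an image.

```python
from itertools import combinations


def quantize_relation(p, q, dist_bins=(8, 16, 32)):
    """Bucket the distance and rough direction from location p to location q.

    The bin edges and direction labels are hypothetical, not from the paper.
    """
    dx, dy = q[0] - p[0], q[1] - p[1]
    dist = (dx * dx + dy * dy) ** 0.5
    # Index of the distance bucket: number of bin edges the distance exceeds.
    dist_bin = sum(dist > edge for edge in dist_bins)
    if abs(dx) >= abs(dy):
        direction = "right" if dx >= 0 else "left"
    else:
        direction = "below" if dy >= 0 else "above"
    return dist_bin, direction


def encode_window(detections):
    """Turn part detections into a sparse set of active binary features.

    detections: list of (part_id, (x, y)) pairs for one image window.
    Only features that are present are stored, which keeps the encoding sparse.
    """
    features = set()
    for part_id, _ in detections:
        features.add(("part", part_id))
    # One relation feature per unordered pair of detected parts.
    for (a, pa), (b, pb) in combinations(sorted(detections), 2):
        features.add(("rel", a, b) + quantize_relation(pa, pb))
    return features


if __name__ == "__main__":
    window = [(3, (10, 20)), (7, (30, 22))]
    print(sorted(encode_window(window), key=str))
```

A set of active features like this could be fed to any learner that handles sparse binary inputs; the abstract does not say which learning algorithm is used, so none is shown here.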
Pages: 1475-1490
Page count: 16