Fast, Accurate Detection of 100,000 Object Classes on a Single Machine

Cited by: 140
Authors
Dean, Thomas [1]
Ruzon, Mark A. [1]
Segal, Mark [1]
Shlens, Jonathon [1]
Vijayanarasimhan, Sudheendra [1]
Yagnik, Jay [1]
Affiliations
[1] Google, Mountain View, CA 94043 USA
Source
2013 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) | 2013
DOI
10.1109/CVPR.2013.237
Chinese Library Classification (CLC)
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Many object detection systems are constrained by the time required to convolve a target image with a bank of filters that code for different aspects of an object's appearance, such as the presence of component parts. We exploit locality-sensitive hashing to replace the dot-product kernel operator in the convolution with a fixed number of hash-table probes that effectively sample all of the filter responses in time independent of the size of the filter bank. To show the effectiveness of the technique, we apply it to evaluate 100,000 deformable-part models requiring over a million (part) filters on multiple scales of a target image in less than 20 seconds using a single multi-core processor with 20 GB of RAM. This represents a speed-up of approximately 20,000 times (four orders of magnitude) when compared with performing the convolutions explicitly on the same hardware. While mean average precision over the full set of 100,000 object classes is around 0.16, due in large part to the challenges in gathering training data and collecting ground truth for so many classes, we achieve a mAP of at least 0.20 on a third of the classes and 0.30 or better on about 20% of the classes.
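The core idea in the abstract, replacing per-filter dot products with a fixed number of hash-table probes, can be illustrated with a small sketch. This is not the authors' implementation: it assumes random-hyperplane (sign) hashing as the locality-sensitive hash, and every name and parameter here (`make_hyperplanes`, `band_keys`, `build_tables`, `probe`, the band and bit counts, the toy filter bank) is hypothetical and chosen only for illustration.

```python
import random
from collections import defaultdict

def make_hyperplanes(num_bands, bits_per_band, dim, rng):
    # One group of random hyperplanes per band (assumed hashing scheme).
    return [[[rng.gauss(0.0, 1.0) for _ in range(dim)]
             for _ in range(bits_per_band)]
            for _ in range(num_bands)]

def band_keys(vec, hyperplanes):
    # One integer key per band: the sign bits of vec against that band's planes.
    keys = []
    for band in hyperplanes:
        bits = 0
        for plane in band:
            dot = sum(p * v for p, v in zip(plane, vec))
            bits = (bits << 1) | (1 if dot >= 0.0 else 0)
        keys.append(bits)
    return keys

def build_tables(filters, hyperplanes):
    # Offline step: index every filter once, one hash table per band.
    tables = [defaultdict(list) for _ in hyperplanes]
    for fid, f in enumerate(filters):
        for table, key in zip(tables, band_keys(f, hyperplanes)):
            table[key].append(fid)
    return tables

def probe(window, tables, hyperplanes):
    # Online step: one probe per band, so the probe count is fixed by
    # num_bands and does not depend on how many filters were indexed.
    votes = defaultdict(int)
    for table, key in zip(tables, band_keys(window, hyperplanes)):
        for fid in table.get(key, ()):
            votes[fid] += 1
    return votes

rng = random.Random(0)
dim, num_filters = 32, 1000
filters = [[rng.gauss(0.0, 1.0) for _ in range(dim)]
           for _ in range(num_filters)]
hyperplanes = make_hyperplanes(num_bands=12, bits_per_band=8, dim=dim, rng=rng)
tables = build_tables(filters, hyperplanes)

# A toy image window that closely resembles one stored filter.
target = 123
window = [x + 0.05 * rng.gauss(0.0, 1.0) for x in filters[target]]
votes = probe(window, tables, hyperplanes)
best = max(votes, key=votes.get)
```

In this sketch, filters that share many band keys with the window collect the most votes, and only those candidates would need an exact dot product afterwards. The number of probes per window stays at `num_bands` however large the filter bank grows, which is the property the abstract describes.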
Pages: 1814-1821
Page count: 8