Detection guided deconvolutional network for hierarchical feature learning

Cited by: 15
Authors
Liu, Jing [1]
Liu, Bingyuan [1]
Lu, Hanqing [1]
Affiliations
[1] CASIA, National Laboratory of Pattern Recognition, Beijing 100190, People's Republic of China
Funding
National Natural Science Foundation of China
Keywords
Image representation; Deep learning; Object recognition; Receptive fields; Algorithm; Models; Level
DOI
10.1016/j.patcog.2015.02.002
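Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification code
140502 [Artificial intelligence]
Abstract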
Deep learning models have gained significant interest as a way of building hierarchical image representations. However, current models still fall far short of the human visual system because they lack selectivity, lack high-level guidance during learning, and learn poorly from few examples. To address these problems, we propose a detection-guided hierarchical learning algorithm for image representation. First, we train a multi-layer deconvolutional network in an unsupervised, bottom-up scheme. During training, we take each raw image as input and decompose it using multiple alternating layers of non-negative convolutional sparse coding and max-pooling. Inspired by the observation that the filters in the top layer can be selectively activated by different high-level image structures, i.e., one filter or a subset of filters should correspond to a particular object class, we then update the filters in the network by minimizing the reconstruction errors of the corresponding feature maps with respect to object detection maps obtained from a set of pre-trained detectors. With the fine-tuned network, we can extract features from given images in a purely unsupervised way, with no need for detectors. We evaluate the proposed feature representation on object recognition, using an SVM classifier with a spatial pyramid matching kernel. Experiments on the PASCAL VOC 2007, Caltech-101 and Caltech-256 datasets demonstrate that our approach outperforms some recent hierarchical feature descriptors as well as classical hand-crafted features. (C) 2015 Elsevier Ltd. All rights reserved.
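The layer structure named in the abstract (non-negative convolutional sparse coding followed by max-pooling) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes circular (FFT-based) convolution, a plain proximal-gradient (ISTA-style) solver, fixed random filters, and simple non-overlapping 2D pooling; the function names, the penalty weight `lam`, and the iteration count are hypothetical choices made for illustration only.

```python
import numpy as np

def reconstruct(z, F):
    """Sum of circular convolutions of feature maps z (K, H, W) with filters,
    where F (K, H, W) holds the 2D FFTs of the zero-padded filters."""
    return np.real(np.fft.ifft2(np.sum(np.fft.fft2(z) * F, axis=0)))

def objective(y, z, F, lam):
    """0.5 * ||reconstruction - y||^2 + lam * sum(z), for z >= 0."""
    r = reconstruct(z, F) - y
    return 0.5 * np.sum(r ** 2) + lam * np.sum(z)

def nonneg_conv_sparse_code(y, filters, lam=0.1, n_iter=50):
    """Infer non-negative sparse feature maps for image y (H, W) given
    filters (K, h, w) by projected proximal-gradient descent (assumed solver)."""
    K = filters.shape[0]
    H, W = y.shape
    F = np.fft.fft2(filters, s=(H, W))          # zero-pad filters to image size
    # Lipschitz constant of the data-term gradient under circular convolution
    L = np.max(np.sum(np.abs(F) ** 2, axis=0))
    z = np.zeros((K, H, W))
    for _ in range(n_iter):
        r = reconstruct(z, F) - y               # reconstruction residual
        # Adjoint of circular convolution: multiply by conj(F) in Fourier space
        grad = np.real(np.fft.ifft2(np.conj(F) * np.fft.fft2(r)))
        # Gradient step, L1 shrinkage and non-negativity in one projection
        z = np.maximum(0.0, z - (grad + lam) / L)
    return z, F

def max_pool(z, size=2):
    """Non-overlapping size x size max-pooling over each feature map."""
    K, H, W = z.shape
    H2, W2 = H // size, W // size
    z = z[:, :H2 * size, :W2 * size]
    return z.reshape(K, H2, size, W2, size).max(axis=(2, 4))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image = rng.random((16, 16))                # stand-in for a raw image
    filters = rng.standard_normal((4, 5, 5))    # stand-in for learned filters
    maps, _ = nonneg_conv_sparse_code(image, filters, lam=0.05, n_iter=30)
    pooled = max_pool(maps)                     # input to the next layer
    print(maps.shape, pooled.shape)
```

In a multi-layer network of this kind, the pooled maps would serve as the input to the next sparse-coding layer. Filter learning and the detection-guided fine-tuning step described in the abstract are deliberately left out of this sketch.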
Pages: 2645-2655
Page count: 11