Subset based deep learning for RGB-D object recognition

Cited by: 47
Authors
Bai, Jing [1 ]
Wu, Yan [1 ]
Zhang, Junming [1 ]
Chen, Fuqiang [1 ]
Affiliations
[1] Tongji Univ, Coll Elect & Informat Engn, Shanghai 201804, Peoples R China
Keywords
RGB-D object recognition; Subset based feature extracting; Sparse auto-encoder; Recursive neural networks; Deep learning; AUTOENCODER;
DOI
10.1016/j.neucom.2015.03.017
CLC Number
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
RGB-D cameras can easily record both color and depth images, and previous work has shown that combining the two can dramatically improve RGB-D object recognition accuracy. In this paper, a new method based on a subset approach is introduced to learn higher-level features from the raw data. The raw RGB and depth images are divided into several subsets according to their shapes and colors, which guarantees that any two different objects within a subset are largely dissimilar. For each subset, an RGB-Subset-Sparse auto-encoder is trained to extract features from the RGB images and a Depth-Subset-Sparse auto-encoder is trained to extract features from the depth images. The learned features are then passed to recursive neural networks (RNNs), which reduce their dimensionality and learn robust hierarchical feature representations. The representations learned from the RGB and depth images are concatenated into the final features and fed to a softmax classifier. The proposed method is evaluated on three benchmark RGB-D datasets: the RGB-D dataset of Lai et al., the 2D3D dataset of Browatzki et al., and the Aharon dataset of Aharon et al. Compared with other methods, ours achieves state-of-the-art performance on the first two datasets. Furthermore, to validate the generality of the subset approach, we apply it to several previous works in additional experiments, and their accuracies improve significantly. (C) 2015 Elsevier B.V. All rights reserved.
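
The abstract outlines a feed-forward pipeline: subset-specific sparse auto-encoders per modality, fixed-tree recursive pooling, feature concatenation, and a softmax classifier. The NumPy sketch below illustrates that forward pass only, under stated assumptions: all weights are random stand-ins for trained parameters, the patch sizes and helper names (sae_features, rnn_pool) are illustrative rather than the authors' code, and the subset assignment by shape/color is assumed to have already been made.

    # Illustrative forward pass of the subset-based RGB-D pipeline.
    # Assumptions: trained weights are replaced by random placeholders,
    # the RNN is a fixed tree that merges 4 children per parent (as in
    # Socher-style RNN pooling), and patch counts are powers of 4.
    import numpy as np

    rng = np.random.default_rng(0)

    def sae_features(patch, W, b):
        """Encode one patch with a sparse auto-encoder layer (sigmoid encoder)."""
        return 1.0 / (1.0 + np.exp(-(W @ patch.ravel() + b)))

    def rnn_pool(feature_map, W_rnn):
        """One fixed-tree RNN layer: merge every 4 child vectors into a parent."""
        k, n = feature_map.shape                    # k features, n child nodes
        parents = []
        for i in range(0, n, 4):
            children = feature_map[:, i:i + 4].T.reshape(-1)   # concatenate children
            parents.append(np.tanh(W_rnn @ children))          # parent vector
        return np.stack(parents, axis=1)

    def softmax(z):
        e = np.exp(z - z.max())
        return e / e.sum()

    # Hypothetical sizes: 8x8 patches, 64 hidden units, 16 patches, 51 classes.
    k, patch_dim, n_patches, n_classes = 64, 64, 16, 51

    # One (W, b) pair per subset and per modality; random stand-ins for trained SAEs.
    W_rgb, b_rgb = rng.normal(size=(k, patch_dim)), np.zeros(k)
    W_depth, b_depth = rng.normal(size=(k, patch_dim)), np.zeros(k)
    W_rnn = rng.normal(size=(k, 4 * k))             # random, untrained RNN weights
    W_cls = rng.normal(size=(n_classes, 2 * k))     # softmax classifier weights

    # Fake input: 16 RGB patches and 16 depth patches from one object image.
    rgb_patches = rng.normal(size=(n_patches, 8, 8))
    depth_patches = rng.normal(size=(n_patches, 8, 8))

    # Subset-specific SAE features, one column per patch.
    rgb_map = np.stack([sae_features(p, W_rgb, b_rgb) for p in rgb_patches], axis=1)
    depth_map = np.stack([sae_features(p, W_depth, b_depth) for p in depth_patches], axis=1)

    # Recursive pooling until one vector per modality remains (16 -> 4 -> 1).
    while rgb_map.shape[1] > 1:
        rgb_map, depth_map = rnn_pool(rgb_map, W_rnn), rnn_pool(depth_map, W_rnn)

    # Concatenate the RGB and depth representations and classify with softmax.
    final_feature = np.concatenate([rgb_map[:, 0], depth_map[:, 0]])
    probs = softmax(W_cls @ final_feature)
    print("predicted class:", int(probs.argmax()))

With trained auto-encoder weights, one such encoder pair per subset, and multiple random RNN trees whose outputs are concatenated, this skeleton matches the structure described in the abstract; here it only demonstrates the data flow.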
Pages: 280-292
Number of pages: 13
Related References
23 in total
  • [1] [Anonymous], 2012, P 26 ANN C NEUR PROC, DOI 10.1002/2014GB005021
  • [2] [Anonymous], 2011, P 28 INT C MACHINE L
  • [3] [Anonymous], 2011, Advances in Neural Information Processing Systems
  • [4] Baccouche M., 2005, NETWORKS, V18, P602
  • [5] Bar-Hillel A, 2011, P 2011 IEEE INT C CO, P65
  • [6] Blum M, 2012, IEEE INT CONF ROBOT, P1298, DOI 10.1109/ICRA.2012.6225188
  • [7] Bo L., 2013, EXPT ROBOTICS, P387, DOI 10.1007/978-3-319-00065-7
  • [8] Bo LF, 2011, IEEE INT C INT ROBOT, P821, DOI 10.1109/IROS.2011.6048717
  • [9] Bradley D.M., 2008, Advances Neural Inform. Process. Syst, P113
  • [10] Browatzki B., 2011, 2011 IEEE International Conference on Computer Vision Workshops (ICCV Workshops), P1189, DOI 10.1109/ICCVW.2011.6130385