Learning Depth with Convolutional Spatial Propagation Network

Citations: 202
Authors
Cheng, Xinjing [1]
Wang, Peng [1]
Yang, Ruigang [1]
Affiliations
[1] Baidu Inc, Baidu Res, Beijing 100085, Peoples R China
Keywords
Estimation; Task analysis; Three-dimensional displays; Cameras; Laser radar; Convolutional codes; Benchmark testing; Spatial propagation networks; depth completion; stereo matching; spatial pyramid pooling; STEREO; GEOMETRY;
DOI
10.1109/TPAMI.2019.2947374
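Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
140502 [Artificial intelligence];
Abstract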
In this paper, we propose the convolutional spatial propagation network (CSPN) and demonstrate its effectiveness for various depth estimation tasks. CSPN is a simple and efficient linear propagation model in which propagation is performed by recurrent convolutional operations, and the affinity among neighboring pixels is learned through a deep convolutional neural network (CNN). Compared to the previous state-of-the-art (SOTA) linear propagation model, i.e., spatial propagation networks (SPN), CSPN is 2 to 5x faster in practice. We concatenate CSPN and its variants to SOTA depth estimation networks, which significantly improves depth accuracy. Specifically, we apply CSPN to two depth estimation problems, depth completion and stereo matching, for which we design modules that adapt the original 2D CSPN to embed sparse depth samples during propagation, operate with 3D convolution, and work synergistically with spatial pyramid pooling. In our experiments, we show that all these modules contribute to the final performance. For the task of depth completion, our method reduces the depth error by over 30 percent on the NYU v2 and KITTI datasets. For the task of stereo matching, our method currently ranks 1st on both the KITTI Stereo 2012 and 2015 benchmarks.
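The propagation idea in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the 3x3 neighborhood, the normalization of the affinities (neighbor weights scaled so their absolute values sum to one, with the center weight set so all weights sum to one), the edge padding, and the names `propagate`, `cspn_step`, `sparse`, and `mask` are all assumptions made for illustration. In CSPN the affinities come from a CNN; here they are simply passed in as an array.

```python
import numpy as np

# Offsets of the 8 neighbors in a 3x3 window (an assumed kernel size).
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]


def cspn_step(h, w, w_center):
    """One linear propagation step: each pixel becomes a weighted sum of
    itself and its 8 neighbors, using per-pixel weights."""
    rows, cols = h.shape
    padded = np.pad(h, 1, mode="edge")  # replicate borders (assumption)
    out = w_center * h
    for k, (dy, dx) in enumerate(OFFSETS):
        out += w[k] * padded[1 + dy:1 + dy + rows, 1 + dx:1 + dx + cols]
    return out


def propagate(depth, affinity, sparse=None, mask=None, steps=8):
    """Recurrently refine a coarse depth map of shape (H, W).

    affinity: raw per-pixel neighbor affinities, shape (8, H, W); in CSPN
              these would be predicted by a CNN (not modeled here).
    sparse, mask: optional sparse depth samples and a boolean validity mask;
              valid samples are re-imposed after every step (assumed scheme).
    """
    # Assumed normalization: neighbor weights have unit absolute sum, and
    # the center weight makes all nine weights sum to exactly one.
    w = affinity / (np.abs(affinity).sum(axis=0, keepdims=True) + 1e-8)
    w_center = 1.0 - w.sum(axis=0)

    h = depth.astype(float)
    for _ in range(steps):
        h = cspn_step(h, w, w_center)
        if sparse is not None and mask is not None:
            h = np.where(mask, sparse, h)  # keep measured depths fixed
    return h
```

Because the nine weights at every pixel sum to one under this normalization, a constant depth map is left unchanged by propagation, which is a quick sanity check for the sketch.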
Pages: 2361-2379
Page count: 19