Cascade Residual Learning: A Two-stage Convolutional Neural Network for Stereo Matching

Cited by: 313
Authors
Pang, Jiahao [1]
Sun, Wenxiu [1]
Ren, Jimmy S. J. [1]
Yang, Chengxi [1]
Yan, Qiong [1]
Affiliations
[1] SenseTime Grp Ltd, Hong Kong, Hong Kong, Peoples R China
Source
2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCVW 2017) | 2017
Keywords
DOI
10.1109/ICCVW.2017.108
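Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code
140502 [Artificial intelligence];
Abstract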
Leveraging recent developments in convolutional neural networks (CNNs), matching dense correspondence from a stereo pair has been cast as a learning problem, with performance exceeding traditional approaches. However, it remains challenging to generate high-quality disparities for the inherently ill-posed regions. To tackle this problem, we propose a novel cascade CNN architecture composed of two stages. The first stage advances the recently proposed DispNet by equipping it with extra up-convolution modules, leading to disparity images with more details. The second stage explicitly rectifies the disparity initialized by the first stage; it couples with the first stage and generates residual signals across multiple scales. The summation of the outputs from the two stages gives the final disparity. As opposed to directly learning the disparity at the second stage, we show that residual learning provides more effective refinement. Moreover, it also benefits the training of the overall cascade network. Experiments show that our cascade residual learning scheme provides state-of-the-art performance for matching stereo correspondence. At the time of submission of this paper, our method ranked first in the KITTI 2015 stereo benchmark, surpassing prior works by a noteworthy margin.
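The summation step described in the abstract can be illustrated with a minimal sketch. It assumes the first-stage disparity and the second-stage residual are already available at each scale as same-sized 2D grids; the names `add_residual` and `refine_multiscale` are hypothetical and do not come from the paper, and the networks that would produce these inputs are not modeled.

```python
from typing import List

# Hypothetical representation: a disparity map as rows of per-pixel values.
Disparity = List[List[float]]


def add_residual(initial: Disparity, residual: Disparity) -> Disparity:
    """Return the element-wise sum of a first-stage disparity and a residual."""
    same_rows = len(initial) == len(residual)
    same_cols = all(len(a) == len(b) for a, b in zip(initial, residual))
    if not (same_rows and same_cols):
        raise ValueError("initial disparity and residual must have the same shape")
    return [
        [d + r for d, r in zip(row_d, row_r)]
        for row_d, row_r in zip(initial, residual)
    ]


def refine_multiscale(
    initial_pyramid: List[Disparity], residual_pyramid: List[Disparity]
) -> List[Disparity]:
    """Apply the summation independently at every scale of the pyramid."""
    if len(initial_pyramid) != len(residual_pyramid):
        raise ValueError("both pyramids must have the same number of scales")
    return [add_residual(d, r) for d, r in zip(initial_pyramid, residual_pyramid)]


# Toy values standing in for network outputs (assumed, not from the paper).
initial = [[10.0, 12.0], [8.0, 9.0]]
residual = [[0.5, -1.0], [0.0, 2.0]]
final = add_residual(initial, residual)
```

In this toy case the second stage only has to predict small corrections such as 0.5 or -1.0 rather than full disparity values, which is the intuition the abstract gives for residual learning.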
Pages: 878-886
Page count: 9