3DMV: Joint 3D-Multi-view Prediction for 3D Semantic Scene Segmentation

Cited by: 245
Authors
Dai, Angela [1]
Niessner, Matthias [2]
Affiliations
[1] Stanford Univ, Stanford, CA 94305 USA
[2] Tech Univ Munich, Munich, Germany
Source
COMPUTER VISION - ECCV 2018, PT X | 2018 / Vol. 11214
DOI
10.1007/978-3-030-01249-6_28
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We present 3DMV, a novel method for 3D semantic scene segmentation of RGB-D scans in indoor environments using a joint 3D-multi-view prediction network. In contrast to existing methods that either use geometry or RGB data as input for this task, we combine both data modalities in a joint, end-to-end network architecture. Rather than simply projecting color data into a volumetric grid and operating solely in 3D - which would result in insufficient detail - we first extract feature maps from associated RGB images. These features are then mapped into the volumetric feature grid of a 3D network using a differentiable backprojection layer. Since our target is 3D scanning scenarios with possibly many frames, we use a multi-view pooling approach in order to handle a varying number of RGB input views. This learned combination of RGB and geometric features with our joint 2D-3D architecture achieves significantly better results than existing baselines. For instance, our final result on the ScanNet 3D segmentation benchmark increases from 52.8% to 75% accuracy compared to existing volumetric architectures.
Pages: 458-474
Number of pages: 17