3DMV: Joint 3D-Multi-view Prediction for 3D Semantic Scene Segmentation

Cited by: 245

Authors:

Dai, Angela [1]

Niessner, Matthias [2]

Affiliations:

[1] Stanford Univ, Stanford, CA 94305 USA

[2] Tech Univ Munich, Munich, Germany

Source:

COMPUTER VISION - ECCV 2018, PT X | 2018 / Vol. 11214

Keywords:

DOI:

10.1007/978-3-030-01249-6_28

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

We present 3DMV, a novel method for 3D semantic scene segmentation of RGB-D scans in indoor environments using a joint 3D-multi-view prediction network. In contrast to existing methods that either use geometry or RGB data as input for this task, we combine both data modalities in a joint, end-to-end network architecture. Rather than simply projecting color data into a volumetric grid and operating solely in 3D - which would result in insufficient detail - we first extract feature maps from associated RGB images. These features are then mapped into the volumetric feature grid of a 3D network using a differentiable backprojection layer. Since our target is 3D scanning scenarios with possibly many frames, we use a multi-view pooling approach in order to handle a varying number of RGB input views. This learned combination of RGB and geometric features with our joint 2D-3D architecture achieves significantly better results than existing baselines. For instance, our final result on the ScanNet 3D segmentation benchmark increases from 52.8% to 75% accuracy compared to existing volumetric architectures.
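The backprojection and multi-view pooling steps described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It makes several assumptions: a pinhole camera with intrinsics `(fx, fy, cx, cy)`, a 3x4 world-to-camera matrix per view, nearest-pixel lookup, and element-wise max as the pooling operator (the abstract only says "multi-view pooling"). Occlusion handling is omitted, and all function names are hypothetical. The sketch shows only the forward index mapping; in an autodiff framework the same gather would pass gradients back to the 2D features.

```python
def project_voxel(center, world_to_cam, intrinsics):
    """Project a world-space voxel center to integer pixel (u, v).

    Returns None when the point lies behind the camera.
    """
    x, y, z = center
    # Apply the 3x4 rigid transform [R | t] to get camera-space coordinates.
    cam = [r[0] * x + r[1] * y + r[2] * z + r[3] for r in world_to_cam]
    if cam[2] <= 0.0:
        return None
    fx, fy, cx, cy = intrinsics
    u = int(round(fx * cam[0] / cam[2] + cx))
    v = int(round(fy * cam[1] / cam[2] + cy))
    return u, v


def backproject_view(centers, feat_map, world_to_cam, intrinsics):
    """Look up one 2D feature vector per voxel, or None if the voxel is not seen."""
    height, width = len(feat_map), len(feat_map[0])
    out = []
    for center in centers:
        pix = project_voxel(center, world_to_cam, intrinsics)
        if pix is not None and 0 <= pix[0] < width and 0 <= pix[1] < height:
            out.append(list(feat_map[pix[1]][pix[0]]))
        else:
            out.append(None)
    return out


def multi_view_max_pool(per_view, num_channels):
    """Element-wise max over the views that see each voxel (assumed pooling op)."""
    pooled = []
    for voxel_feats in zip(*per_view):
        seen = [f for f in voxel_feats if f is not None]
        if seen:
            pooled.append([max(channel) for channel in zip(*seen)])
        else:
            # No view observes this voxel: fall back to a zero feature.
            pooled.append([0.0] * num_channels)
    return pooled


# Toy data: four voxel centers, two views with 2x2 feature maps of 2 channels.
centers = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0), (0.0, 0.0, -1.0), (5.0, 0.0, 1.0)]
intrinsics = (1.0, 1.0, 0.0, 0.0)
cam_a = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]]
cam_b = [[1.0, 0.0, 0.0, -1.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]]
feat_a = [[[1.0, 0.0], [2.0, 5.0]], [[3.0, 0.0], [4.0, 0.0]]]
feat_b = [[[0.5, 9.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]]]

per_view = [
    backproject_view(centers, feat_a, cam_a, intrinsics),
    backproject_view(centers, feat_b, cam_b, intrinsics),
]
pooled = multi_view_max_pool(per_view, num_channels=2)
print(pooled)  # [[1.0, 0.0], [2.0, 9.0], [0.0, 0.0], [0.0, 0.0]]
```

Because the pooling reduces over whichever views observe a voxel, the same code handles any number of input views. In the full method, the pooled per-voxel features would then be combined with geometric features inside the 3D network.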


Pages: 458-474

Page count: 17
