Contextually guided semantic labeling and search for three-dimensional point clouds

Cited by: 112

Authors:

Anand, Abhishek [1]

Koppula, Hema Swetha [1]

Joachims, Thorsten [1]

Saxena, Ashutosh [1]

Affiliations:

[1] Cornell Univ, Dept Comp Sci, Ithaca, NY 14853 USA

Source:

INTERNATIONAL JOURNAL OF ROBOTICS RESEARCH | 2013 / Vol. 32 / No. 1

Keywords:

3D perception; object recognition; structure learning; parsimonious model; 3D context; object search; OBJECT RECOGNITION; CLASSIFICATION; FRAMEWORK; SELECTION;

DOI:

10.1177/0278364912461538

Chinese Library Classification (CLC) number:

TP24 [Robotics];

Discipline classification code:

080202 ; 1405 ;

Abstract:

RGB-D cameras, which give an RGB image together with depths, are becoming increasingly popular for robotic perception. In this paper, we address the task of detecting commonly found objects in the three-dimensional (3D) point cloud of indoor scenes obtained from such cameras. Our method uses a graphical model that captures various features and contextual relations, including the local visual appearance and shape cues, object co-occurrence relationships and geometric relationships. With a large number of object classes and relations, the model's parsimony becomes important and we address that by using multiple types of edge potentials. We train the model using a maximum-margin learning approach. In our experiments concerning a total of 52 3D scenes of homes and offices (composed from about 550 views), we get a performance of 84.06% and 73.38% in labeling office and home scenes respectively for 17 object classes each. We also present a method for a robot to search for an object using the learned model and the contextual information available from the current labelings of the scene. We applied this algorithm successfully on a mobile robot for the task of finding 12 object classes in 10 different offices and achieved a precision of 97.56% with 78.43% recall.


Pages: 19-34

Page count: 16
