Contextually guided semantic labeling and search for three-dimensional point clouds

Cited by: 112
Authors
Anand, Abhishek [1]
Koppula, Hema Swetha [1]
Joachims, Thorsten [1]
Saxena, Ashutosh [1]
Affiliations
[1] Cornell Univ, Dept Comp Sci, Ithaca, NY 14853 USA
Keywords
3D perception; object recognition; structure learning; parsimonious model; 3D context; object search; OBJECT RECOGNITION; CLASSIFICATION; FRAMEWORK; SELECTION;
DOI
10.1177/0278364912461538
Chinese Library Classification
TP24 [Robotics];
Discipline classification code
080202 ; 1405 ;
Abstract
RGB-D cameras, which give an RGB image together with depths, are becoming increasingly popular for robotic perception. In this paper, we address the task of detecting commonly found objects in the three-dimensional (3D) point cloud of indoor scenes obtained from such cameras. Our method uses a graphical model that captures various features and contextual relations, including the local visual appearance and shape cues, object co-occurrence relationships and geometric relationships. With a large number of object classes and relations, the model's parsimony becomes important and we address that by using multiple types of edge potentials. We train the model using a maximum-margin learning approach. In our experiments concerning a total of 52 3D scenes of homes and offices (composed from about 550 views), we get a performance of 84.06% and 73.38% in labeling office and home scenes respectively for 17 object classes each. We also present a method for a robot to search for an object using the learned model and the contextual information available from the current labelings of the scene. We applied this algorithm successfully on a mobile robot for the task of finding 12 object classes in 10 different offices and achieved a precision of 97.56% with 78.43% recall.
Pages: 19-34
Page count: 16