Contextually guided semantic labeling and search for three-dimensional point clouds

Cited by: 112
Authors
Anand, Abhishek [1 ]
Koppula, Hema Swetha [1 ]
Joachims, Thorsten [1 ]
Saxena, Ashutosh [1 ]
Affiliation
[1] Cornell University, Department of Computer Science, Ithaca, NY 14853, USA
Keywords
3D perception; object recognition; structure learning; parsimonious model; 3D context; object search; classification; framework; selection
DOI
10.1177/0278364912461538
Chinese Library Classification (CLC) number
TP24 [Robotics]
Subject classification codes
080202; 1405
Abstract
RGB-D cameras, which give an RGB image together with depths, are becoming increasingly popular for robotic perception. In this paper, we address the task of detecting commonly found objects in the three-dimensional (3D) point cloud of indoor scenes obtained from such cameras. Our method uses a graphical model that captures various features and contextual relations, including local visual appearance and shape cues, object co-occurrence relationships, and geometric relationships. With a large number of object classes and relations, the model's parsimony becomes important, and we address that by using multiple types of edge potentials. We train the model using a maximum-margin learning approach. In our experiments on a total of 52 3D scenes of homes and offices (composed from about 550 views), we achieve a performance of 84.06% and 73.38% in labeling office and home scenes, respectively, for 17 object classes each. We also present a method for a robot to search for an object using the learned model and the contextual information available from the current labeling of the scene. We applied this algorithm successfully on a mobile robot for the task of finding 12 object classes in 10 different offices and achieved a precision of 97.56% with 78.43% recall.
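As a rough illustration of the model described in the abstract (a sketch assuming a standard pairwise max-margin formulation; the symbols below are illustrative and are not taken from the paper itself), the score of a candidate labeling y of the scene segments x could be written as

f_w(x, y) = \sum_{i \in V} \sum_{k} y_i^k \, \big( w_n^k \cdot \phi_n(x_i) \big) + \sum_{(i,j) \in E} \sum_{l,k} y_i^l y_j^k \, \big( w_e^{lk} \cdot \phi_e(x_i, x_j) \big),

where y_i^k \in \{0,1\} indicates that segment i is assigned class k, \phi_n and \phi_e denote node features (appearance and shape cues) and edge features (co-occurrence and geometric relationships), and the weights w are learned by maximum-margin training. Under this reading, parsimony would come from restricting or sharing the edge weights w_e^{lk} across class pairs, which is one way to interpret the abstract's "multiple types of edge potentials."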
Pages: 19-34
Number of pages: 16