Recovering surface layout from an image

Cited by: 423
Authors
Hoiem, Derek [1]
Efros, Alexei A. [1]
Hebert, Martial [1]
Affiliations
[1] Carnegie Mellon Univ, Inst Robot, Pittsburgh, PA 15213 USA
Funding
National Science Foundation (USA);
Keywords
surface layout; spatial layout; geometric context; scene understanding; context; object detection; model-driven segmentation; image understanding; multiple segmentations; object recognition;
DOI
10.1007/s11263-006-0031-y
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Humans have an amazing ability to instantly grasp the overall 3D structure of a scene (ground orientation, relative positions of major landmarks, etc.), even from a single image. This ability is completely missing in most popular recognition algorithms, which pretend that the world is flat and/or view it through a patch-sized peephole. Yet it seems very likely that having a grasp of this "surface layout" of a scene should be of great assistance for many tasks, including recognition, navigation, and novel view synthesis. In this paper, we take the first step towards constructing the surface layout, a labeling of the image into geometric classes. Our main insight is to learn appearance-based models of these geometric classes, which coarsely describe the 3D scene orientation of each image region. Our multiple segmentation framework provides robust spatial support, allowing a wide variety of cues (e.g., color, texture, and perspective) to contribute to the confidence in each geometric label. In experiments on a large set of outdoor images, we evaluate the impact of the individual cues and design choices in our algorithm. We further demonstrate the applicability of our method to indoor images, describe potential applications, and discuss extensions to a more complete notion of surface layout.
Pages: 151-172
Page count: 22