Building the gist of a scene: the role of global image features in recognition

Cited by: 942
Authors
Oliva, Aude
Torralba, Antonio
Affiliations
[1] MIT, Dept Brain & Cognit Sci, Cambridge, MA 02139 USA
[2] MIT, Comp Sci & Artificial Intelligence Lab, Cambridge, MA 02139 USA
Source
VISUAL PERCEPTION, PT 2: FUNDAMENTALS OF AWARENESS: MULTI-SENSORY INTEGRATION AND HIGH-ORDER PERCEPTION | 2006 / Vol. 155
Keywords
scene recognition; gist; spatial envelope; global image feature; spatial frequency; natural image;
DOI
10.1016/S0079-6123(06)55002-2
Chinese Library Classification
Q189 [Neuroscience];
Discipline classification code
071006;
Abstract
Humans can recognize the gist of a novel image in a single glance, independent of its complexity. How is this remarkable feat accomplished? On the basis of behavioral and computational evidence, this paper describes a formal approach to the representation and the mechanism of scene gist understanding, based on scene-centered, rather than object-centered primitives. We show that the structure of a scene image can be estimated by the mean of global image features, providing a statistical summary of the spatial layout properties (Spatial Envelope representation) of the scene. Global features are based on configurations of spatial scales and are estimated without invoking segmentation or grouping operations. The scene-centered approach is not an alternative to local image analysis but would serve as a feed-forward and parallel pathway of visual processing, able to quickly constrain local feature analysis and enhance object recognition in cluttered natural scenes.
Pages: 23-36
Page count: 14