A quantitative theory of immediate visual recognition

Cited by: 138
Authors
Serre, Thomas [1]
Kreiman, Gabriel [1]
Kouh, Minjoon [1]
Cadieu, Charles [1]
Knoblich, Ulf [1]
Poggio, Tomaso [1]
Affiliation
[1] MIT, Dept Brain & Cognit Sci, Comp Sci & Artificial Intelligence Lab, Ctr Biol & Computat Learning, McGovern Inst Brain, Cambridge, MA 02139 USA
Source
COMPUTATIONAL NEUROSCIENCE: THEORETICAL INSIGHTS INTO BRAIN FUNCTION | 2007 / Vol. 165
Funding
U.S. National Science Foundation
Keywords
visual object recognition; hierarchical models; ventral stream; feedforward
DOI
10.1016/S0079-6123(06)65004-8
Chinese Library Classification
Q [Biological Sciences]
Subject classification codes
07; 0710; 09
Abstract
Human and non-human primates excel at visual recognition tasks. The primate visual system exhibits a strong degree of selectivity while at the same time being robust to changes in the input image. We have developed a quantitative theory to account for the computations performed by the feedforward path in the ventral stream of the primate visual cortex. Here we review recent predictions by a model instantiating the theory about physiological observations in higher visual areas. We also show that the model can perform recognition tasks on datasets of complex natural images at a level comparable to psychophysical measurements on human observers during rapid categorization tasks. In sum, the evidence suggests that the theory may provide a framework to explain the first 100-150 ms of visual object recognition. The model also constitutes a vivid example of how computational models can interact with experimental observations in order to advance our understanding of a complex phenomenon. We conclude by suggesting a number of open questions, predictions, and specific experiments for visual physiology and psychophysics.
Pages: 33-56
Page count: 24