VIEW-DEPENDENT OBJECT RECOGNITION BY MONKEYS

Cited by: 218
Authors
LOGOTHETIS, NK
PAULS, J
BULTHOFF, HH
POGGIO, T
Affiliations
[1] MAX PLANCK INST BIOL CYBERNET, D-72076 TUBINGEN, GERMANY
[2] MIT, CTR COMPUTAT & BIOL LEARNING, CAMBRIDGE, MA 02139 USA
[3] MIT, DEPT BRAIN SCI, CAMBRIDGE, MA 02139 USA
Keywords
DOI
10.1016/S0960-9822(00)00089-0
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010 ; 081704 ;
Abstract
Background: How do we recognize visually perceived three-dimensional objects, particularly when they are seen from novel viewpoints? Recent psychophysical studies have suggested that the human visual system may store a relatively small number of two-dimensional views of a three-dimensional object, recognizing novel views of the object by interpolation between the stored sample views. Investigating the neural mechanisms underlying this process requires physiological experiments and, as a prelude to such experiments, we wanted to know whether the observations made with human observers extend to monkeys.
Results: We trained monkeys to recognize computer-generated images of objects presented from an arbitrarily chosen training view and containing sufficient three-dimensional information to specify the object's structure. We subsequently tested the trained monkeys' ability to generalize recognition of the object to views generated by rotation of the target object around any arbitrary axis. The monkeys recognized as the target only those two-dimensional views that were close to the familiar training view. Recognition became increasingly difficult for the monkeys as the stimulus was rotated away from the experienced viewpoint, and failed for views farther than about 40 degrees from the training view. This suggests that, in the early stages of learning to recognize a previously unfamiliar object, the monkeys build two-dimensional, viewer-centered object representations, rather than a three-dimensional model of the object. When the animals were trained with as few as three views of the object, 120 degrees apart, they could often recognize all the views of the object resulting from rotations around the same axis.
Conclusion: Our experiments show that recognition of three-dimensional novel objects is a function of the object's retinal projection. This suggests that nonhuman primates, like humans, may accomplish view-invariant recognition of familiar objects by a viewer-centered system that interpolates between a small number of stored views. The measures of recognition performance can be simulated by a regularization network that stores a few familiar views and is endowed with the ability to interpolate between these views. Our results provide the basis for physiological studies of object recognition by monkeys and suggest that the insights gained from such studies should apply also to humans.
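The view-interpolation idea in the abstract can be illustrated with a minimal sketch of a regularization network, assuming a single rotation axis so that each view is just an angle in degrees. The abstract does not give the network's details, so everything specific here is an assumption made for illustration: the Gaussian basis function, the width `sigma`, the regularization weight `lam`, and the helper names `fit` and `response`. The values are not fitted to the monkeys' data.

```python
import math


def angular_distance(a, b):
    """Smallest absolute difference between two angles, in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def gaussian(d, sigma):
    """Gaussian basis function of the distance d (assumed kernel)."""
    return math.exp(-(d * d) / (2.0 * sigma * sigma))


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def fit(stored_views, sigma=30.0, lam=1e-3):
    """Coefficients c solving (G + lam * I) c = y, with target y = 1
    ("this is the object") at every stored view."""
    G = [[gaussian(angular_distance(u, v), sigma) + (lam if i == j else 0.0)
          for j, v in enumerate(stored_views)]
         for i, u in enumerate(stored_views)]
    return solve(G, [1.0] * len(stored_views))


def response(view, stored_views, coeffs, sigma=30.0):
    """Network output: weighted sum of basis functions centered on the
    stored views. Higher values mean "more like the target"."""
    return sum(c * gaussian(angular_distance(view, v), sigma)
               for c, v in zip(coeffs, stored_views))


if __name__ == "__main__":
    one_view = [0.0]
    three_views = [0.0, 120.0, 240.0]
    c1 = fit(one_view)
    c3 = fit(three_views)
    for angle in (0, 20, 40, 60, 90):
        print(angle,
              round(response(angle, one_view, c1), 3),
              round(response(angle, three_views, c3), 3))
```

With one stored view, the output of this sketch falls off as the test view rotates away from the training view. Storing three views 120 degrees apart raises the output at intermediate angles, because the nearest basis functions overlap there. How far that coverage extends depends entirely on the assumed `sigma`.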
Pages: 401-414
Page count: 14