The viewpoint complexity of an object-recognition task

Cited by: 47
Authors
Tjan, BS
Legge, GE
Affiliations
[1] Max Planck Inst Biol Cybernet, D-72076 Tubingen, Germany
[2] Univ Minnesota, Dept Psychol, Minneapolis, MN 55455 USA
Keywords
object recognition; perceptual representation; viewpoint effects; ideal observer
DOI
10.1016/S0042-6989(97)00255-1
Chinese Library Classification (CLC) number
Q189 [Neuroscience]
Discipline classification code
071006
Abstract
There is an ongoing debate about the nature of perceptual representation in human object recognition. Resolution of this debate has been hampered by the lack of a metric for assessing the representational requirements of a recognition task. To recognize a member of a given set of 3-D objects, how much detail must the objects' representations contain in order to achieve a specific accuracy criterion? From the performance of an ideal observer, we derived a quantity called the view complexity (VX) to measure the required granularity of representation. VX is an intrinsic property of the object-recognition task, taking into account both the object ensemble and the type of decision required of an observer. It does not depend on the visual representation or processing used by the observer. VX can be interpreted as the number of randomly selected 2-D images needed to represent the decision boundaries in the image space of a 3-D object-recognition task. A low VX means the task is inherently more viewpoint invariant and a high VX means it is inherently more viewpoint dependent. By measuring the VX of recognition tasks with different object sets, we show that the current confusion about the nature of human perceptual representation is partly due to a failure in distinguishing between human visual processing and the properties of a task and its stimuli. We find general correspondence between the VX of a recognition task and the published human data on viewpoint dependence. Exceptions in this relationship motivated us to propose the view-rate hypothesis: human visual performance is limited by the equivalent number of 2-D image views that can be processed per unit time. © 1998 Elsevier Science Ltd. All rights reserved.
Pages: 2335-2350
Page count: 16