Visual indexes, preconceptual objects, and situated vision

Cited by: 346

Authors:

Pylyshyn, ZW [1]

Affiliations:

[1] Rutgers State Univ, Rutgers Ctr Cognit Sci, New Brunswick, NJ 08903 USA

Source:

COGNITION | 2001 / Vol. 80 / Issue 1-2

Keywords:

early vision; visual attention; visual indexing; multiple object tracking; object-based attention; visual representation; indexicals; demonstrative reference; deictics; situated vision;

DOI:

10.1016/S0010-0277(00)00156-6

Chinese Library Classification (CLC):

B84 [Psychology];

Discipline classification codes:

04 ; 0402 ;

Abstract:

This paper argues that a theory of situated vision, suited for the dual purposes of object recognition and the control of action, will have to provide something more than a system that constructs a conceptual representation from visual stimuli: it will also need to provide a special kind of direct (preconceptual, unmediated) connection between elements of a visual representation and certain elements in the world. Like natural language demonstratives (such as 'this' or 'that'), this direct connection allows entities to be referred to without being categorized or conceptualized. Several reasons are given for why we need such a preconceptual mechanism, which individuates and keeps track of several individual objects in the world. One is that early vision must pick out and compute the relation among several individual objects while ignoring their properties. Another is that incrementally computing and updating representations of a dynamic scene requires keeping track of token individuals despite changes in their properties or locations. It is then noted that a mechanism meeting these requirements has already been proposed in order to account for a number of disparate empirical phenomena, including subitizing, search-subset selection, and multiple object tracking (Pylyshyn et al., Canadian Journal of Experimental Psychology 48(2) (1994) 260). This mechanism, called a visual index or FINST, is briefly discussed, and it is argued that viewing it as performing a demonstrative or preconceptual reference function has far-reaching implications not only for a theory of situated vision, but also for suggesting a new way to look at why the primitive individuation of visual objects, or proto-objects, is so central in computing visual representations. Indexing visual objects is also, according to this view, the primary means for grounding visual concepts and is a potentially fruitful way to look at the problem of visual integration across time and across saccades, as well as to explain how infants' numerical capacity might arise. © 2001 Elsevier Science B.V. All rights reserved.


Pages: 127-158

Page count: 32
