LEARNING TO PERCEIVE AND ACT BY TRIAL AND ERROR

Cited by: 105

Authors:

WHITEHEAD, SD

BALLARD, DH

Affiliations:

Source:

MACHINE LEARNING | 1991 / Vol. 7 / No. 1

Keywords:

REINFORCEMENT LEARNING; DEICTIC REPRESENTATIONS; SENSORY-MOTOR INTEGRATION; HIDDEN STATE; NON-MARKOV DECISION PROBLEMS;

DOI:

10.1007/BF00058926

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

This article considers adaptive control architectures that integrate active sensory-motor systems with decision systems based on reinforcement learning. One unavoidable consequence of active perception is that the agent's internal representation often confounds external world states. We call this phenomenon perceptual aliasing and show that it destabilizes existing reinforcement learning algorithms with respect to the optimal decision policy. We then describe a new decision system that overcomes these difficulties for a restricted class of decision problems. The system incorporates a perceptual subcycle within the overall decision cycle and uses a modified learning algorithm to suppress the effects of perceptual aliasing. The result is a control architecture that learns not only how to solve a task but also where to focus its visual attention in order to collect necessary sensory information.

Pages: 45-83

Page count: 39
