ADDING TEMPORARY MEMORY TO ZCS

Cited by: 59

Authors:

CLIFF, D

ROSS, S

Affiliations:

[1] School of Cognitive and Computing Sciences, University of Sussex

[2] School of Cognitive and Computing Sciences, University of Sussex

Source:

ADAPTIVE BEHAVIOR | 1994 / Vol. 3 / No. 2

Keywords:

CLASSIFIER SYSTEMS; ZCS; MEMORY; FLAVA; ACTION CHAINS

DOI:

10.1177/105971239400300201

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

In a recent article, Wilson (1994) described a "zeroth-level" classifier system (ZCS). ZCS employs a reinforcement learning technique comparable to Q-learning (Watkins, 1989). This article presents results from the first reconstruction of ZCS. Having replicated Wilson's results, we extend ZCS in a manner suggested by Wilson: The original formulation of ZCS has no memory mechanisms, but Wilson (1994b) suggested how internal "temporary memory" registers could be added. We show results from adding one-bit and two-bit memory registers to ZCS. Our results demonstrate that ZCS can exploit memory facilities efficiently in non-Markov environments. We also show that the memoryless ZCS can converge on near-optimal stochastic solutions in non-Markov environments. We then present results from trials using ZCS in Markov environments that require increasingly long chains of actions before reward is received. Our results indicate that inaccurate overgeneral classifiers can interact with the classifier-generation mechanisms to cause catastrophic breakdowns in overall system performance. Basing classifier fitness on accuracy may alleviate this problem. We conclude that the memory mechanism in its current form is unlikely to scale well for situations requiring large amounts of temporary memory. Nevertheless, the ability to find stochastic solutions when there is insufficient memory might offset this problem somewhat.


Pages: 101-150

Page count: 50

References (33 in total):

[1] BOOKER LB, 1989, 3RD P INT C GEN ALG

[2] BROOKS RA, 1992, PRACTICE AUTONOMOUS

[3] CARTWRIGHT, BA; COLLETT, TS. LANDMARK MAPS FOR HONEYBEES [J]. BIOLOGICAL CYBERNETICS, 1987, 57 (1-2): 85-93

[4] CARTWRIGHT, BA; COLLETT, TS. LANDMARK LEARNING IN BEES - EXPERIMENTS AND MODELS [J]. JOURNAL OF COMPARATIVE PHYSIOLOGY, 1983, 151 (04): 521-543

[5] CLIFF D, 1993, ADAPT BEHAV, V2, P47

[6] CLIFF D, 1994, CSRP339 U SUSS SCH C

[7] COLLETT, TS. LANDMARK LEARNING AND GUIDANCE IN INSECTS [J]. PHILOSOPHICAL TRANSACTIONS OF THE ROYAL SOCIETY B-BIOLOGICAL SCIENCES, 1992, 337 (1281): 295-303

[8] Colombetti Marco, 1994, Adaptive Behavior, V2, P247, DOI 10.1177/105971239400200302

[9] COMPIANI, M; MONTANARI, D; SERRA, R. LEARNING AND BUCKET BRIGADE DYNAMICS IN CLASSIFIER SYSTEMS [J]. PHYSICA D, 1990, 42 (1-3): 202-212

[10] DORIGO M, 1994, ANIMALS ANIMATS, V3
