Reinforced Genetic Programming

Cited by: 14

Author:

Keith L. Downing

Affiliation:

[1] The Norwegian University of Science and Technology

Source:

Genetic Programming and Evolvable Machines | 2001 / Vol. 2 / No. 3

Keywords:

genetic programming; reinforcement learning; the Baldwin Effect; Lamarckism

DOI:

10.1023/A:1011953410319

Abstract:

This paper introduces the Reinforced Genetic Programming (RGP) system, which enhances standard tree-based genetic programming (GP) with reinforcement learning (RL). RGP adds a new element to the GP function set: monitored action-selection points that provide hooks to a reinforcement-learning system. Using strong typing, RGP can restrict these choice points to leaf nodes, thereby turning GP trees into classify-and-act procedures. Then, environmental reinforcements channeled back through the choice points provide the basis for both lifetime learning and general GP fitness assessment. This paves the way for evolutionary acceleration via both Baldwinian and Lamarckian mechanisms. In addition, the hybrid hints at potential improvements to RL by exploiting evolution to design proper abstraction spaces, via the problem-state classifications of the internal tree nodes. This paper details the basic mechanisms of RGP and demonstrates its application on a series of static and dynamic maze-search problems.
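The classify-and-act idea in the abstract can be illustrated with a small sketch. This is not the paper's implementation, which this record does not describe. It assumes a hypothetical one-dimensional corridor in place of a maze, a hand-written tree in place of an evolved one, and a tabular Q-learning-style update as the RL component. All names (`ChoicePoint`, `Branch`, `lifetime_fitness`, `ACTIONS`, `GOAL`) and all parameter values are illustrative assumptions.

```python
import random
from dataclasses import dataclass, field

ACTIONS = ("left", "right")  # hypothetical action set
GOAL = 4                     # hypothetical corridor: cells 0..4, goal at the right end


@dataclass
class ChoicePoint:
    """Leaf node: an action-selection point holding one learned value per action."""
    q: dict = field(default_factory=lambda: {a: 0.0 for a in ACTIONS})

    def classify(self, pos):
        # A leaf is its own classification result.
        return self

    def select(self, rng, epsilon):
        """Epsilon-greedy choice; ties between equal values are broken at random."""
        if rng.random() < epsilon:
            return rng.choice(ACTIONS)
        best = max(self.q.values())
        return rng.choice([a for a in ACTIONS if self.q[a] == best])

    def greedy(self):
        return max(ACTIONS, key=lambda a: self.q[a])


@dataclass
class Branch:
    """Internal node: classifies the state and routes it to one subtree."""
    threshold: int
    below: object      # subtree used when pos < threshold
    otherwise: object  # subtree used when pos >= threshold

    def classify(self, pos):
        child = self.below if pos < self.threshold else self.otherwise
        return child.classify(pos)


def lifetime_fitness(tree, episodes=300, seed=0, alpha=0.1, gamma=0.8,
                     epsilon=0.2, max_steps=100):
    """Run lifetime learning and return the accumulated reward as fitness.

    Assumption: fitness is the plain sum of rewards; the update rule is a
    Q-learning-style rule whose 'states' are the tree's choice points.
    """
    rng = random.Random(seed)
    total = 0.0
    for _episode in range(episodes):
        pos = 0
        for _step in range(max_steps):
            point = tree.classify(pos)            # classify the state ...
            action = point.select(rng, epsilon)   # ... then act at the leaf
            nxt = min(GOAL, pos + 1) if action == "right" else max(0, pos - 1)
            done = nxt == GOAL
            reward = 1.0 if done else 0.0
            future = 0.0 if done else max(tree.classify(nxt).q.values())
            # Reinforcement is channeled back to the choice point that acted.
            point.q[action] += alpha * (reward + gamma * future - point.q[action])
            total += reward
            pos = nxt
            if done:
                break
    return total
```

A tree such as `Branch(threshold=2, below=ChoicePoint(), otherwise=ChoicePoint())` groups the five cells into two abstract states, so only two sets of action values are learned. The sketch leaves out the evolutionary loop and any Baldwinian or Lamarckian inheritance of the learned values.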

Pages: 259-288

Page count: 29
