The parti-game algorithm for variable resolution reinforcement learning in multidimensional state-spaces

Cited by: 24

Authors:

Moore, AW [1]

Atkeson, CG [1]

Affiliations:

[1] GEORGIA INST TECHNOL, COLL COMP, ATLANTA, GA 30332

Source:

MACHINE LEARNING | 1995 / Vol. 21 / Issue 03

Keywords:

reinforcement learning; curse of dimensionality; learning control; robotics; kd-trees

DOI:

10.1023/A:1022656217772

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

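081104 [Pattern recognition and intelligent systems]; 0812 [Computer science and technology]; 0835 [Software engineering]; 1405 [Intelligence science and technology];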

Abstract:

Parti-game is a new algorithm for learning feasible trajectories to goal regions in high dimensional continuous state-spaces. In high dimensions it is essential that neither planning nor exploration occurs uniformly over a state-space. Parti-game maintains a decision-tree partitioning of state-space and applies techniques from game-theory and computational geometry to efficiently and adaptively concentrate high resolution only on critical areas. The current version of the algorithm is designed to find feasible paths or trajectories to goal regions in high dimensional spaces. Future versions will be designed to find a solution that optimizes a real-valued criterion. Many simulated problems have been tested, ranging from two-dimensional to nine-dimensional state-spaces, including mazes, path planning, non-linear dynamics, and planar snake robots in restricted spaces. In all cases, a good solution is found in less than ten trials and a few minutes.

Citation

Pages: 199-233

Page count: 35

