Application of SONQL for real-time learning of robot behaviors

Cited by: 23

Authors:

Carreras, Marc

Yuh, Junku

Batlle, Joan

Ridao, Pere

Affiliations:

[1] Univ Girona, Inst Informat & Appl, Girona 17071, Spain

[2] Natl Sci Fdn, Arlington, VA 22230 USA

Source:

ROBOTICS AND AUTONOMOUS SYSTEMS | 2007, Vol. 55, No. 8

Funding:

US National Science Foundation;

Keywords:

reinforcement learning; Q-learning; behavior-based robotics; autonomous vehicles;

DOI:

10.1016/j.robot.2007.03.003

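Chinese Library Classification (CLC) number:

TP [Automation and computer technology];

Subject classification code:

080201 [Mechanical manufacturing and automation];

Abstract: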

This paper describes the Semi-Online Neural-Q-learning (SONQL) algorithm, designed for real-time learning of reactive robot behaviors. The Q-function is generalized by a multilayer neural network, allowing the use of continuous states. The algorithm uses a database of the most recent learning samples to accelerate and improve convergence. Each SONQL algorithm represents an independent, reactive and adaptive state-action mapping, which implements the function of a robot behavior for one degree of freedom (DOF). The generalization capability of the SONQL algorithm was demonstrated by computer simulation with the "mountain-car" benchmark. The SONQL was also investigated by experiment on a mobile robot for a target-following task. Experimental results show that the SONQL is promising for online robot learning. © 2007 Elsevier B.V. All rights reserved.
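The idea summarized in the abstract (a neural network approximating the Q-function, retrained at every step from a small database of the most recent learning samples) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no architecture, database policy or parameter values, so the network size, learning rate, discount factor, three-level action set and all names (`QNetwork`, `update_from_database`, `run_episode`) are assumptions made for illustration. The environment uses the standard textbook mountain-car dynamics.

```python
import collections
import numpy as np

ACTIONS = (-1.0, 0.0, 1.0)  # assumed discretisation of the throttle command
GAMMA = 0.95                # assumed discount factor


class QNetwork:
    """Hypothetical one-hidden-layer network approximating Q(state, action)."""

    def __init__(self, n_inputs, n_hidden=8, lr=0.01, seed=0):
        rng = np.random.default_rng(seed)
        self.W1 = rng.normal(0.0, 0.5, (n_hidden, n_inputs))
        self.b1 = np.zeros(n_hidden)
        self.W2 = rng.normal(0.0, 0.5, n_hidden)
        self.b2 = 0.0
        self.lr = lr

    def predict(self, state, action):
        h = np.tanh(self.W1 @ np.append(state, action) + self.b1)
        return float(self.W2 @ h + self.b2)

    def train(self, state, action, target):
        """One gradient step on 0.5 * (Q(state, action) - target)**2."""
        x = np.append(state, action)
        h = np.tanh(self.W1 @ x + self.b1)
        err = (self.W2 @ h + self.b2) - target
        grad_pre = err * self.W2 * (1.0 - h ** 2)  # back-propagate through tanh
        self.W2 -= self.lr * err * h
        self.b2 -= self.lr * err
        self.W1 -= self.lr * np.outer(grad_pre, x)
        self.b1 -= self.lr * grad_pre


def mountain_car_step(state, action):
    """Standard mountain-car dynamics; reward is -1 per step until the goal."""
    position, velocity = state
    velocity += 0.001 * action - 0.0025 * np.cos(3.0 * position)
    velocity = float(np.clip(velocity, -0.07, 0.07))
    position = float(np.clip(position + velocity, -1.2, 0.5))
    if position <= -1.2:
        velocity = 0.0  # inelastic collision with the left wall
    return (position, velocity), -1.0, position >= 0.5


def update_from_database(net, database):
    """Apply one Q-learning update for every sample currently stored."""
    for state, action, reward, next_state, done in list(database):
        best_next = 0.0 if done else max(net.predict(next_state, a) for a in ACTIONS)
        net.train(state, action, reward + GAMMA * best_next)


def run_episode(net, database, max_steps=200, epsilon=0.1, seed=0):
    """Epsilon-greedy episode; every new sample enters the recent-sample database."""
    rng = np.random.default_rng(seed)
    state = (-0.5, 0.0)
    for _ in range(max_steps):
        if rng.random() < epsilon:
            action = float(rng.choice(ACTIONS))
        else:
            action = max(ACTIONS, key=lambda a: net.predict(state, a))
        next_state, reward, done = mountain_car_step(state, action)
        database.append((state, action, reward, next_state, done))
        update_from_database(net, database)
        state = next_state
        if done:
            break
    return state


# Usage: a bounded deque discards the oldest sample once it is full.
database = collections.deque(maxlen=20)
net = QNetwork(n_inputs=3)  # position, velocity, action
final_state = run_episode(net, database, max_steps=50)
```

Replaying the whole database at each step reuses every real interaction several times, which is one plausible reading of how a recent-sample database could speed up convergence. A bounded `deque` is simply the most direct way to keep only recent samples; the paper may select or replace samples differently.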

Pages: 628-642

Page count: 15

