Biped dynamic walking using reinforcement learning

Cited by: 96
Authors
Benbrahim, H
Franklin, JA
Affiliations
[1] GTE Labs Inc, Waltham, MA 02254 USA
[2] Mt Holyoke Coll, Dept Comp Sci, S Hadley, MA 01075 USA
Keywords
biped walking; reinforcement learning; robot learning; biped robot; legged robot
DOI
10.1016/S0921-8890(97)00043-2
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline Classification Code
0812
Abstract
This paper presents results from a study of biped dynamic walking using reinforcement learning. During this study, a hardware biped robot was built, and a new reinforcement learning algorithm and a new learning architecture were developed. The biped learned dynamic walking without any prior knowledge of its dynamic model. The self-scaling reinforcement (SSR) learning algorithm was developed to address the problem of reinforcement learning in continuous action domains. The learning architecture was developed to solve complex control problems. It uses separate modules that consist of simple controllers and small neural networks. The architecture allows easy incorporation of new modules that represent new knowledge or new requirements for the desired task.
Pages: 283-302
Page count: 20