Reinforcement Learning: Principles, Algorithms, and Applications in Intelligent Control

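Cited by: 11
Author
Yan Pingfan (阎平凡)
Affiliation
[1] Department of Automation, Tsinghua University, Beijing
Keywords
reinforcement learning; learning control; intelligent control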
DOI
10.13976/j.cnki.xk.1996.01.007
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
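Abstract
This paper surveys the principles of reinforcement learning, its main algorithms, its neural-network-based implementations, and its role in intelligent control, and discusses problems that warrant further research.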
Pages: 28-34, 43
Page count: 8