NEUROCONTROLLERS TRAINED WITH RULES EXTRACTED BY A GENETIC ASSISTED REINFORCEMENT LEARNING-SYSTEM

Cited by: 24
Authors
ABUZITAR, R [1]
HASSOUN, MH [1]
Affiliation
[1] WAYNE STATE UNIV,DEPT ELECT & COMP ENGN,COMPUTAT & NEURAL NETWORKS LAB,DETROIT,MI 48202
Source
IEEE TRANSACTIONS ON NEURAL NETWORKS | 1995 / Vol. 6 / No. 04
Funding
U.S. National Science Foundation
DOI
10.1109/72.392249
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper proposes a novel system for rule extraction of temporal control problems and presents a new way of designing neurocontrollers. The system employs a hybrid genetic search and reinforcement learning strategy for extracting the rules. The learning strategy requires no supervision and no reference model. The extracted rules are weighted micro rules that operate on small neighborhoods of the admissible control space. A further refinement of the extracted rules is achieved by applying additional genetic search and reinforcement to reduce the number of extracted micro rules. This process results in a smaller set of macro rules which can be used to train a feedforward multilayer perceptron neurocontroller. The micro rules or the macro rules may also be utilized directly in a table look-up controller. As an example of the macro rules-based neurocontroller, we chose four benchmarks. In the first application we verify the capability of our system to learn optimal linear control strategies. The other three applications involve engine idle speed control, bioreactor control, and stabilizing two poles on a moving cart. These problems are highly nonlinear, unstable, and may include noise and delays in the plant dynamics. In terms of retrievals, the neurocontrollers generally outperform the controllers using a table look-up method. Both controllers, though, show robustness against noise disturbances and plant parameter variations.
Pages: 859-879
Page count: 21