6 entries in total
- [1] Barto A.G., Bradtke S.J., Singh S.P., Learning to Act Using Real-Time Dynamic Programming, Artificial Intelligence, 72, pp. 81-138, (1995)
- [2] Barto A.G., Sutton R., Reinforcement Learning, (1997)
- [3] Bertsekas D.P., Tsitsiklis J.N., Neuro-Dynamic Programming, (1996)
- [4] Glover F., Taillard E., De Werra D., A User's Guide to Tabu Search, Annals of Operations Research, 41, pp. 3-28, (1993)
- [5] Pattipati K.R., Alexandridis M.G., Application of Heuristic Search and Information Theory to Sequential Fault Diagnosis, IEEE Transactions on Systems, Man, and Cybernetics, 20, pp. 872-887, (1990)
- [6] Tesauro G., Galperin G.R., On-Line Policy Improvement Using Monte Carlo Search, (1996)