Evaluation of reinforcement learning control for thermal energy storage systems

Cited by: 70
Authors
Henze, GP [1]
Schoenmann, J [1]
Affiliations
[1] Univ Nebraska, Omaha, NE 68182 USA
Source
HVAC&R RESEARCH | 2003 / Vol. 9 / No. 03
Keywords
Computer simulation - Cooling - Learning systems - Office buildings - Optimal control systems - Reinforcement - Statistical methods
DOI
10.1080/10789669.2003.10391069
Chinese Library Classification (CLC) number
O414.1 [Thermodynamics]
Abstract
This paper describes a simulation-based investigation of machine-learning control for the supervisory control of building energy systems. Model-free reinforcement learning control is investigated for the operation of electrically driven cool thermal energy storage systems in commercial buildings. The reinforcement learning controller learns to charge and discharge a thermal storage tank based on the feedback it receives from past control actions. The learning agent interacts with its environment by commanding the thermal energy storage system and extracts cues about the environment solely based on the reinforcement feedback it receives, which in this study is the monetary cost of each control action. No prediction or system model is required. Over time and by exploring the environment, the reinforcement learning controller establishes a statistical summary of plant operation, which is continuously updated as operation continues. The controller learns to account for the time-dependent cost of electricity (both time-of-use and real-time pricing), the availability of thermal storage, part-load performance of the central chilled water plant, and weather conditions. Though reinforcement learning control proved sensitive to the selection of state variables, level of discretization, and learning rate, it effectively learns the difficult task of controlling thermal energy storage and displays good performance. The cost savings compare favorably with conventional cool storage control strategies but do not reach the level of predictive optimal control.
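To make the idea in the abstract concrete, the following is a minimal sketch of a model-free controller that learns from cost feedback alone. It is not the authors' implementation: tabular Q-learning is used here as a standard stand-in, and the tariff, the load profile, the five-level tank discretization, and all hyperparameters (`alpha`, `gamma`, `epsilon`, `days`) are hypothetical values chosen only for illustration. The state is (hour of day, tank level) and the reward is the negative monetary cost of each action.

```python
import random

HOURS, LEVELS = 24, 5      # hours per day; tank levels 0 (empty) .. 4 (full)
DELTAS = (-1, 0, 1)        # actions: discharge, idle, charge (one level per hour)

def price(hour):
    """Hypothetical time-of-use tariff ($ per unit of chiller energy)."""
    return 0.20 if 12 <= hour < 18 else 0.05

def load(hour):
    """Hypothetical cooling load: one unit per occupied hour."""
    return 1 if 8 <= hour < 18 else 0

def step(hour, soc, action):
    """Toy plant: returns (next_hour, next_soc, cost). Infeasible actions act as idle."""
    delta = DELTAS[action]
    if delta == 1 and soc == LEVELS - 1:                 # tank already full
        delta = 0
    if delta == -1 and (soc == 0 or load(hour) == 0):    # tank empty or nothing to serve
        delta = 0
    cost = price(hour) * (load(hour) + delta)            # chiller serves load plus charging
    return (hour + 1) % HOURS, soc + delta, cost

def greedy(q, state):
    """Index of the action with the highest learned value in this state."""
    return max(range(len(DELTAS)), key=lambda a: q[state][a])

def train(days=10000, alpha=0.5, gamma=0.95, epsilon=0.1, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration; reward = -cost."""
    rng = random.Random(seed)
    q = {(h, s): [0.0] * len(DELTAS) for h in range(HOURS) for s in range(LEVELS)}
    state = (0, 0)
    for _ in range(days * HOURS):
        explore = rng.random() < epsilon
        action = rng.randrange(len(DELTAS)) if explore else greedy(q, state)
        hour, soc, cost = step(state[0], state[1], action)
        target = -cost + gamma * max(q[(hour, soc)])
        q[state][action] += alpha * (target - q[state][action])
        state = (hour, soc)
    return q

def daily_cost(policy, soc=0):
    """Cost of following `policy` for one day, starting at hour 0."""
    total, hour = 0.0, 0
    for _ in range(HOURS):
        hour, soc, cost = step(hour, soc, policy((hour, soc)))
        total += cost
    return total

q_table = train()
baseline_cost = daily_cost(lambda state: 1)                      # always idle
learned_cost = daily_cost(lambda state: greedy(q_table, state))  # learned policy
print(f"idle baseline: {baseline_cost:.2f}  learned: {learned_cost:.2f}")
```

In this toy setting the always-idle baseline pays the on-peak price for every occupied afternoon hour, so a controller that has learned to charge off-peak and discharge on-peak should come in below it. Changing `LEVELS` or `alpha` is a quick way to probe the sensitivity to discretization and learning rate that the abstract mentions.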
Pages: 259-275
Page count: 17