Convolutional Neural Networks for Automatic State-Time Feature Extraction in Reinforcement Learning Applied to Residential Load Control

Cited by: 118

Authors:

Claessens, Bert J. [1]

Vrancx, Peter [2]

Ruelens, Frederik [3]

Affiliations:

[1] Restore, B-2600 Antwerp, Belgium

[2] Vrije Univ Brussel, AI Lab, B-1050 Brussels, Belgium

[3] KU Leuven EnergyVille, Dept Elect Engn, B-3000 Leuven, Belgium

Source:

IEEE TRANSACTIONS ON SMART GRID | 2018, Vol. 9, No. 4

Keywords:

Convolutional neural network; deep learning; demand response; reinforcement learning; MANAGEMENT;

DOI:

10.1109/TSG.2016.2629450

Chinese Library Classification (CLC):

TM [Electrical technology]; TN [Electronic technology, communication technology];

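Discipline classification codes:

080906 [Electromagnetic information functional materials and structures]; 082806 [Agricultural information and electrical engineering];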

Abstract:

Direct load control of a heterogeneous cluster of residential demand flexibility sources is a high-dimensional control problem with partial observability. This paper proposes a novel approach that uses a convolutional neural network (CNN) to extract hidden state-time features to mitigate the curse of partial observability. More specifically, a CNN is used as a function approximator to estimate the state-action value function, or Q-function, in the supervised learning step of fitted Q-iteration. The approach is evaluated in a qualitative simulation comprising a cluster of thermostatically controlled loads that share only their air temperature, while their envelope temperature remains hidden. The simulation results show that the presented approach is able to capture the underlying hidden features and to successfully reduce the electricity cost of the cluster.
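
The fitted Q-iteration loop that the abstract refers to can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the paper uses a CNN in the supervised learning step, whereas this sketch swaps in a simple per-pair averaging table so that it runs with the Python standard library alone. The toy problem (one load that must run once within three time steps), the prices, the penalty, and all function and variable names are hypothetical.

```python
from collections import defaultdict


def fit_mean_regressor(inputs, targets):
    """Stand-in supervised learner (NOT a CNN): average targets per (state, action)."""
    sums, counts = defaultdict(float), defaultdict(int)
    for key, y in zip(inputs, targets):
        sums[key] += y
        counts[key] += 1
    table = {key: sums[key] / counts[key] for key in sums}
    # Pairs never seen in the batch (here: terminal states) get a value of 0.
    return lambda state, action: table.get((state, action), 0.0)


def fitted_q_iteration(batch, actions, n_iterations, fit=fit_mean_regressor):
    """batch holds (state, action, cost, next_state) tuples; costs are minimized."""
    q = lambda state, action: 0.0  # Q_0 is zero everywhere
    inputs = [(s, a) for s, a, _, _ in batch]
    for _ in range(n_iterations):
        # Regression targets: immediate cost plus the best next-state value.
        targets = [cost + min(q(s_next, b) for b in actions)
                   for _, _, cost, s_next in batch]
        # Supervised learning step; this is where the paper places its CNN.
        q = fit(inputs, targets)
    return q


# Hypothetical toy problem: state = (time step, load has already run).
PRICES = [3.0, 1.0, 2.0]   # assumed price per time step
PENALTY = 10.0             # assumed cost if the load never runs
ACTIONS = [0, 1]           # 0 = off, 1 = on

batch = []
for t, price in enumerate(PRICES):
    for done in (0, 1):
        for a in ACTIONS:
            next_done = max(done, a)
            cost = price * a
            if t == len(PRICES) - 1 and not next_done:
                cost += PENALTY
            batch.append(((t, done), a, cost, (t + 1, next_done)))

q = fitted_q_iteration(batch, ACTIONS, n_iterations=len(PRICES))
# Greedy policy: pick the action with the lowest estimated cost-to-go.
greedy = {s: min(ACTIONS, key=lambda a: q(s, a)) for s, _, _, _ in batch}
```

In this toy setting the greedy policy waits during the expensive first step and switches the load on at the cheapest one. The sketch assumes a fully observed state; the partial observability that the paper addresses with state-time features is not modeled here.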

Pages: 3259-3269

Page count: 11
