Deep Reinforcement Learning for Strategic Bidding in Electricity Markets

Cited by: 235

Authors:

Ye, Yujian [1,2]

Qiu, Dawei [2]

Sun, Mingyang [2]

Papadaskalopoulos, Dimitrios [2]

Strbac, Goran [2]

Affiliations:

[1] St Johns Innovat Ctr, Fetch AI, Cambridge CB4 0WS, England

[2] Imperial Coll London, Dept Elect & Elect Engn, London SW7 2AZ, England

Source:

IEEE TRANSACTIONS ON SMART GRID | 2020 / Vol. 11 / Issue 02

Funding:

EU Horizon 2020; UK Engineering and Physical Sciences Research Council;

Keywords:

Bi-level optimization; deep neural networks; deep reinforcement learning; electricity markets; strategic bidding; unit commitment; POWER;

DOI:

10.1109/TSG.2019.2936142

Chinese Library Classification (CLC):

TM [Electrical Engineering]; TN [Electronic and Communication Technology];

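Discipline classification codes:

080906 [Electromagnetic Information Functional Materials and Structures]; 082806 [Agricultural Information and Electrical Engineering];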

Abstract:

Bi-level optimization and reinforcement learning (RL) constitute the state-of-the-art frameworks for modeling strategic bidding decisions in deregulated electricity markets. However, the former neglects the market participants' physical non-convex operating characteristics, while conventional RL methods require discretization of state and/or action spaces and thus suffer from the curse of dimensionality. This paper proposes a novel deep reinforcement learning (DRL) based methodology, combining a deep deterministic policy gradient (DDPG) method with a prioritized experience replay (PER) strategy. This approach sets up the problem in multi-dimensional continuous state and action spaces, enabling market participants to receive accurate feedback regarding the impact of their bidding decisions on the market clearing outcome, and devise more profitable bidding decisions by exploiting the entire action domain, also accounting for the effect of non-convex operating characteristics. Case studies demonstrate that the proposed methodology achieves a significantly higher profit than the alternative state-of-the-art methods, and exhibits a more favourable computational performance than benchmark RL methods due to the employment of the PER strategy.
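The prioritized experience replay (PER) strategy named in the abstract can be illustrated with a minimal sketch. This record does not give the paper's PER variant or hyperparameters, so the code below assumes the commonly used proportional scheme, in which a transition is sampled with probability proportional to its priority raised to a power alpha, and importance-sampling weights correct the resulting bias. The class name, method names, and the default values of alpha, beta, and eps are all hypothetical choices for illustration, not the authors' implementation.

```python
import random


class PrioritizedReplayBuffer:
    """Proportional prioritized experience replay (illustrative sketch only)."""

    def __init__(self, capacity, alpha=0.6, eps=1e-6):
        self.capacity = capacity
        self.alpha = alpha      # assumed default; 0 would give uniform sampling
        self.eps = eps          # keeps every priority strictly positive
        self.data = []
        self.priorities = []
        self.pos = 0            # next slot to overwrite once the buffer is full

    def add(self, transition):
        # New transitions receive the current maximum priority,
        # which makes them likely to be sampled before being re-scored.
        p = max(self.priorities, default=1.0)
        if len(self.data) < self.capacity:
            self.data.append(transition)
            self.priorities.append(p)
        else:
            self.data[self.pos] = transition
            self.priorities[self.pos] = p
        self.pos = (self.pos + 1) % self.capacity

    def sample(self, batch_size, beta=0.4, rng=random):
        # Sampling probability is proportional to priority ** alpha.
        scaled = [p ** self.alpha for p in self.priorities]
        total = sum(scaled)
        probs = [s / total for s in scaled]
        idxs = rng.choices(range(len(self.data)), weights=probs, k=batch_size)
        # Importance-sampling weights, normalized by the largest weight in the batch.
        n = len(self.data)
        weights = [(n * probs[i]) ** (-beta) for i in idxs]
        max_w = max(weights)
        weights = [w / max_w for w in weights]
        return idxs, [self.data[i] for i in idxs], weights

    def update_priorities(self, idxs, td_errors):
        # After a learning step, priority becomes the absolute TD error plus eps.
        for i, err in zip(idxs, td_errors):
            self.priorities[i] = abs(err) + self.eps
```

In a DDPG-style training loop, the critic's temporal-difference errors for a sampled batch would be passed back through `update_priorities`, so that transitions with larger errors are replayed more often.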

Pages: 1343-1355

Page count: 13
