Magnetic control of tokamak plasmas through deep reinforcement learning

Cited by: 433
Authors
Degrave, Jonas [1]
Felici, Federico [2]
Buchli, Jonas [1]
Neunert, Michael [1]
Tracey, Brendan [1]
Carpanese, Francesco [1,2]
Ewalds, Timo [1]
Hafner, Roland [1]
Abdolmaleki, Abbas [1]
de las Casas, Diego [1]
Donner, Craig [1]
Fritz, Leslie [1]
Galperti, Cristian [2]
Huber, Andrea [1]
Keeling, James [1]
Tsimpoukelli, Maria [1]
Kay, Jackie [1]
Merle, Antoine [2]
Moret, Jean-Marc [2]
Noury, Seb [1]
Pesamosca, Federico [2]
Pfau, David [1]
Sauter, Olivier [2]
Sommariva, Cristian [2]
Coda, Stefano [2]
Duval, Basil [2]
Fasoli, Ambrogio [2]
Kohli, Pushmeet [1]
Kavukcuoglu, Koray [1]
Hassabis, Demis [1]
Riedmiller, Martin [1]
Affiliations
[1] DeepMind, London, England
[2] Ecole Polytech Fed Lausanne, Swiss Plasma Ctr, Lausanne, Switzerland
Funding
Swiss National Science Foundation
DOI
10.1038/s41586-021-04301-9
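Chinese Library Classification
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences]
Discipline classification codes
07; 0710; 09
Abstract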
Nuclear fusion using magnetic confinement, in particular in the tokamak configuration, is a promising path towards sustainable energy. A core challenge is to shape and maintain a high-temperature plasma within the tokamak vessel. This requires high-dimensional, high-frequency, closed-loop control using magnetic actuator coils, further complicated by the diverse requirements across a wide range of plasma configurations. In this work, we introduce a previously undescribed architecture for tokamak magnetic controller design that autonomously learns to command the full set of control coils. This architecture meets control objectives specified at a high level, at the same time satisfying physical and operational constraints. This approach has unprecedented flexibility and generality in problem specification and yields a notable reduction in design effort to produce new plasma configurations. We successfully produce and control a diverse set of plasma configurations on the Tokamak à Configuration Variable (TCV) (1,2), including elongated, conventional shapes, as well as advanced configurations, such as negative triangularity and 'snowflake' configurations. Our approach achieves accurate tracking of the location, current and shape for these configurations. We also demonstrate sustained 'droplets' on TCV, in which two separate plasmas are maintained simultaneously within the vessel. This represents a notable advance for tokamak feedback control, showing the potential of reinforcement learning to accelerate research in the fusion domain, and is one of the most challenging real-world systems to which reinforcement learning has been applied.
Pages: 414 / +
Page count: 20