High Resilience Decision-Making Method of Active Distribution Network Based on Deep Reinforcement Learning
Citation: LUO Xiner, DU Jinqiao, TIAN Jie, LIU Andi, WANG Biao, LI Yan, WANG Shaorong. High Resilience Decision-Making Method of Active Distribution Network Based on Deep Reinforcement Learning[J]. Southern Power System Technology, 2022(1).
Authors: LUO Xiner  DU Jinqiao  TIAN Jie  LIU Andi  WANG Biao  LI Yan  WANG Shaorong
Affiliation: (Shenzhen Power Supply Co., Ltd., Shenzhen, Guangdong 518001, China; State Key Laboratory of Advanced Electromagnetic Engineering and Technology, Huazhong University of Science and Technology, Wuhan 430074, China)
Funding: National Key R&D Program of China (2017YFB0902800).
Abstract: As extreme weather events occur with increasing frequency worldwide, research on the resilience of power systems under extreme natural disasters has attracted growing attention. This paper proposes a high resilience decision-making method based on deep reinforcement learning (DRL). The operation states of the distribution network and the fault states of its lines under extreme disasters are taken as the observation state set; given the current observed environment state, a self-learning agent seeks feasible decision strategies and takes actions, and a return function is defined for the agent to evaluate those actions. Using the observed state data, DRL training based on the dueling deep Q network (DDQN) is carried out: the agent selects actions through trial-and-error learning, and the trial-and-error experience is stored in the value function Q matrix, realizing a nonlinear mapping from states to real-time fault recovery strategies for the active distribution network. Finally, taking an improved IEEE 33-node system as a typical case, random fault scenarios are simulated by the Monte Carlo method, and the stochastic fault recovery decisions generated by the proposed method are analyzed. The case study shows that coordinated, optimized control of the distributed generation, tie switches, and interruptible loads of the active distribution network can effectively improve the power supply capacity under extreme disasters.
Keywords: Markov decision process (MDP)  dueling deep Q network (DDQN)  deep reinforcement learning (DRL)  high resilience  distribution network
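
The abstract describes the DDQN-based training loop only at a high level. As a rough illustration of the technique it names, the following PyTorch sketch shows a dueling Q-network with epsilon-greedy trial-and-error action selection and an experience-replay update. Everything in it is an assumption for illustration, not the authors' implementation: the observation layout (33 bus load states plus 32 line fault flags), the 10-action switching set, the reward, all hyperparameters, and the omission of the separate target network used in standard DQN practice.

```python
# Minimal, hypothetical DDQN sketch; sizes, reward, and hyperparameters are
# illustrative assumptions, not the paper's implementation.
import random
from collections import deque

import torch
import torch.nn as nn

class DuelingDQN(nn.Module):
    """Dueling decomposition: Q(s,a) = V(s) + A(s,a) - mean_a' A(s,a')."""
    def __init__(self, n_obs: int, n_actions: int, hidden: int = 64):
        super().__init__()
        self.feature = nn.Sequential(nn.Linear(n_obs, hidden), nn.ReLU())
        self.value = nn.Linear(hidden, 1)              # state-value stream V(s)
        self.advantage = nn.Linear(hidden, n_actions)  # advantage stream A(s,a)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = self.feature(x)
        v, a = self.value(h), self.advantage(h)
        return v + a - a.mean(dim=-1, keepdim=True)

n_obs, n_actions = 65, 10          # assumed sizes for the 33-node case
policy_net = DuelingDQN(n_obs, n_actions)
optimizer = torch.optim.Adam(policy_net.parameters(), lr=1e-3)
replay = deque(maxlen=10_000)      # stores the trial-and-error experience
gamma, epsilon = 0.99, 0.1

def select_action(state: torch.Tensor) -> int:
    """Epsilon-greedy trial-and-error action selection."""
    if random.random() < epsilon:
        return random.randrange(n_actions)
    with torch.no_grad():
        return int(policy_net(state).argmax())

def train_step(batch_size: int = 32) -> None:
    """One temporal-difference update on a minibatch of stored experience."""
    if len(replay) < batch_size:
        return
    s, a, r, s2 = (torch.stack(x) for x in
                   zip(*random.sample(replay, batch_size)))
    q = policy_net(s).gather(1, a.long().view(-1, 1)).squeeze(1)
    with torch.no_grad():
        target = r + gamma * policy_net(s2).max(dim=1).values
    loss = nn.functional.mse_loss(q, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# Usage: after each simulated fault step, append (state, action, reward,
# next_state) tensors to `replay` and call train_step().
```

The dueling split into V(s) and A(s,a) is what distinguishes this architecture from a plain DQN: the agent can learn the value of a grid state independently of which switching action it takes there, which helps when many actions leave the restored load largely unchanged.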