Multi-time-scale Online Optimization for Reactive Power of Distribution Network Based on Deep Reinforcement Learning
Citation: NI Shuang, CUI Chenggang, YANG Ning, CHEN Hui, XI Peifeng, LI Zhenkun. Multi-time-scale Online Optimization for Reactive Power of Distribution Network Based on Deep Reinforcement Learning[J]. Automation of Electric Power Systems, 2021, 45(10): 77-85.
Authors: NI Shuang  CUI Chenggang  YANG Ning  CHEN Hui  XI Peifeng  LI Zhenkun
Affiliation: 1. College of Automation Engineering, Shanghai University of Electric Power, Shanghai 200090, China; 2. Shanghai Key Laboratory of Smart Grid Demand Response, Shanghai 200063, China; 3. College of Electrical Engineering, Shanghai University of Electric Power, Shanghai 200090, China
Funding: Young Scientists Fund of the National Natural Science Foundation of China (51607111)
Abstract: Distribution networks with distributed generators suffer from inaccurate power flow modeling, poor communication conditions, and difficulty in coordinating various reactive power compensation equipment, all of which pose challenges for online reactive power optimization. This paper proposes a multi-time-scale online optimal operation scheme for reactive power of the distribution network based on deep reinforcement learning (DRL). The scheme converts the online reactive power optimization problem into a Markov decision process (MDP). Because different reactive power compensation devices adjust at different speeds, two time scales are designed to dispatch the discretely adjustable and the continuously adjustable equipment separately. The scheme tracks the state of the distribution network in real time and makes online decisions for the reactive power regulation equipment without relying on an accurate power flow model, so it is suitable for partially observable distribution networks with complex, changeable conditions and poor communication. Finally, a numerical case study verifies the effectiveness and robustness of the proposed method.

Keywords: distribution network  deep reinforcement learning (DRL)  Markov decision process (MDP)  network loss  multi-time-scale optimization for reactive power
Received: 2020-08-30
Revised: 2020-11-27
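
To make the multi-time-scale scheme concrete, the following minimal Python sketch (ours, not the authors' implementation) shows the kind of decision loop the abstract describes: a slow agent resets discrete equipment (e.g., a capacitor bank tap) every K steps, while a fast agent adjusts continuous equipment (e.g., inverter or SVC reactive output) at every step, with the MDP reward taken as the negative network loss. All names, constants, and dynamics here (observe, network_loss, slow_policy, fast_policy, K) are hypothetical placeholders; a real implementation would train DRL policies against measurements or a simulator rather than use random stand-ins.

import numpy as np

rng = np.random.default_rng(0)

N_CAP_STEPS = 5   # discrete positions of a capacitor bank (hypothetical)
Q_MAX = 1.0       # continuous reactive-power limit in p.u. (hypothetical)
K = 4             # fast steps per slow step: the two time scales

def observe():
    # Partial observation of the feeder: a few local measurements.
    return rng.normal(size=3)

def network_loss(cap_tap, q_cont, obs):
    # Placeholder surrogate; the proposed scheme measures loss rather
    # than computing it from an accurate power flow model.
    return float((obs ** 2).sum() + 0.1 * (cap_tap - 2) ** 2
                 + 0.05 * (q_cont - obs[0]) ** 2)

def slow_policy(obs):
    # Stand-in for a trained discrete-action DRL policy (e.g., DQN-like).
    return int(rng.integers(N_CAP_STEPS))

def fast_policy(obs, cap_tap):
    # Stand-in for a trained continuous-action DRL policy (e.g., DDPG-like);
    # a real policy would also condition on the current discrete setting.
    return float(np.clip(-0.5 * obs[0], -Q_MAX, Q_MAX))

cap_tap = 0
for t in range(12):
    obs = observe()
    if t % K == 0:                       # slow time scale: discrete equipment
        cap_tap = slow_policy(obs)
    q_cont = fast_policy(obs, cap_tap)   # fast time scale: continuous equipment
    reward = -network_loss(cap_tap, q_cont, obs)  # MDP reward: negative loss
    print(f"t={t:2d}  tap={cap_tap}  q={q_cont:+.3f}  reward={reward:+.3f}")

The design point mirrored here is the separation of time scales: slowly switched discrete devices are dispatched by one policy at a coarse interval, and fast continuous devices compensate between switchings.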

This article is indexed in CNKI, Wanfang Data, and other databases.