Multi-time-scale Online Optimization for Reactive Power of Distribution Network Based on Deep Reinforcement Learning
Citation: NI Shuang, CUI Chenggang, YANG Ning, CHEN Hui, XI Peifeng, LI Zhenkun. Multi-time-scale Online Optimization for Reactive Power of Distribution Network Based on Deep Reinforcement Learning[J]. Automation of Electric Power Systems, 2021, 45(10): 77-85.
Authors: NI Shuang  CUI Chenggang  YANG Ning  CHEN Hui  XI Peifeng  LI Zhenkun
Affiliation: 1. College of Automation Engineering, Shanghai University of Electric Power, Shanghai 200090, China; 2. Shanghai Key Laboratory of Smart Grid Demand Response, Shanghai 200063, China; 3. College of Electrical Engineering, Shanghai University of Electric Power, Shanghai 200090, China
Funding: Young Scientists Fund of the National Natural Science Foundation of China (51607111)
Abstract: Distribution networks with distributed generators suffer from inaccurate power flow modeling, poor communication conditions, and difficulty in coordinating various reactive power compensation equipment, all of which pose challenges for online reactive power optimization. This paper proposes a multi-time-scale online optimal operation scheme for reactive power of the distribution network based on deep reinforcement learning (DRL). The scheme converts the online reactive power optimization problem into a Markov decision process (MDP). Because different reactive power compensation devices adjust at different speeds, two time scales are designed to dispatch the discretely adjustable and the continuously adjustable equipment separately. The scheme tracks the state of the distribution network in real time and makes online decisions for the reactive power regulation equipment without relying on an accurate power flow model, so it is suitable for partially observable distribution networks with complex, changeable conditions and poor communication. Finally, a numerical case study verifies the effectiveness and robustness of the proposed method.

Keywords: distribution network  deep reinforcement learning (DRL)  Markov decision process (MDP)  network loss  multi-time-scale optimization for reactive power
Received: 2020-08-30
Revised: 2020-11-27
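
To make the multi-time-scale scheme concrete, the following minimal Python sketch (ours, not the authors' implementation) shows the kind of decision loop the abstract describes: a slow agent resets discrete equipment (e.g., a capacitor bank tap) every K steps, while a fast agent adjusts continuous equipment (e.g., inverter or SVC reactive output) at every step, with the MDP reward taken as the negative network loss. All names, constants, and dynamics here (observe, network_loss, slow_policy, fast_policy, K) are hypothetical placeholders; a real implementation would train DRL policies against measurements or a simulator rather than use random stand-ins.

import numpy as np

rng = np.random.default_rng(0)

N_CAP_STEPS = 5   # discrete positions of a capacitor bank (hypothetical)
Q_MAX = 1.0       # continuous reactive-power limit in p.u. (hypothetical)
K = 4             # fast steps per slow step: the two time scales

def observe():
    # Partial observation of the feeder: a few local measurements.
    return rng.normal(size=3)

def network_loss(cap_tap, q_cont, obs):
    # Placeholder surrogate; the proposed scheme measures loss rather
    # than computing it from an accurate power flow model.
    return float((obs ** 2).sum() + 0.1 * (cap_tap - 2) ** 2
                 + 0.05 * (q_cont - obs[0]) ** 2)

def slow_policy(obs):
    # Stand-in for a trained discrete-action DRL policy (e.g., DQN-like).
    return int(rng.integers(N_CAP_STEPS))

def fast_policy(obs, cap_tap):
    # Stand-in for a trained continuous-action DRL policy (e.g., DDPG-like);
    # a real policy would also condition on the current discrete setting.
    return float(np.clip(-0.5 * obs[0], -Q_MAX, Q_MAX))

cap_tap = 0
for t in range(12):
    obs = observe()
    if t % K == 0:                       # slow time scale: discrete equipment
        cap_tap = slow_policy(obs)
    q_cont = fast_policy(obs, cap_tap)   # fast time scale: continuous equipment
    reward = -network_loss(cap_tap, q_cont, obs)  # MDP reward: negative loss
    print(f"t={t:2d}  tap={cap_tap}  q={q_cont:+.3f}  reward={reward:+.3f}")

The design point mirrored here is the separation of time scales: slowly switched discrete devices are dispatched by one policy at a coarse interval, and fast continuous devices compensate between switchings.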

This article is indexed in CNKI, Wanfang Data, and other databases.