Deep reinforcement learning-based grid mind and field demonstration application


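Cite this article:	XU Chunlei, WU Haiwei, DIAO Ruisheng, HU Xunhui, LI Lei, SHI Di. Deep reinforcement learning-based grid mind and field demonstration application[J]. Power Demand Side Management, 2021, 23(4): 73-78.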

Authors:	XU Chunlei, WU Haiwei, DIAO Ruisheng, HU Xunhui, LI Lei, SHI Di

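Affiliations:	State Grid Jiangsu Electric Power Co., Ltd., Nanjing 210024; Zhibo Energy Technology (Jiangsu) Co., Ltd., Nanjing 211302; NARI Technology Co., Ltd., Nanjing 211106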

Funding:	Science and Technology Project of State Grid Jiangsu Electric Power Co., Ltd. (J2020058)

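Abstract:	As the penetration of renewable energy and power electronic devices continues to grow, and with high-capacity hybrid AC/DC interconnection, the dynamics, stochasticity, and uncertainty of the power grid have increased markedly, posing new challenges to the secure and stable operation of power systems. To resolve more effectively the security problems caused by fast fluctuations in voltage and power flow, this paper proposes a decision-support method for smart grid dispatch and control based on a maximum-entropy deep reinforcement learning algorithm. The method considers multiple control objectives simultaneously and performs online optimal control of the grid operating condition. It models grid dispatch and control decisions as a Markov decision process, trains multi-threaded agents, and uses a periodic online training mechanism to keep improving the agents' control performance. The decision-support prototype software developed from this method is deployed at the State Grid Jiangsu Electric Power dispatch and control center, where it interacts directly with the grid dispatch and control system environment, learns autonomously, and continuously improves the agents' decision-making capability. A trained agent can produce effective control strategies within milliseconds for combined control objectives covering voltage violations, tie-line power flow violations, and network losses.
Keywords:	artificial intelligence; intelligent dispatch and control; deep reinforcement learning; power grid security
Received:	2021/3/2
Revised:	2021/5/30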
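
For orientation, the maximum-entropy objective that soft actor-critic style algorithms optimize is usually written as below. This is the standard textbook form, not a formula quoted from the paper; the symbols (policy \pi, reward r, temperature \alpha, entropy \mathcal{H}, state-action distribution \rho_\pi) are assumptions introduced here, and the paper's exact formulation may differ.

```latex
J(\pi) = \sum_{t=0}^{T} \mathbb{E}_{(s_t, a_t) \sim \rho_\pi}
  \left[ r(s_t, a_t) + \alpha \, \mathcal{H}\big(\pi(\cdot \mid s_t)\big) \right],
\qquad
\mathcal{H}\big(\pi(\cdot \mid s_t)\big) = - \mathbb{E}_{a \sim \pi(\cdot \mid s_t)} \left[ \log \pi(a \mid s_t) \right]
```

Here the temperature \alpha trades off reward against policy entropy; a larger \alpha encourages more exploratory control actions.
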
Deep reinforcement learning-based grid mind and field demonstration application

XU Chunlei, WU Haiwei, DIAO Ruisheng, HU Xunhui, LI Lei, SHI Di. Deep reinforcement learning-based grid mind and field demonstration application[J]. Power Demand Side Management, 2021, 23(4): 73-78.

Authors:	XU Chunlei, WU Haiwei, DIAO Ruisheng, HU Xunhui, LI Lei, SHI Di

Affiliation:	State Grid Jiangsu Electric Power Co., Ltd., Nanjing 210024, China; Zhibo Energy Technology Co., Ltd., Nanjing 211302, China; NARI Group Co., Ltd., Nanjing 211106, China

Abstract:	With the increasing penetration of renewable energy and power electronics-based devices, and the hybrid operation of AC/DC power networks with heavy power transfer, the dynamics, stochasticity, and uncertainty of the power grid are increasing markedly, threatening its secure operation. To effectively resolve security issues caused by fast variations of voltage and line flows, a method based on a maximum-entropy deep reinforcement learning algorithm is presented for providing online decision support in smart grid operation, which can consider multiple control objectives simultaneously. The method formulates grid operation decision-making as a Markov decision process, trains multi-threaded soft actor-critic agents, and uses a periodic online training mechanism to continuously improve their control performance. A prototype developed using this method has been deployed in the control center of SGCC Jiangsu Electric Power Company, where it interacts with the live energy management system and learns its control policy adaptively. The well-trained agent can provide effective control actions within milliseconds to address voltage violations, tie-line flow violations, and network losses.
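
The abstract names three control objectives (voltage violations, tie-line flow violations, and network losses) but does not give the reward function. The following is a minimal sketch of how such objectives might be folded into a single scalar reward for a Markov decision process. The function name `grid_reward`, the 0.95 to 1.05 p.u. voltage band, the weights, and the normalization of losses by load are all illustrative assumptions, not the authors' design.

```python
from typing import Sequence


def grid_reward(
    voltages_pu: Sequence[float],
    tie_flows_mva: Sequence[float],
    tie_limits_mva: Sequence[float],
    loss_mw: float,
    load_mw: float,
    v_min: float = 0.95,   # assumed lower voltage limit (p.u.)
    v_max: float = 1.05,   # assumed upper voltage limit (p.u.)
    w_voltage: float = 1.0,  # hypothetical objective weights
    w_flow: float = 1.0,
    w_loss: float = 1.0,
) -> float:
    """Hypothetical multi-objective reward; higher (closer to 0) is better."""
    # Total distance of bus voltages outside the allowed band.
    v_pen = sum(max(v_min - v, 0.0) + max(v - v_max, 0.0) for v in voltages_pu)
    # Total fractional overload of tie-lines beyond their limits.
    f_pen = sum(
        max(abs(f) / lim - 1.0, 0.0)
        for f, lim in zip(tie_flows_mva, tie_limits_mva)
    )
    # Network losses as a fraction of total load (assumes load_mw > 0).
    loss_frac = loss_mw / load_mw
    return -(w_voltage * v_pen + w_flow * f_pen + w_loss * loss_frac)


if __name__ == "__main__":
    secure = grid_reward([1.00, 1.02], [50.0], [100.0], loss_mw=2.0, load_mw=100.0)
    insecure = grid_reward([0.90, 1.02], [120.0], [100.0], loss_mw=2.0, load_mw=100.0)
    print(secure, insecure)  # the insecure state scores lower
```

A weighted sum is only one way to combine objectives; a deployed controller might instead treat violations as hard constraints or use tiered rewards.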

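Keywords:	artificial intelligence; intelligent dispatch and control; deep reinforcement learning; power grid security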
This article is indexed in Wanfang Data and other databases.