Real-time optimal scheduling for microgrid systems based on distributed deep reinforcement learning

Cite this article: GUO Fang-hong, HE Tong, WU Xiang, DONG Hui, LIU Bing. Real-time optimal scheduling for microgrid systems based on distributed deep reinforcement learning[J]. Control Theory & Applications, 2022, 39(10): 1881-1889.
Authors: GUO Fang-hong  HE Tong  WU Xiang  DONG Hui  LIU Bing
Affiliation: Zhejiang University of Technology
Foundation items: Young Scientists Fund of the National Natural Science Foundation of China (61903333); Zhejiang Province "Qianjiang Talent" Program, specially and urgently needed category (QJD1902010)
Received: 2021-09-30
Revised: 2022-09-16

Abstract: As massive renewable energy sources are integrated into the microgrid, the parameter space of the microgrid system model grows multiplicatively and the computation required for its energy-optimal scheduling keeps increasing. At the same time, the uncertainty of renewable generation output poses a further challenge to the optimal scheduling of the microgrid. To address these problems, this paper proposes a real-time optimal scheduling strategy for microgrids based on distributed deep reinforcement learning. First, under a distributed architecture, the main grid and each distributed generator are treated as independent agents. Second, each agent holds a local learning model and builds its state and action spaces from its local data, and a multi-objective reward function with constraints is designed that accounts for generation cost, transaction price, equipment service life, and other factors. Finally, each agent seeks its locally optimal policy by interacting with the environment, while agents learn value-network parameters from one another to improve local action selection, so as to minimize the overall operating cost of the microgrid system. Simulation results show that, compared with the deep deterministic policy gradient (DDPG) algorithm, the proposed method improves training speed by 17.6% and reduces the cost function value by 67% while guaranteeing system stability and solution accuracy, thereby achieving real-time optimal scheduling of the microgrid.

Keywords: deep reinforcement learning; distributed optimization; microgrid; optimal scheduling; optimization algorithm
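
The paper itself provides no code. Purely as an illustrative sketch of two mechanisms the abstract describes, the PyTorch snippet below shows (a) a per-agent multi-objective reward combining generation cost, grid transaction cost, and an equipment-degradation penalty, and (b) a step in which agents blend their value-network (critic) parameters with one another. All names (Critic, local_reward, share_value_parameters), the network sizes, and the blending rule are assumptions made for illustration, not the authors' implementation.

# Hypothetical sketch (not the paper's code): per-agent reward shaping and
# value-network parameter sharing for a multi-agent DDPG-style scheduler.
import torch
import torch.nn as nn


class Critic(nn.Module):
    """Local value (Q) network of one agent: Q(state, action) -> scalar."""

    def __init__(self, state_dim: int, action_dim: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, state: torch.Tensor, action: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([state, action], dim=-1))


def local_reward(gen_cost, trade_price, grid_power, degradation_cost):
    """Multi-objective reward: negative sum of generation cost, grid
    transaction cost (cost when buying, grid_power > 0; revenue when selling,
    grid_power < 0), and a lifetime-degradation penalty. All terms and their
    units are illustrative assumptions."""
    transaction_cost = trade_price * grid_power
    return -(gen_cost + transaction_cost + degradation_cost)


def share_value_parameters(critics, mix: float = 0.1):
    """Softly blend each critic's parameters toward the average over all
    agents' critics, mimicking mutual learning of value-network parameters."""
    with torch.no_grad():
        avg = {k: torch.stack([c.state_dict()[k].float() for c in critics]).mean(0)
               for k in critics[0].state_dict()}
        for c in critics:
            for k, p in c.state_dict().items():
                p.copy_((1 - mix) * p + mix * avg[k])


if __name__ == "__main__":
    critics = [Critic(state_dim=6, action_dim=2) for _ in range(4)]  # 4 agents
    share_value_parameters(critics)
    print(local_reward(gen_cost=12.0, trade_price=0.8, grid_power=3.0,
                       degradation_cost=0.5))

In the paper's setting each agent would additionally hold an actor network and interact with the microgrid environment under the designed constraints; the blending rate mix above merely stands in for whatever mutual-learning rule the authors adopt.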