
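Real-time Autonomous Optimal Energy Management Strategy for Residents Based on Deep Reinforcement Learning
Cite this article: YE Yujian, WANG Huiyu, TANG Yi, Goran STRBAC. Real-time Autonomous Optimal Energy Management Strategy for Residents Based on Deep Reinforcement Learning[J]. Automation of Electric Power Systems, 2022, 46(1): 110-119.
Authors: YE Yujian  WANG Huiyu  TANG Yi  Goran STRBAC
Affiliations: 1. School of Electrical Engineering, Southeast University, Nanjing 210096, Jiangsu Province, China; 2. Department of Electrical and Electronic Engineering, Imperial College London, London SW7 2AZ, UK
Funding: National Natural Science Foundation of China (51877037).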
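Abstract: With the spread of residential distributed resources, how to account for the operating characteristics of a user's many types of devices and meet the need for real-time autonomous energy management, so as to achieve the best economic outcome on the user side, has become a pressing problem. Conventional model-based optimization methods are limited in building accurate models and in coping with multiple uncertainties. This paper therefore proposes a model-free real-time autonomous energy management optimization method based on deep reinforcement learning. First, the user's devices are classified, their operating characteristics are described with a unified three-element tuple, and the corresponding energy management actions are determined. Next, a long short-term memory neural network extracts the future trends of the multi-source time-series data in the environment state. Then, based on the proximal policy optimization algorithm, the optimal energy management policy is learned efficiently in a multi-dimensional continuous-discrete mixed action space, minimizing electricity cost while improving the policy's adaptability to uncertainty. Finally, the proposed method is validated by comparing its optimization decisions with those of existing methods in a real-world scenario.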

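Keywords: real-time autonomous energy management optimization  uncertainties  continuous-discrete mixed actions  long short-term memory neural network  deep reinforcement learning
Received: 2021/6/28
Revised: 2021/10/12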

Real-time Autonomous Optimal Energy Management Strategy for Residents Based on Deep Reinforcement Learning
YE Yujian, WANG Huiyu, TANG Yi, Goran STRBAC. Real-time Autonomous Optimal Energy Management Strategy for Residents Based on Deep Reinforcement Learning[J]. Automation of Electric Power Systems, 2022, 46(1): 110-119.
Authors: YE Yujian  WANG Huiyu  TANG Yi  Goran STRBAC
Affiliation: 1. School of Electrical Engineering, Southeast University, Nanjing 210096, China; 2. Department of Electrical and Electronic Engineering, Imperial College London, London SW7 2AZ, UK
Abstract: With the wide proliferation of distributed energy resources in the residential sector, meeting the need for real-time autonomous energy management while accounting for the heterogeneous operating characteristics of these resources, so as to maximize utility for residential end-users, deserves significant research attention. Conventional model-based optimization methods in this area are generally burdened by inaccurate system modeling and an inability to deal efficiently with uncertainties stemming from multiple sources. To address these challenges, this paper proposes a model-free method based on deep reinforcement learning for real-time autonomous energy management optimization. First, the user's resources are classified into categories, their operating characteristics are described using a unified 3-element tuple, and the associated energy management actions are identified. Next, a long short-term memory neural network is employed to extract the future trends of multi-source sequential data from the environment states. Then, the proximal policy optimization algorithm is used to learn optimal energy management policies efficiently in the multi-dimensional continuous-discrete mixed action space; the learned policies adapt to system uncertainties while pursuing the objective of minimizing the user's electricity cost. Finally, the effectiveness of the proposed method is verified by benchmarking its performance against several existing methods through case studies on an actual scenario.
Keywords:real-time autonomous energy management optimization  uncertainties  continuous-discrete mixed actions  long short-term memory neural network  deep reinforcement learning
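
The abstract mentions learning with proximal policy optimization over a continuous-discrete mixed action space. The sketch below illustrates one common way to do this: it is not taken from the paper. It assumes a factorized policy in which continuous actions (for example, a charging power) follow independent Gaussians and discrete actions (for example, an on/off switch) follow independent Bernoulli distributions, so the joint log-probability is their sum. All function and parameter names here are hypothetical.

```python
import numpy as np

def gaussian_log_prob(x, mean, std):
    """Log-density of independent Gaussians, summed over continuous action dims."""
    var = std ** 2
    return np.sum(-0.5 * ((x - mean) ** 2 / var + np.log(2.0 * np.pi * var)), axis=-1)

def bernoulli_log_prob(b, p_on):
    """Log-probability of independent on/off choices, summed over discrete action dims."""
    p_on = np.clip(p_on, 1e-8, 1.0 - 1e-8)  # guard against log(0)
    return np.sum(b * np.log(p_on) + (1 - b) * np.log(1.0 - p_on), axis=-1)

def hybrid_log_prob(cont_action, disc_action, mean, std, p_on):
    """Joint log-probability of a mixed action under the assumed factorized policy."""
    return gaussian_log_prob(cont_action, mean, std) + bernoulli_log_prob(disc_action, p_on)

def ppo_clip_objective(new_log_prob, old_log_prob, advantage, clip_eps=0.2):
    """Standard PPO clipped surrogate objective (to be maximized), averaged over samples."""
    ratio = np.exp(new_log_prob - old_log_prob)
    unclipped = ratio * advantage
    clipped = np.clip(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantage
    return np.mean(np.minimum(unclipped, clipped))
```

Because the joint log-probability is a single scalar per sample, the same clipped objective applies unchanged whether the action is continuous, discrete, or mixed; only `hybrid_log_prob` needs to know the action structure.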