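Decision Optimization Model of Incentive Demand Response Based on Deep Reinforcement Learning
Cite this article: XU Hongsheng, LU Jixiang, YANG Zhihong, LI Yun, LU Jinjun, HUANG Hua. Decision Optimization Model of Incentive Demand Response Based on Deep Reinforcement Learning[J]. Automation of Electric Power Systems, 2021, 45(14): 97-103.
Authors: XU Hongsheng  LU Jixiang  YANG Zhihong  LI Yun  LU Jinjun  HUANG Hua
Affiliations: 1. NARI Group Corporation (State Grid Electric Power Research Institute), Nanjing 211106, Jiangsu Province, China; 2. State Key Laboratory of Smart Grid Protection and Control, Nanjing 211106, Jiangsu Province, China
Funding: National Key R&D Program of China (No. 2018AAA0101504); Science and Technology Project of State Grid Corporation of China (No. 5700-202019364A-0-0-00).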
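Abstract: As China's electricity market reform advances, the retail-side market is gradually opening up, and electricity retailers can aggregate large numbers of dispersed loads to take part in demand response in a market environment. This paper proposes a deep-reinforcement-learning-based modeling and solution method for incentive-based demand response whose objective is to maximize the combined profit of the retailer and its customers. First, demand response models for the retailer and the customers are established; by introducing a time-price elasticity, the existing customer response model is improved so that it accounts for customers' reaction to the difference in subsidy price between adjacent periods. Next, a subsidy price decision optimization model is built within the Markov decision process framework, and a solution algorithm based on a deep Q-learning network is designed. Finally, a simulation is carried out with one retailer and three customers of different types. By analyzing the convergence of the algorithm and comparing the optimization results under different models and parameters, the paper verifies the rationality of the improved model and the effectiveness of the generated strategy, and analyzes the impact of incentive-based demand response on the retailer and the customers.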

Keywords: incentive-based demand response  price elasticity coefficient  deep reinforcement learning  deep Q-learning network
Received: 2020/2/8
Revised: 2021/1/19

Decision Optimization Model of Incentive Demand Response Based on Deep Reinforcement Learning
XU Hongsheng,LU Jixiang,YANG Zhihong,LI Yun,LU Jinjun,HUANG Hua.Decision Optimization Model of Incentive Demand Response Based on Deep Reinforcement Learning[J].Automation of Electric Power Systems,2021,45(14):97-103.
Authors:XU Hongsheng  LU Jixiang  YANG Zhihong  LI Yun  LU Jinjun  HUANG Hua
Affiliation: 1. NARI Group Corporation (State Grid Electric Power Research Institute), Nanjing 211106, China; 2. State Key Laboratory of Smart Grid Protection and Control, Nanjing 211106, China
Abstract: With the advancement of electricity market reform in China, the retail-side electricity market is gradually opening up, and electricity retailers can aggregate a large number of distributed loads to participate in demand response in the electricity market environment. This paper proposes modeling and solution methods for incentive-based demand response based on deep reinforcement learning, with the objective of maximizing the combined profit of the retailer and its customers. First, demand response models for the retailer and the customers are established. By introducing a time-price elasticity, the existing customer response model is improved to take into account the customer response to the subsidy price difference between adjacent periods. Then, based on the Markov decision process framework, an optimization model for the subsidy price decision is constructed, and a solution algorithm based on a deep Q-learning network is designed. Finally, a simulation is performed with one retailer and three customers of different types. By analyzing the convergence of the algorithm and comparing the optimization results under different models and parameters, the rationality of the improved model and the effectiveness of the generated strategy are verified, and the impact of incentive-based demand response on the retailer and the customers is analyzed.
Keywords: incentive-based demand response  price elasticity coefficient  deep reinforcement learning  deep Q-learning network
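
The abstract gives no formulas, so the following is only a rough illustration of the kind of setup it describes: a customer response that depends on both the subsidy price and its difference from the previous period, and a Markov decision process over subsidy prices solved by Q-learning. Everything concrete here is an assumption made for the sketch, not the paper's model: the linear response form, the coefficients `E_SELF` and `E_TIME`, the quadratic discomfort cost, the price grid `PRICES`, and the per-period values in `WHOLESALE`. It also uses a plain tabular Q-table as a stand-in for the deep Q-learning network named in the abstract.

```python
import random

# All numbers below are hypothetical, chosen only to make the sketch runnable.
PRICES = [0.0, 0.2, 0.4, 0.6]      # candidate subsidy prices (actions)
WHOLESALE = [0.5, 0.9, 1.2, 0.7]   # assumed avoided purchase cost in each period
BASE_LOAD = 10.0                   # assumed baseline load per period
E_SELF, E_TIME = 0.5, 0.2          # assumed self and time-price elasticity coefficients
DISCOMFORT = 0.05                  # assumed weight of the customer's discomfort cost


def response(price, prev_price):
    """Load reduction: one term in the current subsidy price, one term in the
    price difference between adjacent periods (assumed linear form)."""
    cut = BASE_LOAD * (E_SELF * price + E_TIME * (price - prev_price))
    return max(0.0, min(cut, BASE_LOAD))


def reward(t, price, prev_price):
    """Combined profit of retailer and customer in period t (assumed form)."""
    cut = response(price, prev_price)
    retailer = (WHOLESALE[t] - price) * cut            # saved purchase cost minus subsidy paid
    customer = price * cut - DISCOMFORT * cut ** 2     # subsidy received minus discomfort
    return retailer + customer


def train(episodes=5000, alpha=0.1, gamma=0.95, eps=0.1, seed=0):
    """Tabular Q-learning; the state is (period, index of the previous price)."""
    rng = random.Random(seed)
    n_periods, n_actions = len(WHOLESALE), len(PRICES)
    q = [[[0.0] * n_actions for _ in range(n_actions)] for _ in range(n_periods)]
    for _ in range(episodes):
        prev = 0  # each episode starts with no subsidy in the preceding period
        for t in range(n_periods):
            if rng.random() < eps:                     # epsilon-greedy exploration
                a = rng.randrange(n_actions)
            else:
                a = max(range(n_actions), key=lambda i: q[t][prev][i])
            r = reward(t, PRICES[a], PRICES[prev])
            future = max(q[t + 1][a]) if t + 1 < n_periods else 0.0
            q[t][prev][a] += alpha * (r + gamma * future - q[t][prev][a])
            prev = a
    return q


def greedy_policy(q):
    """Read the subsidy price schedule off the learned Q-table."""
    prev, plan = 0, []
    for t in range(len(q)):
        a = max(range(len(PRICES)), key=lambda i: q[t][prev][i])
        plan.append(PRICES[a])
        prev = a
    return plan


if __name__ == "__main__":
    print(greedy_policy(train()))
```

Including the previous price in the state is what lets the learner account for the adjacent-period price difference; replacing the table with a neural network that maps the state to one Q-value per price would move this sketch toward a deep Q-learning setup.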
This article is indexed in CNKI, Wanfang Data, and other databases.