Q-learning algorithm based method for enhancing resiliency of integrated energy system

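Cite this article:	WU Xi, TANG Ziyi, XU Qingshan, ZHOU Yizhou. Q-learning algorithm based method for enhancing resiliency of integrated energy system[J]. Electric Power Automation Equipment, 2020, 40(4): 146-152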

Authors:	WU Xi, TANG Ziyi, XU Qingshan, ZHOU Yizhou

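Affiliation:	School of Electrical Engineering, Southeast University, Nanjing 210096, Jiangsu, China (WU Xi, TANG Ziyi, XU Qingshan); College of Energy and Electrical Engineering, Hohai University, Nanjing 210098, Jiangsu, China (ZHOU Yizhou)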

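Funding:	Science and Technology Project of State Grid Corporation of China (SGJSJX00YJJS1800721); Key Program of the National Natural Science Foundation of China (51936003)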

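Abstract:	The stochastic dynamic optimization problem of an integrated energy system is formulated as a Markov decision process, and the Q-learning algorithm is introduced to solve this complex problem. To address the shortcomings of Q-learning, two improvements are made to the conventional algorithm: the initialization of the Q-value table is improved, and the upper confidence bound (UCB) algorithm is used for action selection. Simulation results show that Q-learning solves the problem while maintaining good convergence, that the improved initialization and the UCB algorithm significantly raise computational efficiency and drive the results to a better solution, and that Q-learning yields better optimization results than a conventional mixed integer linear programming model.
Keywords:	integrated energy system; islanded operation; Markov decision process; Q-learning algorithm; resilience
Received:	2019-04-11
Revised:	2019-12-10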
Q-learning algorithm based method for enhancing resiliency of integrated energy system

WU Xi, TANG Ziyi, XU Qingshan and ZHOU Yizhou. Q-learning algorithm based method for enhancing resiliency of integrated energy system[J]. Electric Power Automation Equipment, 2020, 40(4): 146-152

Authors:	WU Xi, TANG Ziyi, XU Qingshan, ZHOU Yizhou

Affiliation:	School of Electrical Engineering, Southeast University, Nanjing 210096, China; School of Electrical Engineering, Southeast University, Nanjing 210096, China; School of Electrical Engineering, Southeast University, Nanjing 210096, China; College of Energy and Electrical Engineering, Hohai University, Nanjing 210098, China

Abstract:	The stochastic dynamic optimization problem of an integrated energy system is modeled as a Markov decision process, and the Q-learning algorithm is introduced to solve this complex problem. To overcome the shortcomings of Q-learning, two improvements are made to the conventional algorithm: the Q-table initialization method is improved, and the upper confidence bound (UCB) algorithm is adopted for action selection. Simulation results show that Q-learning solves the problem while maintaining good convergence, and that the improved initialization method and the UCB algorithm significantly improve computational efficiency and make the results converge to a better solution. Moreover, compared with a conventional mixed integer linear programming model, the Q-learning algorithm achieves better optimization results.

Keywords:	integrated energy system; islanded operation; Markov decision process; Q-learning algorithm; resiliency
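
The abstract names two ingredients, tabular Q-learning on a Markov decision process and upper confidence bound (UCB) action selection, without giving their formulation. The following is a minimal sketch of how the two fit together, not the authors' model. Everything in it is an assumption made for illustration: the 4-state chain `toy_step` stands in for the energy-system dynamics, the constant `q_init` stands in for the improved Q-table initialization (which the abstract does not describe), and the hyperparameters `alpha`, `gamma`, and `c` are arbitrary.

```python
import math
import random


def ucb_action(q_row, counts_row, t, c=1.0):
    """Choose the action maximizing Q + c * sqrt(ln t / n); untried actions go first."""
    for a, n in enumerate(counts_row):
        if n == 0:
            return a
    scores = [q + c * math.sqrt(math.log(t) / n) for q, n in zip(q_row, counts_row)]
    return max(range(len(scores)), key=scores.__getitem__)


def toy_step(s, a, rng):
    """Hypothetical 4-state chain: action 1 moves right with probability 0.9, action 0 stays."""
    s_next = s + 1 if (a == 1 and rng.random() < 0.9) else s
    done = s_next == 3                      # state 3 is terminal
    reward = 1.0 if done else 0.0
    return s_next, reward, done


def q_learning_ucb(step, n_states, n_actions, episodes=500, horizon=20,
                   alpha=0.1, gamma=0.95, q_init=0.0, c=1.0, seed=0):
    """Tabular Q-learning with UCB exploration instead of epsilon-greedy."""
    rng = random.Random(seed)
    # q_init is a placeholder for whatever informed initialization is available.
    q = [[q_init] * n_actions for _ in range(n_states)]
    counts = [[0] * n_actions for _ in range(n_states)]
    t = 0                                   # global step counter used in the UCB bonus
    for _ in range(episodes):
        s = 0
        for _ in range(horizon):
            t += 1
            a = ucb_action(q[s], counts[s], t, c)
            s_next, r, done = step(s, a, rng)
            counts[s][a] += 1
            # Standard Q-learning target: bootstrap from the best next action.
            target = r if done else r + gamma * max(q[s_next])
            q[s][a] += alpha * (target - q[s][a])
            s = s_next
            if done:
                break
    return q


q_table = q_learning_ucb(toy_step, n_states=4, n_actions=2)
# Greedy action in each non-terminal state after training.
greedy = [max(range(2), key=q_table[s].__getitem__) for s in range(3)]
```

The UCB bonus shrinks as an action's visit count grows, so exploration concentrates on rarely tried actions rather than being spread uniformly as in epsilon-greedy selection.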
This article is indexed in CNKI, VIP, and other databases.