Optimal CPS Command Dispatch Based on Hierarchically Correlated Equilibrium Reinforcement Learning
Citation: ZHANG Xiaoshun, YU Tao, TANG Jie. Optimal CPS command dispatch based on hierarchically correlated equilibrium reinforcement learning[J]. Automation of Electric Power Systems, 2015, 39(8): 80-86.
Authors: ZHANG Xiaoshun, YU Tao, TANG Jie
Affiliations: 1. School of Electric Power, South China University of Technology, Guangzhou 510640, Guangdong, China
2. Shaoguan Power Supply Bureau, Guangdong Power Grid Co., Shaoguan 512026, Guangdong, China
Funding: National Basic Research Program of China (973 Program) (2013CB228205); National Natural Science Foundation of China (51177051, 51477055); a China Southern Power Grid planning research project
Abstract: A hierarchical correlated equilibrium Q-learning (HCEQ) algorithm is proposed for the dynamic dispatch of automatic generation control (AGC) commands, issued under the control performance standard (CPS), from the dispatch center to the individual generating units. The units are clustered into layers according to their frequency-regulation time delays, which effectively mitigates the curse of dimensionality in CPS command dispatch. Compared with single-agent reinforcement learning, the HCEQ algorithm introduces the solution of an equilibrium objective function, which markedly speeds up the search for the optimum. The power-deviation, hydropower-margin and regulation-cost objectives are converted into the algorithm's reward function by linear weighting, and the trade-off between CPS control performance and regulation cost under different weights is investigated. Simulation studies on a China Southern Power Grid model show that the HCEQ algorithm converges quickly and, in an environment with complex stochastic disturbances, effectively raises the system's CPS compliance rate while reducing the AGC regulation cost.
Keywords: automatic generation control; multi-agent system; stochastic game; reinforcement learning
Received: 2014-05-06; Revised: 2014-10-29
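As a minimal sketch (not the authors' implementation), the linearly weighted aggregation described in the abstract can be illustrated as follows: the power-deviation, hydropower-margin and regulation-cost objectives are collapsed into one scalar reward for the learning agents. All function names, signs and weight values below are illustrative assumptions, with the objectives taken as already normalized.

```python
def weighted_reward(power_deviation, hydro_margin, regulation_cost,
                    weights=(0.6, 0.2, 0.2)):
    """Combine three normalized objectives into a single scalar reward.

    Larger power deviation and regulation cost should lower the reward,
    while a larger hydropower margin should raise it, so deviation and
    cost enter with a negative sign. The weight triple is an assumed
    example, not a value from the paper.
    """
    w_dev, w_hydro, w_cost = weights
    return -w_dev * power_deviation + w_hydro * hydro_margin - w_cost * regulation_cost

# Sweeping the weight triple traces the trade-off between CPS control
# performance and regulation cost that the paper investigates.
print(weighted_reward(power_deviation=0.1, hydro_margin=0.5, regulation_cost=0.3))
```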

This article is indexed in CNKI, Wanfang Data and other databases.
