
Cooperative Look-ahead Control of Multi-station CSPS Systems Based on Multi-agent Reinforcement Learning
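Cite this article:TANG Hao, WAN Hai-Feng, HAN Jiang-Hong, ZHOU Lei. Coordinated Look-ahead Control of Multiple CSPS System by Multi-agent Reinforcement Learning[J]. Acta Automatica Sinica, 2010, 36(2): 289-296.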
Authors:TANG Hao  WAN Hai-Feng  HAN Jiang-Hong  ZHOU Lei
Affiliation:1.School of Computer and Information, Hefei University of Technology, Hefei 230009
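Funding:Supported by the National Natural Science Foundation of China (60873003); the Scientific Research Foundation for Returned Overseas Chinese Scholars, Ministry of Education; the Natural Science Foundation of Anhui Province (090412046); and the Key Project of Provincial Natural Science Research of Anhui Higher Education Institutions (KJ2008A058)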
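Abstract:This paper studies the optimal control of a multi-station conveyor-serviced production station (CSPS) system. The objective is to maximize the part-processing rate of the whole system by suitably choosing the look-ahead control policy of each CSPS. Following the reaction-diffusion idea of multi-agent systems, the original performance function of each agent is first modified by adding a local information-interaction term that plays a diffusion role (the original term is regarded as playing a reaction role). Performance potential theory is then used to construct a Wolf-PHC multi-agent learning algorithm that applies to both the average and the discounted performance criteria and solves for the coordinated look-ahead control policies of stations whose decision epochs are not synchronized. Finally, simulation experiments verify the effectiveness of the algorithm: the learning results show that the modified performance functions improve the load balance among the stations and markedly raise the part-processing rate of the whole system.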

Keywords:conveyor-serviced production station    look-ahead control    multi-agent reinforcement learning    performance function
Received:2008-12-18
Revised:2009-5-26
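
The reaction-diffusion modification of the performance function mentioned in the abstract can be pictured with a small sketch. The abstract does not state the exact form of the diffusion term, so everything below is an assumption made for illustration: the stations are taken to sit in a line, each station's own reward is the "reaction" part, and the "diffusion" part is a weighted sum of reward differences with its immediate neighbours. The names `diffused_rewards` and `coupling` are hypothetical and do not come from the paper.

```python
from typing import List


def diffused_rewards(rewards: List[float], coupling: float = 0.1) -> List[float]:
    """Return each station's reward plus an assumed neighbour-difference term.

    rewards[i] is the original ("reaction") reward of station i.
    coupling is a hypothetical weight on the local-interaction ("diffusion") term.
    """
    n = len(rewards)
    modified = []
    for i, own in enumerate(rewards):
        # Neighbours on a line: the station before and the station after, if any.
        neighbours = [j for j in (i - 1, i + 1) if 0 <= j < n]
        # Positive when neighbours earn more than this station, negative otherwise.
        diffusion = sum(rewards[j] - own for j in neighbours)
        modified.append(own + coupling * diffusion)
    return modified
```

With this particular form the neighbour differences cancel in pairs, so the total reward over all stations is unchanged and only its distribution is smoothed. That is one plausible way a local interaction term could push agents toward balanced loads, but it should not be read as the authors' formula.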

Coordinated Look-ahead Control of Multiple CSPS System by Multi-agent Reinforcement Learning
TANG Hao, WAN Hai-Feng, HAN Jiang-Hong, ZHOU Lei. Coordinated Look-ahead Control of Multiple CSPS System by Multi-agent Reinforcement Learning[J]. Acta Automatica Sinica, 2010, 36(2): 289-296.
Authors:TANG Hao  WAN Hai-Feng  HAN Jiang-Hong  ZHOU Lei
Affiliation:1.School of Computer and Information, Hefei University of Technology, Hefei 230009;2.Engineering Research Center of Safety Critical Industry Measure and Control Technology, Ministry of Education, Hefei 230009
Abstract:This paper addresses the optimal control of a multiple conveyor-serviced production station (CSPS) system. The objective is to maximize the part-processing rate of the entire system by choosing a suitable look-ahead control strategy for each CSPS. Following the reaction-diffusion mechanism of multi-agent systems, the original performance function of each agent is first modified by introducing a term with a diffusion function that represents the interaction of local information (the original term is assumed to have a reaction function). Then, using the concept of performance potentials, a multi-agent learning algorithm of the Wolf-PHC type is proposed to derive the coordinated look-ahead control strategy under either the discounted or the average performance criterion, where the decision epochs of the agents are asynchronous. Finally, a simulation example illustrates the effectiveness of the algorithm. The results show that, owing to the modified performance functions, the workloads of the stations are better balanced and the part-processing rate of the entire system increases significantly.
Keywords:Conveyor-serviced production station (CSPS)  look-ahead control  multi-agent reinforcement learning  performance function
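
For readers unfamiliar with the learning rule named in the abstract, the following is a minimal sketch of the textbook Wolf-PHC update (usually written WoLF-PHC, "Win or Learn Fast" policy hill-climbing) for a single agent under the discounted criterion. It is not the algorithm of the paper: the performance-potential formulation, the average-criterion variant, and the handling of asynchronous decision epochs are all omitted. The class name `WolfPHC` and every parameter value are assumptions chosen for illustration.

```python
import random
from collections import defaultdict


class WolfPHC:
    """Single-agent WoLF-PHC sketch (discounted criterion, tabular, n_actions >= 2)."""

    def __init__(self, n_actions, alpha=0.1, gamma=0.9,
                 delta_win=0.01, delta_lose=0.04):
        self.n = n_actions
        self.alpha, self.gamma = alpha, gamma
        # Small step when "winning", larger step when "losing" (illustrative values).
        self.delta_win, self.delta_lose = delta_win, delta_lose
        self.q = defaultdict(lambda: [0.0] * n_actions)
        self.pi = defaultdict(lambda: [1.0 / n_actions] * n_actions)
        self.avg_pi = defaultdict(lambda: [1.0 / n_actions] * n_actions)
        self.visits = defaultdict(int)

    def act(self, state):
        # Sample an action from the current mixed policy.
        return random.choices(range(self.n), weights=self.pi[state])[0]

    def update(self, state, action, reward, next_state):
        q, pi, avg = self.q[state], self.pi[state], self.avg_pi[state]
        # 1. Q-learning update of the action value.
        target = reward + self.gamma * max(self.q[next_state])
        q[action] += self.alpha * (target - q[action])
        # 2. Running average of the policies used so far in this state.
        self.visits[state] += 1
        for a in range(self.n):
            avg[a] += (pi[a] - avg[a]) / self.visits[state]
        # 3. "Winning" if the current policy scores above the average policy.
        current_value = sum(p * v for p, v in zip(pi, q))
        average_value = sum(p * v for p, v in zip(avg, q))
        delta = self.delta_win if current_value > average_value else self.delta_lose
        # 4. Hill-climb: shift probability mass toward the greedy action.
        best = max(range(self.n), key=lambda a: q[a])
        moved = 0.0
        for a in range(self.n):
            if a != best:
                step = min(pi[a], delta / (self.n - 1))
                pi[a] -= step
                moved += step
        pi[best] += moved
```

In a multi-station setting one such learner would be attached to each station, with the state and action encoding (for example, a discretized look-ahead range) left to the modeller. Those encodings are not specified in the abstract and are not assumed here.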
This article is indexed in CNKI and other databases.