
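Research on Q Learning Algorithm with Sharing Experience in Learning Process
Cite this article: QIAO Lin, LUO Jie. Research on Q Learning Algorithm with Sharing Experience in Learning Process[J]. Computer Science, 2012, 39(5): 213-216.
Authors: QIAO Lin  LUO Jie
Affiliation: College of Automation, Nanjing University of Posts and Telecommunications, Nanjing 210046
Abstract: This work aims to improve the learning efficiency of the Q-learning algorithm in multi-agent systems. Using the pursuit problem as the research platform, it proposes a Q-learning algorithm based on shared experience. The algorithm imitates the way human teams learn: all agents share a common final goal, namely surrounding and capturing the prey, while each agent obtains its own stage goal through negotiation. Learning is divided into stages. After each stage the agents carry out a stage summary and share their good learning experience with one another to support the next stage. In this way the agents that learn quickly and well pull along those that learn slowly and poorly, which raises overall learning performance. Simulation experiments show that Q-learning with experience sharing during the learning process improves the performance of the learning system and converges efficiently to the optimal policy.

Keywords: Q-learning algorithm  MAS  Pursuit problem  Sharing experience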

Research on Q Learning Algorithm with Sharing Experience in Learning Process
QIAO Lin, LUO Jie. Research on Q Learning Algorithm with Sharing Experience in Learning Process[J]. Computer Science, 2012, 39(5): 213-216.
Authors:QIAO Lin  LUO Jie
Affiliation: College of Automation, Nanjing University of Posts & Telecommunications, Nanjing 210046, China
Abstract: The aim of this research is to improve the efficiency of the multi-agent Q-learning algorithm. Using the pursuit problem as a test platform, this paper proposes a multi-agent Q-learning method with experience sharing. The algorithm simulates the behavior of a human learning team: all agents share a common ultimate goal of capturing the prey, while each agent obtains its own stage goal through negotiation. The learning process is divided into stages. After each learning stage there is a stage summary, in which good learning experience is shared among the agents to facilitate the next stage of learning. The agents that learn quickly and well help those that learn slowly and poorly, so the performance of the whole system is enhanced. The simulation results show that the Q-learning algorithm with experience sharing during the learning process improves the performance of the learning system and converges efficiently to the optimal strategy.
Keywords:Q-learning algorithm  MAS  Pursuit problem  Sharing experience
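
The staged scheme described in the abstract (learn for a stage, summarize, share good experience, continue) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the environment is a one-dimensional corridor with a stationary "prey" in the last cell rather than the multi-hunter pursuit domain, there is no negotiation of stage goals, each agent's "experience" is taken to be its tabular Q-values, and sharing is done by blending the best-performing agent's table into the others. The names `run_episode`, `share_experience`, `train`, and the `mix` parameter are hypothetical.

```python
import random

N_STATES = 8          # cells 0..7 on a line; the "prey" sits in the last cell (assumption)
ACTIONS = (-1, +1)    # move left / move right
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.2   # illustrative values, not from the paper


def new_q_table():
    """One Q-value per (state, action) pair, initialised to zero."""
    return [[0.0, 0.0] for _ in range(N_STATES)]


def run_episode(q, rng, max_steps=50):
    """Run one epsilon-greedy Q-learning episode; return the steps taken."""
    state = 0
    for step in range(1, max_steps + 1):
        # Explore with probability EPSILON, or when both actions look equal.
        if rng.random() < EPSILON or q[state][0] == q[state][1]:
            a = rng.randrange(2)
        else:
            a = 0 if q[state][0] > q[state][1] else 1
        nxt = min(max(state + ACTIONS[a], 0), N_STATES - 1)
        done = nxt == N_STATES - 1
        reward = 1.0 if done else -0.01
        # Standard one-step Q-learning update.
        target = reward + (0.0 if done else GAMMA * max(q[nxt]))
        q[state][a] += ALPHA * (target - q[state][a])
        state = nxt
        if done:
            return step
    return max_steps


def share_experience(q_tables, scores, mix=0.5):
    """Stage summary: blend the best agent's Q-table into every other table.

    `scores` holds each agent's mean steps per episode (lower is better).
    Returns the index of the agent treated as the best learner.
    """
    best = min(range(len(q_tables)), key=lambda i: scores[i])
    for i, q in enumerate(q_tables):
        if i == best:
            continue
        for s in range(N_STATES):
            for a in range(len(ACTIONS)):
                q[s][a] = (1 - mix) * q[s][a] + mix * q_tables[best][s][a]
    return best


def train(n_agents=3, n_stages=5, episodes_per_stage=10, seed=0):
    """Alternate independent learning stages with experience sharing."""
    rng = random.Random(seed)
    q_tables = [new_q_table() for _ in range(n_agents)]
    history = []
    for _ in range(n_stages):
        scores = []
        for q in q_tables:
            steps = [run_episode(q, rng) for _ in range(episodes_per_stage)]
            scores.append(sum(steps) / len(steps))
        share_experience(q_tables, scores)
        history.append(scores)
    return q_tables, history


if __name__ == "__main__":
    _, history = train()
    for stage, scores in enumerate(history, start=1):
        print(f"stage {stage}: mean steps per agent = {scores}")
```

Blending rather than overwriting is one possible design choice: it lets slower agents benefit from the best table while keeping some of what they learned themselves. Whether the paper shares Q-values, trajectories, or something else cannot be determined from the abstract.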
This article is indexed in CNKI, Wanfang Data, and other databases.