
Path Planning of Opposite Q Learning Robot Based on Virtual Sub-Target in Unknown Environment
Cite as:Wang Sheng-min,Lin Wei,Zeng Bi.Path Planning of Opposite Q Learning Robot Based on Virtual Sub-Target in Unknown Environment[J].Journal of Guangdong University of Technology,2019,36(1):51-56,62.
Authors:Wang Sheng-min  Lin Wei  Zeng Bi
Affiliation:School of Computers, Guangdong University of Technology, Guangzhou 510006, Guangdong, China (all three authors)
Funding:Guangdong Province Industry-University-Research Cooperation Special Project (2014B090904080); Guangdong Province Applied Science and Technology R&D Special Project (2015B090922012); Dongguan Industry-University-Research Cooperation Project (2015509109107)
Abstract:Q-learning updates its Q-values slowly in complex unknown environments and is prone to the curse of dimensionality. To address these problems, an opposite Q-learning path planning algorithm for robots in unknown environments, based on virtual sub-targets, is proposed. From the state trajectory explored by the mobile robot, the algorithm builds two state chains, one recording state-action pairs and the other recording state-reverse-action pairs. Along each chain, the Q-value of the current state is fed back to update the Q-value of the preceding state, in turn, until the head of the chain is reached. In addition, the curse of dimensionality that Q-learning tends to suffer in large-scale environments is addressed by searching for the optimal virtual sub-target within the local detection domain. Experimental results show that, in complex unknown environments, the algorithm effectively speeds up the convergence of learning, improves learning efficiency, and completes the robot navigation task along a relatively good path.

Keywords:mobile robot  virtual sub-target  opposite Q-learning  unknown environment
Received:2018-03-16
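
The state-chain feedback described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the update rule, so the standard Q-learning update is assumed, and the names `backward_chain_update`, `ALPHA`, `GAMMA`, the one-dimensional corridor, and the rewards on the reverse-action chain (taken here as the negated forward rewards) are all hypothetical choices made for illustration.

```python
from collections import defaultdict

ALPHA = 0.5  # learning rate (assumed value)
GAMMA = 0.9  # discount factor (assumed value)
ACTIONS = ("left", "right")


def backward_chain_update(q, chain):
    """Sweep a recorded chain from its tail back to its head.

    chain: list of (state, action, reward, next_state) in visiting order.
    Each step applies the standard Q-learning update (an assumption), so the
    freshly updated value of a later state feeds into the state before it.
    """
    for state, action, reward, next_state in reversed(chain):
        best_next = max(q[(next_state, a)] for a in ACTIONS)
        q[(state, action)] += ALPHA * (reward + GAMMA * best_next - q[(state, action)])
    return q


# Hypothetical corridor with states 0..3 and the goal at state 3.
# Chain of state-action pairs actually executed by the robot.
forward_chain = [(0, "right", 0.0, 1), (1, "right", 0.0, 2), (2, "right", 1.0, 3)]
# Chain of state-reverse-action pairs; rewards are assumed to be the
# negated forward rewards, and state 0 is bounded by a wall.
reverse_chain = [(0, "left", 0.0, 0), (1, "left", 0.0, 0), (2, "left", -1.0, 1)]

q = defaultdict(float)
backward_chain_update(q, forward_chain)
backward_chain_update(q, reverse_chain)

for state in range(3):
    print(state, {a: round(q[(state, a)], 4) for a in ACTIONS})
```

Because the sweep runs from the tail toward the head, a single pass already carries the goal reward back to state 0, whereas a front-to-back pass over the same chain would only update the state adjacent to the goal.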
This article is indexed in Wanfang Data and other databases.