Trajectory Tracking and Dynamic Obstacle Avoidance of Mobile Robot Based on Deep Reinforcement Learning


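Cite this article:	Wu Yun-xiong, Zeng Bi. Trajectory Tracking and Dynamic Obstacle Avoidance of Mobile Robot Based on Deep Reinforcement Learning[J]. Journal of Guangdong University of Technology, 2019, 36(1): 42-50.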

Authors:	Wu Yun-xiong, Zeng Bi

Affiliation:	School of Computers, Guangdong University of Technology, Guangzhou, Guangdong 510006, China (both authors)

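Funding:	Natural Science Foundation of Guangdong Province (2016A030313713); Guangdong Province Applied Science and Technology Research and Development Special Project (2015B090922012); Guangdong Province Industry-University-Research Cooperation Special Project (2014B090904080)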

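Abstract:	To address the errors and instability that arise when a mobile robot performs trajectory tracking and dynamic obstacle avoidance in a partially observable, nonlinear dynamic environment, this paper proposes a visual perception and decision-making method based on deep reinforcement learning. The method combines, in a general form, the perception capability of convolutional neural networks with the decision-making capability of reinforcement learning. Through end-to-end learning it maps visual input from the environment directly to action outputs, so that environment perception and decision control form a closed loop. The optimal decision policy is learned by maximizing the cumulative reward the robot obtains from interacting with the dynamic environment. Simulation results show that the method meets the requirements of multi-task intelligent perception and decision making, and that it largely resolves the problems of traditional algorithms: falling into local optima, oscillating among closely spaced obstacles without identifying a path, swaying in narrow passages, and failing to reach targets near obstacles. It also greatly improves the real-time performance and adaptability of the robot's trajectory tracking and dynamic obstacle avoidance.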
Keywords:	deep reinforcement learning; mobile robot; trajectory tracking; dynamic obstacle avoidance
Received:	2018-03-08
Trajectory Tracking and Dynamic Obstacle Avoidance of Mobile Robot Based on Deep Reinforcement Learning

Wu Yun-xiong, Zeng Bi. Trajectory Tracking and Dynamic Obstacle Avoidance of Mobile Robot Based on Deep Reinforcement Learning[J]. Journal of Guangdong University of Technology, 2019, 36(1): 42-50.

Authors:	Wu Yun-xiong, Zeng Bi

Affiliation:	School of Computers, Guangdong University of Technology, Guangzhou 510006, China

Abstract:	A visual perception and decision-making method based on deep reinforcement learning is proposed to address the errors and instability that arise in trajectory tracking and dynamic obstacle avoidance of a mobile robot in a partially observable, nonlinear dynamic environment. The method combines, in a general form, the perceptual ability of a convolutional neural network (CNN) with the decision-making ability of reinforcement learning. Through end-to-end learning, visual input from the environment is mapped directly to action outputs, so that environment perception and decision-making control form a closed loop. The optimal decision-making policy is learned by maximizing the cumulative reward obtained from the interaction between the robot and the dynamic environment. Simulation results show that the method meets the requirements of multi-task intelligent perception and decision making, and largely resolves problems of traditional algorithms such as falling into local optima, oscillating among closely spaced obstacles without identifying a path, wavering in narrow passages, and failing to reach targets near obstacles. It also greatly improves the real-time performance and adaptability of robot trajectory tracking and dynamic obstacle avoidance.

Keywords:	deep reinforcement learning; mobile robot; trajectory tracking; dynamic obstacle avoidance
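
The abstract states that the policy is learned by maximizing the cumulative reward from the robot's interaction with its environment, but it gives no formula. As a rough illustration only, the sketch below computes a discounted cumulative return for one episode. The discount factor `gamma`, the function name `discounted_return`, and the reward values are assumptions made for this example and do not come from the paper.

```python
from typing import Sequence


def discounted_return(rewards: Sequence[float], gamma: float = 0.99) -> float:
    """Sum of gamma**t * rewards[t] over one episode.

    The discount factor gamma is an assumption; the abstract only says
    "cumulative reward" and does not say whether it is discounted.
    """
    total = 0.0
    # Accumulate from the last step backwards: G_t = r_t + gamma * G_{t+1}.
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


# Hypothetical per-step rewards: small positive values for staying on the
# reference trajectory, a large penalty for a collision.
safe_episode = [1.0, 0.5, 0.5]
collision_episode = [1.0, 1.0, -10.0]

print(discounted_return([1.0, 1.0, 1.0], gamma=0.5))  # 1.75
print(discounted_return(safe_episode) > discounted_return(collision_episode))  # True
```

Under these made-up rewards, an episode that tracks slightly worse but avoids the collision scores higher, which is the kind of trade-off a reward-maximizing policy would be pushed toward.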
This article is indexed in Wanfang Data and other databases.