A Gait Control Method for Biped Robot on Slope Based on Deep Reinforcement Learning

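Cite this article:	Wu Xiaoguang, Liu Shaowei, Yang Lei, Deng Wenqiang, Jia Zheheng. A Gait Control Method for Biped Robot on Slope Based on Deep Reinforcement Learning[J]. Acta Automatica Sinica, 2021, 47(8): 1976-1987.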

Authors:	Wu Xiaoguang, Liu Shaowei, Yang Lei, Deng Wenqiang, Jia Zheheng

Affiliation:	1. School of Electrical Engineering, Yanshan University, Qinhuangdao 066004

Funding:	Supported by the National Natural Science Foundation of China (61503325) and the China Postdoctoral Science Foundation (2015M581316)

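Abstract:	To improve the slope-walking stability of quasi-passive biped robots, this paper proposes a gait control method for quasi-passive biped robots based on deep reinforcement learning. By analyzing the hybrid dynamics model and the stable walking process of the quasi-passive biped robot, the state space, action space, episode process, and reward function are established. After continued learning with Ape-X DPG, an algorithm improved from DDPG, the quasi-passive biped robot can walk stably over a wide range of slopes. Simulation experiments show that Ape-X DPG outperforms PER-based DDPG in both learning ability and convergence speed. In addition, compared with energy shaping control, the quasi-passive biped robot using Ape-X DPG converges to a stable gait faster and has a larger gait basin of attraction, which shows that Ape-X DPG can effectively improve the walking stability of quasi-passive biped robots.
Keywords:	quasi-passive biped robot; deep reinforcement learning; gait control; walking stability
Received:	2019-07-23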
A Gait Control Method for Biped Robot on Slope Based on Deep Reinforcement Learning

Affiliation:	1.School of Electrical Engineering, Yanshan University, Qinhuangdao 066004

Abstract:	To improve the walking stability of quasi-passive biped robots on slopes, this paper proposes a gait control method for quasi-passive biped robots based on deep reinforcement learning. By analyzing the hybrid dynamics model and the stable walking process of the quasi-passive biped robot, the state space, action space, episode process, and reward function are established. After learning with the Ape-X DPG algorithm, an improvement on DDPG, the quasi-passive biped robot can walk stably over a larger range of slopes. In simulation, Ape-X DPG is better than DDPG + PER in both learning ability and convergence speed. Moreover, compared with an energy shaping controller, the gait of the quasi-passive biped robot using Ape-X DPG converges more rapidly and its basin of attraction is larger, which shows that Ape-X DPG can effectively improve the walking stability of quasi-passive biped robots.
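
The abstract compares Ape-X DPG with DDPG + PER but does not describe either implementation. Both build on prioritized experience replay (PER), so the following is a minimal, generic sketch of proportional prioritization, not the authors' code. The class name `PrioritizedReplay`, the use of absolute TD error as the priority signal, and the hyperparameter values (`alpha`, `beta`, `eps`) are all assumptions made for illustration.

```python
import random


class PrioritizedReplay:
    """Toy proportional prioritized replay buffer (illustrative only)."""

    def __init__(self, capacity, alpha=0.6, eps=1e-6):
        self.capacity = capacity
        self.alpha = alpha      # how strongly priorities skew sampling (assumed value)
        self.eps = eps          # keeps zero-error transitions sampleable
        self.data = []
        self.priorities = []
        self.pos = 0            # next slot to overwrite once the buffer is full

    def add(self, transition, td_error):
        priority = (abs(td_error) + self.eps) ** self.alpha
        if len(self.data) < self.capacity:
            self.data.append(transition)
            self.priorities.append(priority)
        else:
            self.data[self.pos] = transition
            self.priorities[self.pos] = priority
        self.pos = (self.pos + 1) % self.capacity

    def probabilities(self):
        # P(i) = p_i / sum_k p_k
        total = sum(self.priorities)
        return [p / total for p in self.priorities]

    def sample(self, batch_size, beta=0.4, rng=random):
        probs = self.probabilities()
        n = len(self.data)
        idx = rng.choices(range(n), weights=probs, k=batch_size)
        # Importance-sampling weights w_i = (N * P(i))^(-beta),
        # normalized by the largest weight in the batch.
        weights = [(n * probs[i]) ** (-beta) for i in idx]
        max_w = max(weights)
        weights = [w / max_w for w in weights]
        return idx, [self.data[i] for i in idx], weights

    def update(self, idx, td_errors):
        # Refresh priorities after the learner recomputes TD errors.
        for i, err in zip(idx, td_errors):
            self.priorities[i] = (abs(err) + self.eps) ** self.alpha
```

In an Ape-X style setup, several actors would call `add` on a shared buffer while a single learner calls `sample` and `update`; that distributed wiring is omitted here.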

