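Research on Intelligent Control of Biped Robots Based on the D-DQN Reinforcement Learning Algorithm
Cite this article: Li Lixia, Chen Yan. Research on Intelligent Control of Biped Robots Based on the D-DQN Reinforcement Learning Algorithm [J]. Computer Measurement & Control (计算机测量与控制), 2024, 32(3): 181-187.
Authors: Li Lixia (李丽霞)  Chen Yan (陈艳)
Affiliation: Guangzhou Huashang College (广州华商学院)
Funding: 2022 Guangzhou Huashang College Higher Education Teaching Reform Project (HS2022ZLGC71)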
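Abstract: Existing intelligent control algorithms for biped robots suffer from large trajectory deviation and low efficiency, so a control algorithm based on D-DQN reinforcement learning is proposed. The coordinate transformation relationships and the joint-link compensation process in biped robot motion are analyzed first, and a Q-value network is then used to reduce the dimensionality of the complex, nonlinear motion process. A dual-network weight design, consisting of Q-value network weights and auxiliary weights, further strengthens the performance of the DQN, and the Tanh function serves as the activation function of the neural network to improve the numerical training capability of the DQN. The experience replay pool plays a key supporting role during data training and interaction, and feeding the reward value into the objective function further improves the control accuracy for the biped robot. Finally, virtual constraint control is applied to improve the stability of the robot during motion. Experimental results show that, under the D-DQN reinforcement learning control algorithm, the robot completes the first-stage test in only 115 s with a comprehensive trajectory deviation of 0.02 m, and its stability in the gait-switching limit-cycle test is good.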

Keywords: D-DQN  reinforcement learning  biped robot  intelligent control  experience replay pool  virtual constraint control
Received: 2023/8/22
Revised: 2023/9/8

Research on Intelligent Control of Biped Robot Based on D-DQN Reinforcement Learning Algorithm
Abstract: To address the large trajectory deviation and low efficiency of existing intelligent control algorithms for biped robots, a control algorithm based on D-DQN reinforcement learning is proposed. First, the coordinate transformation relationships in biped robot motion and the joint-link compensation process are analyzed, and a Q-value network is then used to reduce the dimensionality of the complex nonlinear motion process. A dual-weight design that combines Q-value network weights with auxiliary weights is adopted to strengthen the DQN, and the Tanh function is used as the activation function of the neural network to improve the numerical training capability of the DQN. The experience replay pool plays a key supporting role in data training and interaction. Feeding the reward value into the objective function further improves the control accuracy of the biped robot. Finally, virtual constraint control is used to improve the stability of the biped robot. The experimental results show that, under the D-DQN reinforcement learning control algorithm, the robot completes the first-stage test in only 115 s, the comprehensive trajectory deviation is 0.02 m, and stability in the gait-switching limit-cycle test is good.
Keywords: D-DQN  Reinforcement learning  Biped robot  Intelligent control  Experience replay pool  Virtual constraint control
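
The abstract names three building blocks: a dual set of network weights, a Tanh activation, and an experience replay pool whose rewards enter the training objective. The following is only a minimal sketch of how these pieces could fit together, assuming that D-DQN follows the standard Double DQN formulation, in which the Q-value network weights select the next action and a second set of weights evaluates it. The names ReplayPool, init_q_network, q_values and double_dqn_targets, the single hidden layer, the layer sizes and the discount factor gamma are all hypothetical choices made for illustration and are not taken from the paper.

```python
import random
from collections import deque

import numpy as np


class ReplayPool:
    """Fixed-size experience replay pool of (s, a, r, s_next, done) tuples."""

    def __init__(self, capacity):
        # Oldest transitions are discarded automatically once capacity is hit.
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform sampling without replacement breaks temporal correlation.
        batch = random.sample(list(self.buffer), batch_size)
        states, actions, rewards, next_states, dones = zip(*batch)
        return (np.array(states, dtype=np.float64),
                np.array(actions, dtype=np.int64),
                np.array(rewards, dtype=np.float64),
                np.array(next_states, dtype=np.float64),
                np.array(dones, dtype=np.float64))

    def __len__(self):
        return len(self.buffer)


def init_q_network(state_dim, hidden_dim, n_actions, rng):
    """Random weights for a one-hidden-layer Q-network (illustrative sizes)."""
    return {
        "W1": rng.normal(0.0, 0.1, (state_dim, hidden_dim)),
        "b1": np.zeros(hidden_dim),
        "W2": rng.normal(0.0, 0.1, (hidden_dim, n_actions)),
        "b2": np.zeros(n_actions),
    }


def q_values(params, states):
    """Forward pass with a Tanh hidden layer; returns Q(s, a) for all actions."""
    hidden = np.tanh(states @ params["W1"] + params["b1"])
    return hidden @ params["W2"] + params["b2"]


def double_dqn_targets(online, target, rewards, next_states, dones, gamma=0.99):
    """Double DQN target: online weights pick the action, target weights score it."""
    best_actions = np.argmax(q_values(online, next_states), axis=1)
    rows = np.arange(len(best_actions))
    next_q = q_values(target, next_states)[rows, best_actions]
    # Terminal transitions (done == 1) contribute only their reward.
    return rewards + gamma * (1.0 - dones) * next_q
```

In a training loop built on this sketch, the Q-value network weights would be fitted toward these targets on batches drawn from the pool, and the second set of weights would be refreshed from the first at intervals. The refresh schedule and loss used in the paper are not stated in the abstract.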