
A Research on a Training Model to Improve the Development Efficiency of Robot Reinforcement Learning

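Cite this article:	Ye Wei-jie, Gao Jun-li, Jiang Feng, Guo Jing. A Research on a Training Model to Improve the Development Efficiency of Robot Reinforcement Learning[J]. Journal of Guangdong University of Technology, 2020, 37(5): 46-50. DOI: 10.12052/gdutxb.200009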

Authors:	Ye Wei-jie, Gao Jun-li, Jiang Feng, Guo Jing

Affiliation:	School of Automation, Guangdong University of Technology, Guangzhou, Guangdong 510006, China

Funding:	National Natural Science Foundation of China (61803103); China State Scholarship Fund (201908440537)

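Abstract:	Deep reinforcement learning (DRL) models, which combine reinforcement learning with deep learning, are now widely used in robot control. Robot reinforcement learning requires training the model in a 3D simulation environment; however, without prior knowledge of the environment, trial-and-error learning in 3D leads to long training cycles and high development costs. This paper therefore proposes a robot reinforcement learning training mode that spans 2D and 3D: the computationally heavy, time-consuming work is carried out in a 2D environment, and the results of the algorithm are then transferred to a 3D environment for testing. Experiments show that this training mode improves the development efficiency of robot reinforcement learning on a personal computer by a factor of about five.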
Keywords:	deep reinforcement learning; robot control; training mode; development efficiency
Received:	2020-01-09
A Research on a Training Model to Improve the Development Efficiency of Robot Reinforcement Learning

Ye Wei-jie, Gao Jun-li, Jiang Feng, Guo Jing. A Research on a Training Model to Improve the Development Efficiency of Robot Reinforcement Learning[J]. Journal of Guangdong University of Technology, 2020, 37(5): 46-50. DOI: 10.12052/gdutxb.200009

Authors:	Ye Wei-jie, Gao Jun-li, Jiang Feng, Guo Jing

Affiliation:	School of Automation, Guangdong University of Technology, Guangzhou 510006, China

Abstract:	Deep reinforcement learning (DRL) models, which combine reinforcement learning with deep learning, are widely used in robot control. Robot reinforcement learning requires training the model in a 3D simulation environment. However, in the absence of prior knowledge of the environment, trial-and-error learning in a 3D environment leads to long training cycles and high development costs. To address this problem, a 2D-to-3D training mode is proposed: the time-consuming, computationally intensive work is completed in a 2D environment, and the results are then transferred to a 3D environment for testing. Experiments show that this training mode improves development efficiency by a factor of about five, making robot reinforcement learning research feasible on personal computers.

Keywords:	deep reinforcement learning; robot control; training mode; development efficiency
