Intelligent proximal-policy-optimization-based decision-making system for humanoid robots
Affiliation: 1. Department of Mechanical Engineering, National Chung Cheng University, Chiayi 62102, Taiwan; 2. Advanced Institute of Manufacturing with High-tech Innovations (AIM-HI), National Chung Cheng University, Chiayi 62102, Taiwan; 3. Department of Intelligent Robotics, National Pingtung University, Pingtung 90004, Taiwan; 1. Graduate School of Culture Technology, KAIST, Daejeon, Republic of Korea; 2. Mobility UX Research Section, Electronics and Telecommunications Research Institute, Daejeon, Republic of Korea; 1. College of Civil and Transportation Engineering, Shenzhen University, Shenzhen 518060, China; 2. College of Management, Shenzhen University, Shenzhen 518073, China; 3. Shenzhen International Maritime Institute, Shenzhen 518081, China; 4. School of Intelligent Systems Engineering, Sun Yat-sen University, Guangzhou 510275, China; 1. College of Economics and Management, East China Jiaotong University, Nanchang, China; 2. School of Humanities, Jiangxi University of Finance and Economics, Nanchang, China; 3. School of International Economics and Trade, Jiangxi University of Finance and Economics, Nanchang, China; 4. Research Institute of Business Analytics and Supply Chain Management, College of Management, Shenzhen University, Shenzhen, China; 1. School of Civil and Transportation Engineering, Hebei University of Technology, Tianjin, China; 2. Department of Hydraulic Engineering, State Key Laboratory of Hydroscience and Engineering, Tsinghua University, Beijing, China
Abstract: As technology advances, robots are gradually replacing humans in a range of tasks. Enabling a robot to handle multiple situations simultaneously and to perform different actions depending on the situation has therefore become a critical topic. Training a robot to perform a single designated action is now considered easy. However, when a robot must act in different environments, it has to be reset and retrained for each one, which is time-consuming and inefficient. Allowing robots to identify their environment autonomously can significantly reduce this time, and using machine learning algorithms to achieve autonomous robot learning has become a research trend. In this study, a proximal policy optimization algorithm was used to let a robot train itself and select an optimal gait pattern to reach its destination. Multiple basic gait patterns were selected, and information-maximizing generative adversarial nets (InfoGAN) were used to generate gait patterns, so that the robot could choose from numerous gait patterns while walking. The experimental results indicated that, after self-learning, the robot made different choices depending on the situation, verifying the feasibility of this approach.
Keywords: Humanoid robot; InfoGAN; Deep reinforcement learning; Decision making; Gait pattern generator
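
The abstract names proximal policy optimization but does not state its loss. As a minimal sketch, assuming the widely used clipped-surrogate form of PPO (the function name, the `clip_eps` value, and the sample numbers are illustrative and are not taken from the paper), the quantity being maximized over a batch of gait choices could look like this:

```python
import math


def ppo_clip_objective(new_logp, old_logp, advantages, clip_eps=0.2):
    """Mean clipped surrogate objective over a batch (to be maximized).

    new_logp / old_logp: log-probabilities of the chosen actions under the
    current and the data-collecting policy. advantages: advantage estimates.
    clip_eps is a hypothetical default, not a value from the paper.
    """
    total = 0.0
    for lp_new, lp_old, adv in zip(new_logp, old_logp, advantages):
        # Probability ratio between the current and the old policy.
        ratio = math.exp(lp_new - lp_old)
        # Keep the ratio within [1 - eps, 1 + eps].
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Take the more pessimistic of the unclipped and clipped terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)


# Hypothetical batch: three gait choices with made-up advantages.
value = ppo_clip_objective(
    new_logp=[math.log(0.5), math.log(0.3), math.log(0.2)],
    old_logp=[math.log(0.4), math.log(0.3), math.log(0.3)],
    advantages=[1.0, 0.0, -1.0],
)
print(value)
```

The clipping limits how far a single update can move the policy away from the one that collected the data, which is the usual motivation for choosing PPO in self-training setups like the one described.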
This article has been indexed by ScienceDirect and other databases.