
Survey on Sparse Reward in Deep Reinforcement Learning
Cite this article: YANG Wei-yi, BAI Chen-jia, CAI Chao, ZHAO Ying-nan, LIU Peng. Survey on Sparse Reward in Deep Reinforcement Learning[J]. Computer Science, 2020, 47(3): 182-191
Authors: YANG Wei-yi  BAI Chen-jia  CAI Chao  ZHAO Ying-nan  LIU Peng
Affiliation: China Unicom Network Technology Research Institute, Beijing 100048, China; School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, China
Abstract: Reinforcement learning, an important branch of machine learning, is a class of methods that search for an optimal policy through interaction with the environment. In recent years it has been widely combined with deep learning, forming the research field of deep reinforcement learning. As a new machine learning approach, deep reinforcement learning can both perceive complex inputs and solve for optimal policies, and it can be applied to complex decision-making problems such as robot control. The sparse reward problem is the core problem that deep reinforcement learning faces in solving tasks, and it is widespread in practical applications. Solving it helps improve sample efficiency, raise the quality of the resulting policy, and promote the broad application of deep reinforcement learning to practical tasks. This paper first describes the core algorithms of deep reinforcement learning, then introduces five categories of solutions to the sparse reward problem: reward design and learning, experience replay, exploration and exploitation, multi-goal learning, and auxiliary tasks. Finally, it summarizes related research and discusses future directions.

Keywords: Deep reinforcement learning  Deep learning  Reinforcement learning  Sparse reward  Artificial intelligence
This article is indexed in databases including VIP and Wanfang Data.