ADAPTIVE MODEL LEARNING BASED ON DYNA-Q LEARNING
Authors: Kao-Shing Hwang, Wei-Cheng Jiang, Yu-Jen Chen
Affiliation: 1. Department of Electrical Engineering, National Sun Yat-sen University, Kaohsiung, Taiwan, R.O.C. hwang@ccu.edu.tw; 3. Department of Electrical Engineering, National Sun Yat-sen University, Kaohsiung, Taiwan, R.O.C.
Abstract: Dyna-Q, a well-known model-based reinforcement learning (RL) method, interleaves offline simulations with action executions to update Q functions. It builds a world model that predicts the feature values of the next state and the reward function of the domain directly from data, and uses this model to train the Q functions and thereby accelerate policy learning. Tabular methods are typically used in Dyna-Q to establish the model, but a tabular model needs many more samples of experience to approximate the environment accurately. In this article, an adaptive model learning method based on tree structures is presented to improve sampling efficiency when building the world model. The proposed method produces simulated experiences for indirect learning, so the agent has additional experience for updating its policy. The agent works backward from collected state transitions and their associated rewards, using coarse coding to learn definitions of the regions of the state space that lead back to the preceding states. The proposed method estimates the rewards and the transition probabilities between states from past experience. Because the resulting tree is concise and small, the agent can use value iteration to quickly estimate the Q-values of each action in the induced states and determine a policy. The effectiveness and generality of the method are demonstrated in two numerical simulations: a mountain car and a mobile robot in a maze. The simulation results show that the proposed method clearly improves the training rate.
Keywords: decision tree; Dyna-Q agent; model learning; reinforcement learning
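
For orientation, the baseline that the abstract contrasts against can be sketched as a minimal tabular Dyna-Q loop: each real step performs a direct Q-learning update, stores the observed transition in a lookup-table model, and then replays a few transitions sampled from that model as planning updates. This is only an illustrative sketch of standard tabular Dyna-Q, not the tree-based model proposed in the article. The function names (`dyna_q`, `chain_step`), the five-state corridor environment, and all hyperparameter values are assumptions made for the example.

```python
import random
from collections import defaultdict


def dyna_q(step, start, actions, episodes=50, planning_steps=10,
           alpha=0.5, gamma=0.95, epsilon=0.1, seed=0, max_steps=200):
    """Tabular Dyna-Q for a deterministic environment (illustrative sketch)."""
    rng = random.Random(seed)
    q = defaultdict(float)  # Q[(state, action)], initialised to 0
    model = {}              # (state, action) -> (reward, next_state, done)

    def greedy(s):
        # Greedy action with random tie-breaking.
        best = max(q[(s, a)] for a in actions)
        return rng.choice([a for a in actions if q[(s, a)] == best])

    def update(s, a, r, s2, done):
        # One-step Q-learning backup.
        target = r if done else r + gamma * max(q[(s2, b)] for b in actions)
        q[(s, a)] += alpha * (target - q[(s, a)])

    for _ in range(episodes):
        s = start
        for _ in range(max_steps):
            a = rng.choice(actions) if rng.random() < epsilon else greedy(s)
            r, s2, done = step(s, a)
            update(s, a, r, s2, done)       # direct RL from real experience
            model[(s, a)] = (r, s2, done)   # model learning (table lookup)
            for _ in range(planning_steps):  # planning from simulated experience
                ps, pa = rng.choice(list(model))
                pr, ps2, pdone = model[(ps, pa)]
                update(ps, pa, pr, ps2, pdone)
            if done:
                break
            s = s2
    return q


def chain_step(s, a):
    """Hypothetical 5-state corridor: action 1 moves right, action 0 moves left.

    Reaching state 4 ends the episode with reward 1; all other steps give 0.
    """
    s2 = min(4, s + 1) if a == 1 else max(0, s - 1)
    done = s2 == 4
    return (1.0 if done else 0.0), s2, done


q = dyna_q(chain_step, start=0, actions=[0, 1])
# Greedy action in each non-terminal state after training.
policy = [max([0, 1], key=lambda a: q[(s, a)]) for s in range(4)]
```

In this sketch the model is a plain dictionary with one entry per visited state-action pair, which is the kind of tabular model the abstract describes as sample-hungry. The article's contribution, as stated in the abstract, is to replace that table with an adaptively learned tree structure; that part is not reproduced here.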