Development of a reinforcement learning system to play Othello
Authors: Kazuteru Miyazaki (teru@niad.ac.jp), Sougo Tsuboi, Shigenobu Kobayashi
Affiliation: (1) National Institution for Academic Degrees and University Evaluation, 1-29-1 Gakuen-nishimachi, Kodaira, 187-8587 Tokyo, Japan; (2) Toshiba, Kawasaki, Kanagawa, Japan; (3) Tokyo Institute of Technology, Yokohama, Kanagawa, Japan
Abstract: In general, the purpose of a reinforcement learning system is to learn an optimal policy. In two-player games such as Othello, however, it is important to acquire a penalty-avoiding policy, that is, one that avoids losing the game. The penalty avoiding rational policy making algorithm (PARP) is a known method for learning such a policy. When PARP is applied to large-scale problems, however, the number of states explodes. In this article, we focus on Othello, a game with a huge state space. We introduce several ideas and heuristics to adapt PARP to Othello, and we show that our learning player beats the well-known Othello program KITTY. This work was presented, in part, at the 7th International Symposium on Artificial Life and Robotics, Oita, Japan, January 16–18, 2002.
Keywords: Reinforcement learning; Reward and penalty; Othello; KITTY
This article is indexed in SpringerLink and other databases.