Co-Evolution in the Successful Learning of Backgammon Strategy
Authors: Jordan B. Pollack, Alan D. Blair
Affiliation: (1) Computer Science Department, Volen Center for Complex Systems, Brandeis University, Waltham, MA 02254
Abstract: Following Tesauro's work on TD-Gammon, we used a 4,000 parameter feedforward neural network to develop a competitive backgammon evaluation function. Play proceeds by a roll of the dice, application of the network to all legal moves, and selection of the position with the highest evaluation. However, no backpropagation, reinforcement or temporal difference learning methods were employed. Instead we apply simple hillclimbing in a relative fitness environment. We start with an initial champion of all zero weights and proceed simply by playing the current champion network against a slightly mutated challenger and changing weights if the challenger wins. Surprisingly, this worked rather well. We investigate how the peculiar dynamics of this domain enabled a previously discarded weak method to succeed, by preventing suboptimal equilibria in a "meta-game" of self-learning.
Keywords: coevolution, backgammon, reinforcement, temporal difference learning, self-learning
This article is indexed in SpringerLink and other databases.
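
The champion-versus-challenger hillclimbing loop described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the backgammon game and the neural network are replaced by a hypothetical noisy toy contest (`challenger_wins`, scored against an invented `TARGET` vector), the weight vector is far smaller than 4,000 parameters, and "changing weights if the challenger wins" is read here as simply replacing the champion with the challenger. The names `N_WEIGHTS`, `TARGET`, `strength`, `mutate`, and `hillclimb` are all made up for the example.

```python
import random

N_WEIGHTS = 20               # toy size; the abstract's network has about 4,000 parameters
TARGET = [0.5] * N_WEIGHTS   # hypothetical "ideal" weights, used only by the toy game


def strength(weights):
    """Toy playing strength: negative squared distance to TARGET (assumption)."""
    return -sum((w - t) ** 2 for w, t in zip(weights, TARGET))


def challenger_wins(champion, challenger, rng):
    """Stand-in for one dice-driven game: a noisy comparison of strengths."""
    noise = rng.gauss(0.0, 0.05)
    return strength(challenger) + noise > strength(champion)


def mutate(weights, rng, sigma=0.05):
    """Return a slightly perturbed copy of the weights (Gaussian noise)."""
    return [w + rng.gauss(0.0, sigma) for w in weights]


def hillclimb(generations, rng):
    """Keep a champion; replace it whenever a mutated challenger wins."""
    champion = [0.0] * N_WEIGHTS  # initial champion of all-zero weights
    for _ in range(generations):
        challenger = mutate(champion, rng)
        if challenger_wins(champion, challenger, rng):
            champion = challenger  # assumed update rule: adopt the challenger
    return champion


if __name__ == "__main__":
    rng = random.Random(0)
    final = hillclimb(2000, rng)
    print("initial strength:", strength([0.0] * N_WEIGHTS))
    print("final strength:  ", strength(final))
```

Fitness here is relative in the sense the abstract uses: a challenger is never scored on an absolute scale, only against the current champion in a noisy contest.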