Learning to Play Chess Using Temporal Differences
Authors: Jonathan Baxter, Andrew Tridgell, Lex Weaver
Affiliation: (1) Department of Systems Engineering, Australian National University, 0200, Australia; (2) Department of Computer Science, Australian National University, 0200, Australia; (3) Department of Computer Science, Australian National University, 0200, Australia
Abstract: In this paper we present TDLEAF(λ), a variation on the TD(λ) algorithm that enables it to be used in conjunction with game-tree search. We present some experiments in which our chess program “KnightCap” used TDLEAF(λ) to learn its evaluation function while playing on Internet chess servers. The main success we report is that KnightCap improved from a 1650 rating to a 2150 rating in just 308 games and 3 days of play. As a reference, a rating of 1650 corresponds to about level B human play (on a scale from E (1000) to A (1800)), while 2150 is human master level. We discuss some of the reasons for this success, principal among them being the use of on-line play rather than self-play. We also investigate whether TDLEAF(λ) can yield better results in the domain of backgammon, where TD(λ) has previously yielded striking success.
Keywords: temporal difference learning, neural network, TDLEAF, chess, backgammon
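
The abstract describes TDLEAF as TD(λ) adapted to game-tree search. A minimal sketch of that idea follows, under assumptions that are not stated in this record: the evaluation function is taken to be linear in hand-chosen features, the temporal differences are computed between the evaluations of the principal-variation leaves found by search from successive positions, and the game outcome stands in for the value after the last position. The function name `tdleaf_update` and the parameters `alpha` and `lam` are illustrative, not taken from the paper.

```python
def tdleaf_update(weights, leaf_features, outcome, alpha=0.1, lam=0.7):
    """Return updated weights after one game (illustrative sketch only).

    weights       -- current weights of an assumed linear evaluation function
    leaf_features -- leaf_features[t] is the feature vector of the
                     principal-variation leaf reached by search from the
                     t-th position of the game (assumed to be given)
    outcome       -- final game result, used as the terminal value
    alpha, lam    -- learning rate and the lambda of TD(lambda)
    """
    def evaluate(phi):
        # Linear evaluation: dot product of weights and features.
        return sum(w * f for w, f in zip(weights, phi))

    # Leaf evaluations for each position, followed by the true outcome.
    values = [evaluate(phi) for phi in leaf_features] + [outcome]

    # Temporal differences between successive leaf evaluations.
    diffs = [values[t + 1] - values[t] for t in range(len(leaf_features))]

    new_weights = list(weights)
    discounted = 0.0
    # Walk backwards so that discounted = sum_j lam**(j - t) * diffs[j].
    for t in reversed(range(len(leaf_features))):
        discounted = diffs[t] + lam * discounted
        # For a linear evaluator the gradient w.r.t. the weights is phi.
        for i, f in enumerate(leaf_features[t]):
            new_weights[i] += alpha * f * discounted
    return new_weights
```

With `lam=0` each position is credited only with its own one-step difference; values closer to 1 propagate later differences, including the final outcome, back to earlier positions.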
This article is indexed in SpringerLink and other databases.