Similar Literature
20 similar documents found (search time: 15 ms)
1.
This paper considers the problem of designing novel techniques for multi-player game playing, in a range of board games and configurations. Compared to the well-known case of two-player game playing, multi-player game playing is a more complex problem with unique requirements. To address the unique challenges of this domain, we examine the potential of employing techniques inspired by Adaptive Data Structures (ADSs) to rank opponents based on their relative threats, and using this information to achieve gains in move ordering and tree pruning. We name our new technique the Threat-ADS heuristic. We examine the Threat-ADS’ performance within a range of game models, employing a number of different, well-understood update mechanisms for ADSs. We then extend our analysis to specifically consider intermediate board states, which are more interesting than the initial board state, as we do not assume the availability of “Opening book” moves, and where substantial variation can exist, in terms of available moves and threatening opponents. We expand this analysis to include an exploration of the Threat-ADS heuristic’s performance in deeper ply game trees, to confirm that it maintains its benefits even when lookahead is greater, and with an expanded examination of how the number of players present in the game impacts the performance of the Threat-ADS heuristic. We find that in nearly all environments, the Threat-ADS heuristic is able to produce meaningful, statistically significant improvements in tree pruning, demonstrating that it serves as a very reliable move ordering heuristic for multi-player game playing under a wide range of configurations, thus motivating the use of ADS-based techniques within the field of game playing.
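As a rough illustration of the idea in this abstract, the sketch below (not the authors' implementation) ranks opponents with a move-to-front adaptive list and uses that ranking to order moves before expansion; the opponent identifiers and the `threatened_by(move)` helper are hypothetical assumptions.

```python
# Sketch of a Threat-ADS-style move ordering (not the authors' implementation):
# an adaptive list of opponents is updated as threats are observed, and its
# ranking is used to order moves before expansion.

class MoveToFrontList:
    """Adaptive data structure: the most recently threatening opponent is
    moved to the front, so list position approximates relative threat."""
    def __init__(self, opponents):
        self.items = list(opponents)

    def access(self, opponent):
        # Move-to-front update; other ADS update mechanisms (transposition,
        # frequency count) could be substituted here without further changes.
        if opponent in self.items:
            self.items.remove(opponent)
            self.items.insert(0, opponent)

    def rank(self, opponent):
        return self.items.index(opponent) if opponent in self.items else len(self.items)


def order_moves(moves, ads, threatened_by):
    """Search moves aimed at the most threatening opponents first, which tends
    to produce earlier cutoffs in multi-player tree search."""
    return sorted(moves, key=lambda m: ads.rank(threatened_by(m)))
```

During the search, `ads.access(p)` would be called whenever opponent `p` is observed making a threatening move, so the ranking adapts as the game progresses.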

2.
Abstract: Two methods of genetic evolution of linear and non-linear heuristic evaluation functions for the game of checkers and give-away checkers are presented in the paper. The first method is based on the simplistic assumption that a relation 'close' to partial order can be defined over the set of evaluation functions. Hence an explicit fitness function is not necessary in this case and direct comparison between heuristics (a tournament) can be used instead. In the other approach a heuristic is developed step-by-step based on the set of training games. First, the end-game positions are considered and then the method gradually moves 'backwards' in the game tree up to the starting position and at each step the best fitted specimen from the previous step (previous game tree depth) is used as the heuristic evaluation function in the alpha-beta search for the current step. Experimental results confirm that both approaches lead to quite strong heuristics and give hope that a more sophisticated and more problem-oriented evolutionary process might ultimately provide heuristics of quality comparable to those of commercial programs.

3.
Taking the alpha-beta pruning algorithm as its object of study, this paper proposes a probabilistic pruning algorithm that combines alpha-beta pruning with probabilistic pruning factors to address the game-tree search problem. The probabilistic pruning algorithm reduces the game-tree search depth and thereby speeds up the search process.
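A minimal sketch of one plausible reading of the abstract above: standard alpha-beta with an additional probabilistic cutoff that skips children judged unlikely to improve the current bound. The `prune_prob` model, the state interface (`children()`, `evaluate()`, `is_terminal()`), and `static_estimate()` are illustrative assumptions, not the paper's formulation.

```python
import math

# Alpha-beta with an extra probabilistic cutoff (illustrative reading only).
# static_estimate() is assumed to give a quick score of a child from the
# perspective of the player to move at 'state' (the parent).

def prune_prob(estimate, alpha, spread=50.0):
    """Rough probability that a child's true value exceeds alpha, assuming the
    static estimate is noisy on a scale of about 'spread' evaluation points."""
    x = (alpha - estimate) / spread
    if x > 50:          # avoid math.exp overflow for hopeless children
        return 0.0
    if x < -50:
        return 1.0
    return 1.0 / (1.0 + math.exp(x))


def prob_alpha_beta(state, depth, alpha, beta, threshold=0.05):
    if depth == 0 or state.is_terminal():
        return state.evaluate()          # static value, side-to-move perspective
    for child in state.children():
        if alpha >= beta:
            break                        # ordinary alpha-beta cutoff
        # Probabilistic cutoff: skip children very unlikely to raise alpha.
        if prune_prob(child.static_estimate(), alpha) < threshold:
            continue
        score = -prob_alpha_beta(child, depth - 1, -beta, -alpha, threshold)
        alpha = max(alpha, score)
    return alpha
```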

4.
Taking the alpha-beta pruning algorithm as its object of study, this paper proposes a probabilistic pruning algorithm that combines alpha-beta pruning with probabilistic pruning factors to address the game-tree search problem. The probabilistic pruning algorithm reduces the game-tree search depth and thereby speeds up the search process.

5.
Temporal-difference learning is one of the most successful and broadly applied solutions to the reinforcement learning problem; it has been used to achieve master-level play in chess, checkers and backgammon. The key idea is to update a value function from episodes of real experience, by bootstrapping from future value estimates, and using value function approximation to generalise between related states. Monte-Carlo tree search is a recent algorithm for high-performance search, which has been used to achieve master-level play in Go. The key idea is to use the mean outcome of simulated episodes of experience to evaluate each state in a search tree. We introduce a new approach to high-performance search in Markov decision processes and two-player games. Our method, temporal-difference search, combines temporal-difference learning with simulation-based search. Like Monte-Carlo tree search, the value function is updated from simulated experience; but like temporal-difference learning, it uses value function approximation and bootstrapping to efficiently generalise between related states. We apply temporal-difference search to the game of 9×9 Go, using a million binary features matching simple patterns of stones. Without any explicit search tree, our approach outperformed an unenhanced Monte-Carlo tree search with the same number of simulations. When combined with a simple alpha-beta search, our program also outperformed all traditional (pre-Monte-Carlo) search and machine learning programs on the 9×9 Computer Go Server.
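To make the bootstrapping step concrete, here is a minimal TD(0) sketch with a linear value function over binary features, in the spirit of the abstract rather than a reproduction of the authors' system; `weights` is assumed to be a `defaultdict(float)`, and `features(state)` (returning the indices of active binary features) and the state interface are hypothetical helpers.

```python
from collections import defaultdict

# TD(0) with a linear value function over binary features (illustrative sketch).

def td_update(weights, feats_s, feats_next, reward, alpha=0.1, gamma=1.0):
    """Move V(s) toward reward + gamma * V(s'), bootstrapping from the
    successor's estimate; shared features generalise across related states."""
    v_s = sum(weights[i] for i in feats_s)
    v_next = sum(weights[i] for i in feats_next)
    delta = reward + gamma * v_next - v_s
    for i in feats_s:                 # gradient of a linear function = its features
        weights[i] += alpha * delta


def td_search_episode(root, weights, features, max_steps=200):
    """One simulated episode from 'root' with greedy one-step lookahead
    (a single-agent simplification), applying TD updates along the way."""
    state = root
    for _ in range(max_steps):
        if state.is_terminal():
            break
        nxt = max(state.children(),
                  key=lambda c: sum(weights[i] for i in features(c)))
        reward = nxt.terminal_reward() if nxt.is_terminal() else 0.0
        td_update(weights, features(state), features(nxt), reward)
        state = nxt

# Typical setup: weights = defaultdict(float); run many episodes of
# td_search_episode and use the learned value function to evaluate moves.
```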

6.
In this paper, we consider the problem of finding good next moves in two-player games. Traditional search algorithms, such as minimax and alpha-beta pruning, suffer great temporal and spatial expansion when exploring deeply into search trees to find better next moves. The evolution of genetic algorithms with the ability to find global or near global optima in limited time seems promising, but they are inept at finding compound optima, such as the minimax in a game-search tree. We thus propose a new genetic algorithm-based approach that can find a good next move by reserving the board evaluation values of new offspring in a partial game-search tree. Experiments show that solution accuracy and search speed are greatly improved by our algorithm.

7.
Minimax search algorithms with and without aspiration windows
Several algorithms for computing exact minimax values of game trees (utilizing backward pruning) are investigated. The focus is on trees with an ordering similar to that actually found in game-playing practice. The authors compare the algorithms using two different distributions of the static values, the uniform distribution and a distribution estimated from practical data. A systematic comparison of using aspiration windows for all of the usual minimax algorithms is presented. The effects of aspiration windows of varying size and position are analyzed. Increasing the ordering of moves to near the optimum results in unexpectedly high savings. Algorithms with linear space complexity benefit most. Although the ordering of the first move is of predominant importance, that of the remainder has only second-order effects. The use of an aspiration window not only makes alpha-beta search competitive; its effects also depend on certain properties of the trees.
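The aspiration-window mechanism discussed above can be sketched as a simple wrapper around a plain alpha-beta routine (assumed available as `alpha_beta(state, depth, alpha, beta)`); the window size and the source of the initial guess, typically the value from the previous iterative-deepening iteration, are illustrative choices.

```python
# Aspiration-window wrapper around a plain alpha-beta routine. alpha_beta() is
# assumed to return a value that is exact only when it lies inside the window.

INF = 10**9

def aspiration_search(state, depth, guess, window=50):
    """Search a narrow window around 'guess' and widen it on fail-low/fail-high."""
    alpha, beta = guess - window, guess + window
    while True:
        value = alpha_beta(state, depth, alpha, beta)
        if value <= alpha:        # fail low: true value is at or below the window
            alpha = -INF          # re-search with the lower bound removed
        elif value >= beta:       # fail high: true value is at or above the window
            beta = INF            # re-search with the upper bound removed
        else:
            return value          # landed inside the window: exact value
```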

8.
In this paper we investigate the "mandatory-work-first" approach to parallel alpha-beta search first proposed by Akl, Barnard, and Doran. This approach is based on a version of alpha-beta search without deep cutoffs and a two-stage evaluation process, the second stage of which is often pruned. Our analysis shows that for best-first ordering on the lookahead tree, this approach provides greater speedup than the Palphabeta tree-splitting technique, and that for worst-first ordering, mandatory work first provides only slightly worse speedup than Palphabeta.

9.
季辉  丁泽军 《计算机科学》2018,45(1):140-143
Monte-Carlo Tree Search (MCTS) is a heuristic search algorithm for decision-based board games that evaluates game strategies by means of Monte-Carlo simulation. However, when faced with a decision process as complex as computer Go, the plain MCTS procedure often converges very slowly because of its large computational cost. Since MCTS in a two-player game does not converge to the game's optimal decision strategy, an improved algorithm that combines MCTS with the minimax algorithm is proposed, so that the search result is not distorted by the randomness of the Monte-Carlo method. To further improve the computational efficiency of the search in complex two-player games, several common pruning strategies are also incorporated. Experimental results show that the proposed algorithm significantly improves the accuracy and efficiency of Monte-Carlo tree search.
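One common way to combine MCTS with minimax, offered here only as an illustrative reading of the abstract, is to keep UCT selection but evaluate leaves with a shallow alpha-beta search instead of a random playout. `alpha_beta(state, depth, alpha, beta)` and the state interface (`children()`, `is_terminal()`) are assumed helpers.

```python
import math, random

# MCTS (UCT) with shallow alpha-beta leaf evaluation (illustrative sketch only).

class Node:
    def __init__(self, state, parent=None):
        self.state, self.parent = state, parent
        self.children, self.visits, self.value = [], 0, 0.0

def uct_child(node, c=1.4):
    return max(node.children,
               key=lambda ch: ch.value / (ch.visits + 1e-9)
               + c * math.sqrt(math.log(node.visits + 1) / (ch.visits + 1e-9)))

def mcts_minimax(root_state, iterations=1000, leaf_depth=2):
    root = Node(root_state)
    for _ in range(iterations):
        node = root
        while node.children:                     # selection
            node = uct_child(node)
        for s in node.state.children():          # expansion
            node.children.append(Node(s, parent=node))
        leaf = random.choice(node.children) if node.children else node
        # Evaluation: shallow alpha-beta instead of a random rollout; negate so
        # the stored value reflects the player who moved into 'leaf'.
        reward = -alpha_beta(leaf.state, leaf_depth, -float("inf"), float("inf"))
        while leaf is not None:                  # backpropagation with sign flips
            leaf.visits += 1
            leaf.value += reward
            reward = -reward
            leaf = leaf.parent
    return max(root.children, key=lambda ch: ch.visits).state
```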

10.
Game-tree search is crucial to computer game playing. A good search algorithm finds the best line while visiting fewer nodes, and thereby raises the computer's playing strength. Taking Chinese-chess (Xiangqi) computer game playing as its setting, this paper builds on the basic alpha-beta search algorithm, describes in detail the principle of the transposition-table heuristic and the problem of hash collisions, and introduces the concept of a two-level transposition table together with its replacement policy, which improves the search efficiency of the engine. Experimental results demonstrate the effectiveness of the algorithm.
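A minimal sketch of the two-level transposition table the abstract refers to: one depth-preferred slot and one always-replace slot per index. The entry layout (key, depth, value, flag, best move) is illustrative.

```python
# Two-level (depth-preferred + always-replace) transposition table sketch.

class TwoLevelTT:
    def __init__(self, size=1 << 20):
        self.size = size
        self.deep = [None] * size      # slot 1: keep the deepest search result
        self.always = [None] * size    # slot 2: always overwrite with the newest

    def store(self, key, depth, value, flag, best_move):
        idx = key % self.size
        entry = (key, depth, value, flag, best_move)
        slot = self.deep[idx]
        if slot is None or depth >= slot[1]:
            self.deep[idx] = entry     # depth-preferred replacement
        else:
            self.always[idx] = entry   # fallback: always-replace level

    def probe(self, key):
        idx = key % self.size
        for entry in (self.deep[idx], self.always[idx]):
            if entry is not None and entry[0] == key:
                return entry           # (key, depth, value, flag, best_move)
        return None
```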

11.
Stochastic Frontier Regression Analysis was used to investigate strategies and skills that are associated with the minimization of time required to achieve proficiency in video games among students in grades four and five. Students self-reported their video game play habits, including strategies and skills used to become good at the video games they play. Results indicated an association between game play time spent during vacation weeks and proficiency at the game, but no such association existed with game play time during typical weeks when school is in session. Several strategies and skills were associated with the minimization of time spent to achieve proficiency at the game, while a few strategies and skills held a negative association with efficient learning in games. Some of the findings paralleled those of prior research on formal education. Gender differences, as well as implications for games and learning are discussed.

12.
Supervised training of deep-learning models depends on large amounts of high-quality labelled data, but many niche games in computer-game competitions lack human game records to serve as training samples, so how to generate a reasonably labelled dataset of positions before applying a deep-learning model is a question worth studying. For the game of Dots-and-Boxes, a hash-based data de-duplication and position-labelling method is proposed. According to the characteristics of position data at different turn counts, samples of Dots-and-Boxes positions at different turn counts are collected and labelled by means of exhaustive alpha-beta search, backward labelling, parallelized MCTS labelling, and symmetry-based expansion. The experiments produced a dataset of 15,000,000 labelled Dots-and-Boxes position samples, providing a foundation for supervised training of deep-learning models for the game. In addition, the proposed method offers a valuable reference for obtaining training data for other games.
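The hash-based de-duplication step can be sketched with Zobrist-style hashing of occupied edges; the edge encoding and board size below are illustrative assumptions, not the paper's exact representation.

```python
import random

# Zobrist-style hashing for de-duplicating board positions (illustrative).

random.seed(0)
NUM_EDGES = 40                    # edges on a 4x4-box Dots-and-Boxes board (illustrative)
ZOBRIST = [random.getrandbits(64) for _ in range(NUM_EDGES)]

def position_key(occupied_edges):
    """XOR together the random codes of all occupied edges into a 64-bit key."""
    key = 0
    for e in occupied_edges:
        key ^= ZOBRIST[e]
    return key

def dedup(positions):
    """Keep one sample per distinct position key (hash-based de-duplication)."""
    seen, unique = set(), []
    for pos in positions:         # pos: iterable of occupied edge ids
        k = position_key(pos)
        if k not in seen:
            seen.add(k)
            unique.append(pos)
    return unique
```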

13.
The purpose of this research is to investigate the extent to which knowledge can replace and support search in selecting a chess move and to delineate the issues involved. This has been carried out by constructing a program, PARADISE (PAttern Recognition Applied to DIrecting SEarch), which finds the best move in tactically sharp middle game positions from the games of chess masters. It encodes a large body of knowledge in the form of production rules. The actions of the rules post concepts in the data base while the conditions match patterns in the chess position and data base. The program uses the knowledge base to discover plans during static analysis and to guide a small tree search which confirms that a particular plan is best. The search is “small” in the sense that the size of the search tree is of the same order of magnitude as a human master's search tree (tens and hundreds of nodes, not thousands to hundreds of thousands as in many computer chess programs). Once a plan is formulated, it guides the tree search for several ply and expensive static analyses (needed to analyze a new position) are done infrequently. PARADISE avoids placing a depth limit on the search (or any other artificial effort limit). By using a global view of the search tree, information gathered during the search, and the analysis provided by the knowledge base, the program produces enough terminations to force convergence of the search. PARADISE has found combinations as deep as 19 ply.

14.
Search procedures, such as alpha-beta and SSS*, are used to solve minimax game trees. With the notable exception of B*, most of these procedures assume the static model, i.e., the computation is done solely on the basis of static values given to terminal nodes. The first goal of this paper is to generalize these to the informed model, which permits the usage of heuristic information pertaining to nonterminal nodes, such as upper and lower bounds, and estimates, of the exact values realizable from the corresponding game positions. We provide a general framework, within which various conventional procedures including alpha-beta and SSS* can be naturally generalized to the informed model. For the static model, it is known that SSS* surpasses alpha-beta in the sense that it explores only a subset of the nodes which are explored by alpha-beta. The second goal of this paper is, assuming the informed model, to develop a precise characterization of the class of search procedures that surpass alpha-beta. It turns out that the class contains many search procedures other than SSS* (even for the static model). Finally some computational comparison among these search procedures is made by solving the 4 × 4 Othello game.
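As a small sketch of how bounds on nonterminal nodes can be exploited inside alpha-beta, in the spirit of the "informed model" (this is not the paper's general framework): `lower_bound()` and `upper_bound()` are hypothetical oracles (e.g. from earlier shallow searches), and all values are assumed to be expressed from the perspective of the player to move.

```python
# Alpha-beta (negamax form) extended with heuristic bounds on nonterminal nodes.

def informed_alpha_beta(state, depth, alpha, beta):
    lo, hi = state.lower_bound(), state.upper_bound()   # bounds on the true value
    if lo >= beta:
        return lo              # already good enough for the side to move: cutoff
    if hi <= alpha:
        return hi              # already refuted: cutoff without any search
    if depth == 0 or state.is_terminal():
        return state.evaluate()      # static value, side-to-move perspective
    value = -float("inf")
    for child in state.children():
        value = max(value, -informed_alpha_beta(child, depth - 1, -beta, -alpha))
        alpha = max(alpha, value)
        if alpha >= beta:
            break
    return value
```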

15.
We model security protocols as games using concepts of game semantics. Using this model we ascribe semantics to protocols written in the standard simple arrow notation. According to the semantics, a protocol is interpreted as a set of strategies over a game tree that represents the type of the protocol. The model uses abstract computation functions and message frames in order to model internal computations and knowledge of agents and the intruder. Moreover, in order to specify properties of the model, a logic that deals with games and strategies is developed. A tableau-based proof system is given for the logic, which can serve as a basis for a model checking algorithm. This approach allows us to model a wide range of security protocol types and verify different properties instead of using a variety of methods as is currently the practice. Furthermore, the analyzed protocols are specified using only the simple arrow notation heavily used by protocol designers and by practitioners.

16.
The notion of optimality naturally arises in many areas of applied mathematics and computer science concerned with decision making. Here we consider this notion in the context of three formalisms used for different purposes in reasoning about multi-agent systems: strategic games, CP-nets, and soft constraints. To relate the notions of optimality in these formalisms we introduce a natural qualitative modification of the notion of a strategic game. We show then that the optimal outcomes of a CP-net are exactly the Nash equilibria of such games. This allows us to use the techniques of game theory to search for optimal outcomes of CP-nets and vice-versa, to use techniques developed for CP-nets to search for Nash equilibria of the considered games. Then, we relate the notion of optimality used in the area of soft constraints to that used in a generalization of strategic games, called graphical games. In particular we prove that for a natural class of soft constraints that includes weighted constraints every optimal solution is both a Nash equilibrium and a Pareto efficient joint strategy. For a natural mapping in the other direction we show that Pareto efficient joint strategies coincide with the optimal solutions of soft constraints.

17.
Game playing is an important application area of heuristic search. The course of a game can be represented as a game-search tree, and the solution is obtained by searching this tree; the search strategy commonly uses the α-β pruning technique. Building on an in-depth study of α-β pruning, this paper proposes that when expanding a node that has not yet reached the prescribed depth, the generated child nodes be inserted into the search tree in order of their evaluation-function values, so that more branches are cut off during α-β pruning and the search efficiency is improved.
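The proposed ordering can be sketched as sorting children by a static evaluation before they are searched. The `children()`/`is_terminal()` interface and `evaluate()` are hypothetical; `evaluate()` is assumed here to score a position from the maximizing player's point of view.

```python
# Alpha-beta with children ordered by static evaluation before expansion,
# so cutoffs tend to occur earlier (illustrative sketch).

def ordered_minimax(state, depth, alpha, beta, maximizing):
    if depth == 0 or state.is_terminal():
        return state.evaluate()
    # Insert children into the search order by their static values:
    # best-looking moves first for the side to move.
    children = sorted(state.children(), key=lambda c: c.evaluate(),
                      reverse=maximizing)
    if maximizing:
        value = -float("inf")
        for child in children:
            value = max(value, ordered_minimax(child, depth - 1, alpha, beta, False))
            alpha = max(alpha, value)
            if alpha >= beta:
                break          # beta cutoff arrives earlier thanks to the ordering
        return value
    else:
        value = float("inf")
        for child in children:
            value = min(value, ordered_minimax(child, depth - 1, alpha, beta, True))
            beta = min(beta, value)
            if alpha >= beta:
                break          # alpha cutoff
        return value
```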

18.
Computer Go is one of the important branches of machine game playing, and its enormous game space poses a great challenge to researchers. Current computer-Go programs mostly use static-evaluation search and Monte-Carlo tree search; this paper therefore introduces the temporal-difference algorithm into a 9×9 computer-Go playing system and proposes a Go game-playing system model based on temporal-difference learning. The system has a degree of self-learning ability and can gradually improve its playing strength through continual games. Actual games played against a system based on α-β search demonstrate the feasibility of the method.

19.
Kayles is a nonpartizan two-person game. In such games the moves available to both the players are the same and either the player on move wins or the previous player wins. In the “normal form” of the game, the last player who can make a move is declared the winner. In the “misere form” of the game, the player who makes the last move is declared the loser. The complete analysis of the normal form of the game of Kayles has been described by Berlekamp, Conway, and Guy. This paper completes their analysis of the misere form of Kayles. This is done on the basis of certain properties of what have been called “wild” positions of misere games.
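For reference, the normal-play analysis mentioned above rests on Sprague-Grundy values; the sketch below computes nim-values for Kayles rows (the misère analysis in the paper is substantially harder and is not reproduced here).

```python
from functools import lru_cache

# Normal-play Sprague-Grundy (nim) values for Kayles rows.

def mex(values):
    """Minimum excludant: the smallest non-negative integer not in the set."""
    m = 0
    while m in values:
        m += 1
    return m

@lru_cache(maxsize=None)
def kayles_grundy(n):
    """Grundy value of a row of n pins (remove one pin, or two adjacent pins)."""
    if n == 0:
        return 0
    options = set()
    for i in range(n):          # knock down the single pin at position i
        options.add(kayles_grundy(i) ^ kayles_grundy(n - i - 1))
    for i in range(n - 1):      # knock down the two adjacent pins at i, i+1
        options.add(kayles_grundy(i) ^ kayles_grundy(n - i - 2))
    return mex(options)

# In normal play, a position made of disjoint rows is a win for the previous
# player exactly when the XOR of the rows' Grundy values is zero.
```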

20.
《Artificial Intelligence》1987,31(2):185-199
Of the many minimax algorithms, SSS* is noteworthy because it usually searches the smallest game trees. Its success can be attributed to the accumulation and use of information acquired while traversing the tree. The main disadvantages of SSS* are its high storage needs and management costs. This paper describes a class of methods, based on the popular alpha-beta algorithm, that acquire and use information to guide a tree search. They retain a given search direction and yet are as good as SSS*, even while searching random trees. Further, although some of these new algorithms also require substantial storage, they are more flexible and can be programmed to use only the space available, at the cost of some degradation in performance.
