Similar Documents
20 similar documents found (search time: 31 ms)
1.
《Artificial Intelligence》2002,134(1-2):181-199
TD-Gammon is a neural network that is able to teach itself to play backgammon solely by playing against itself and learning from the results. Starting from random initial play, TD-Gammon's self-teaching methodology results in a surprisingly strong program: without lookahead, its positional judgement rivals that of human experts, and when combined with shallow lookahead, it reaches a level of play that surpasses even the best human players. The success of TD-Gammon has also been replicated by several other programmers; at least two other neural net programs also appear to be capable of superhuman play. Previous papers on TD-Gammon have focused on developing a scientific understanding of its reinforcement learning methodology. This paper views machine learning as a tool in a programmer's toolkit, and considers how it can be combined with other programming techniques to achieve and surpass world-class backgammon play. Particular emphasis is placed on programming shallow-depth search algorithms, and on TD-Gammon's doubling algorithm, which is described in print here for the first time.
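The self-teaching at the heart of TD-Gammon can be illustrated with a minimal temporal-difference step. This is a generic TD(0) sketch on a linear value function, not TD-Gammon's actual network, features, or TD(λ) traces; the names `td_update`, `alpha`, and `gamma` are illustrative.

```python
import numpy as np

def td_update(w, phi, phi_next, reward, alpha=0.1, gamma=1.0):
    """One TD(0) step: move the value estimate V(s) = w . phi(s)
    toward the bootstrapped target r + gamma * V(s')."""
    delta = reward + gamma * (w @ phi_next) - (w @ phi)  # TD error
    return w + alpha * delta * phi

# Toy example: three one-hot state features, reward 1 on the transition.
w = np.zeros(3)
s, s_next = np.array([1.0, 0.0, 0.0]), np.array([0.0, 1.0, 0.0])
w = td_update(w, s, s_next, reward=1.0)
print(w)  # only the weight of the visited state moves: [0.1, 0.0, 0.0]
```

In self-play the same update is applied after every move, with the game's final result supplying the only external reward signal.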

2.
An experiment was conducted where neural networks compete for survival in an evolving population based on their ability to play checkers. More specifically, multilayer feedforward neural networks were used to evaluate alternative board positions and games were played using a minimax search strategy. At each generation, the extant neural networks were paired in competitions and selection was used to eliminate those that performed poorly relative to other networks. Offspring neural networks were created from the survivors using random variation of all weights and bias terms. After a series of 250 generations, the best-evolved neural network was played against human opponents in a series of 90 games on an Internet website. The neural network was able to defeat two expert-level players and played to a draw against a master. The final rating of the neural network placed it in the "Class A" category using a standard rating system. Of particular importance in the design of the experiment was the fact that no features beyond the piece differential were given to the neural networks as a priori knowledge. The process of evolution was able to extract all of the additional information required to play at this level of competency. It accomplished this based almost solely on the feedback offered in the final aggregated outcome of each game played (i.e., win, lose, or draw). This procedure stands in marked contrast to the typical artifice of explicitly injecting expert knowledge into a game-playing program.
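The generational loop described above can be sketched in a few lines. This is a generic elitist evolutionary scheme with Gaussian mutation of all weights; the population size, mutation scale, and the closed-form fitness function here are illustrative stand-ins (in the paper, fitness came from wins and losses in paired games, not a formula).

```python
import random

def mutate(weights, sigma=0.05):
    """Offspring = parent with Gaussian noise added to every weight."""
    return [w + random.gauss(0.0, sigma) for w in weights]

def evolve(population, fitness, generations=100, survivors=5):
    """Each generation: rank by fitness, keep the best, refill with mutants."""
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[:survivors]
        population = parents + [mutate(p) for p in parents]
    return max(population, key=fitness)

# Toy fitness standing in for "games won": prefer weights near zero.
random.seed(0)
fitness = lambda w: -sum(x * x for x in w)
start = [[random.uniform(-1, 1) for _ in range(4)] for _ in range(10)]
best = evolve(list(start), fitness)
```

Because parents survive each generation, the best individual is never lost, so fitness improves monotonically over generations.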

3.
Here, we propose an evolutionary algorithm (i.e., evolutionary programming) for tuning the weights of a chess engine. Most of the previous work in this area has normally adopted co-evolution (i.e., tournaments among virtual players) to decide which players will pass to the following generation, depending on the outcome of each game. In contrast, our proposed method uses evolution to decide which virtual players will pass to the next generation based on the number of positions solved from a number of chess grandmaster games. Using a search depth of 1-ply, our method can solve 40.78% of the positions evaluated from chess grandmaster games (this value is higher than the one reported in the previous related work). Additionally, our method is capable of solving 53.08% of the positions using a historical mechanism that keeps a record of the “good” virtual players found during the evolutionary process. Our proposal has also been able to increase the competition level of our search engine, when playing against the program Chessmaster (grandmaster edition). Our chess engine reached a rating of 2404 points for the best virtual player with supervised learning, and a rating of 2442 points for the best virtual player with unsupervised learning. Finally, it is also worth mentioning that our results indicate that the piece material values obtained by our approach are similar to the values known from chess theory.
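The fitness signal used above — counting grandmaster positions "solved" at 1-ply — can be sketched as follows. The data structures are illustrative, not the authors' representation: each test case pairs the legal moves (mapped to resulting positions) with the move the grandmaster actually played.

```python
def solved_fraction(test_positions, evaluate):
    """Fraction of positions where the 1-ply best move under `evaluate`
    matches the grandmaster's move.

    test_positions: list of (candidate_moves, gm_move) pairs, where
    candidate_moves maps each legal move to the resulting position."""
    solved = 0
    for candidate_moves, gm_move in test_positions:
        best = max(candidate_moves, key=lambda m: evaluate(candidate_moves[m]))
        solved += (best == gm_move)
    return solved / len(test_positions)

# Toy positions where the "resulting position" is just a stored score.
positions = [({"Nf3": 0.2, "e4": 0.9}, "e4"),
             ({"Qxb7": 1.5, "O-O": 0.3}, "Qxb7"),
             ({"h4": 0.1, "d4": 0.4}, "h4")]
print(solved_fraction(positions, evaluate=lambda s: s))  # 2/3 solved
```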

4.
A new competitive approach is developed for learning agents to play two-agent games. This approach uses particle swarm optimizers (PSO) to train neural networks to predict the desirability of states in the leaf nodes of a game tree. The new approach is applied to the TicTacToe game, and compared with the performance of an evolutionary approach. A performance criterion is defined to quantify performance against that of players making random moves. The results show that the new PSO-based approach performs well compared with the evolutionary approach.
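A minimal global-best PSO loop of the kind used to train such networks looks like this. The inertia and acceleration constants are common textbook defaults, and the sphere function stands in for the real objective (training a network to score leaf states); none of these specifics come from the paper.

```python
import random

def pso_minimize(f, dim, swarm=20, iters=200, w=0.7, c1=1.5, c2=1.5):
    """Global-best PSO: each particle is pulled toward its personal best
    (pbest) and the swarm's global best (gbest)."""
    random.seed(1)
    pos = [[random.uniform(-1, 1) for _ in range(dim)] for _ in range(swarm)]
    vel = [[0.0] * dim for _ in range(swarm)]
    pbest = [p[:] for p in pos]
    gbest = min(pbest, key=f)[:]
    for _ in range(iters):
        for i in range(swarm):
            for d in range(dim):
                r1, r2 = random.random(), random.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            if f(pos[i]) < f(pbest[i]):
                pbest[i] = pos[i][:]
                if f(pos[i]) < f(gbest):
                    gbest = pos[i][:]
    return gbest

sphere = lambda x: sum(v * v for v in x)
best = pso_minimize(sphere, dim=3)
```

For game play, `f` would be replaced by a (noisy) measure of how badly the network parameterized by the particle scores game-tree leaves.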

5.
To address the emergence and maintenance of cooperative behavior, an evolutionary game model that promotes cooperation is proposed, based on evolutionary game theory and network theory. The model introduces both time scales and selection preference into the evolutionary game. In the initialization phase, individuals are divided into two types according to the time scale on which they hold their strategies: one type updates its strategy at every time step; the other decides, with some probability after each round of the game, whether to update its strategy. In the strategy-update phase, the model characterizes an individual's reputation by its contribution to its neighbors, and assumes that players tend to imitate the strategies of neighbors with better reputations. Simulation results show that, in the proposed model combining time scales and selection preference, cooperative behavior can be maintained in the population; the presence of inert individuals hinders the emergence of cooperation, whereas individuals' irrational behavior can actually promote it.
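Strategy imitation in networked evolutionary games of this kind is commonly modeled with the Fermi rule, in which a noise parameter captures exactly the kind of "irrational" adoption the abstract mentions. The abstract does not state the authors' update probability, so the function below is a standard textbook choice, not necessarily theirs.

```python
import math

def fermi_prob(my_payoff, neighbor_payoff, k=0.1):
    """Probability of adopting the neighbor's strategy (Fermi rule).
    k is the selection noise: larger k means more random (irrational)
    imitation; small k means near-deterministic payoff-driven copying."""
    return 1.0 / (1.0 + math.exp((my_payoff - neighbor_payoff) / k))

print(fermi_prob(0.0, 1.0))  # better-off neighbor: adopt almost surely
print(fermi_prob(1.0, 0.0))  # worse-off neighbor: adopt almost never
```

In a reputation-biased variant, the neighbor to compare against would be drawn with probability proportional to reputation rather than uniformly.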

6.
Effect of Look-Ahead Depth in Evolutionary Checkers
It is intuitive that allowing a deeper search into a game tree will produce a player superior to one whose search depth is restricted. Of course, searching deeper into the tree comes at increased computational cost, and this is one of the trade-offs that must be considered in developing a tree-based search algorithm. There has been some discussion as to whether the evaluation function or the depth of the search is the main contributory factor in the performance of an evolved checkers player. Previous research has investigated this question (on Chess and Othello), with differing conclusions, suggesting that different games place different emphases on these two factors. This paper provides the evidence for evolutionary checkers, and shows that the look-ahead depth (as in Chess, perhaps unsurprisingly) is important. This is the first such intensive study for evolutionary checkers; given the evidence already available for Chess and Othello, it supplies the corresponding evidence for another game. We arrived at our conclusion by evolving various checkers players at different ply depths and playing them against one another, again at different ply depths. This was combined with the two-move ballot (enabling more games against the evolved players to take place), which provides strong evidence that look-ahead depth is important for evolved checkers players.
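The depth parameter being studied is simply the cutoff in a depth-limited minimax search. A minimal sketch follows; the tree here is a hand-built dictionary, not a checkers engine, and the function names are illustrative.

```python
def minimax(state, depth, maximizing, children, evaluate):
    """Depth-limited minimax: recurse until depth 0 or a leaf, then
    score the position with the (possibly evolved) evaluation function."""
    succ = children(state)
    if depth == 0 or not succ:
        return evaluate(state)
    values = (minimax(s, depth - 1, not maximizing, children, evaluate)
              for s in succ)
    return max(values) if maximizing else min(values)

# Tiny two-ply game tree: root -> {a, b}; leaf scores are the evaluation.
tree = {"root": ["a", "b"], "a": ["a1", "a2"], "b": ["b1", "b2"]}
scores = {"a1": 3, "a2": 5, "b1": 2, "b2": 9}
value = minimax("root", 2, True,
                lambda s: tree.get(s, []),
                lambda s: scores.get(s, 0))
print(value)  # max(min(3, 5), min(2, 9)) = 3
```

Raising `depth` is exactly the "deeper look-ahead" whose cost/benefit trade-off the paper measures.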

7.
Jotto is a popular word game for two players. It is of interest here because it unquestionably requires some intellectual ability for people to play it well, and it is a game in which a simple program can beat most human players nearly all the time.

8.
A study was conducted to find out how game-playing strategies for Othello (also known as reversi) can be learned without expert knowledge. The approach used the coevolution of a fixed-architecture neural-network-based evaluation function combined with a standard minimax search algorithm. Comparisons between evolving neural networks and computer players that used deterministic strategies allowed evolution to be observed in real-time. Neural networks evolved to outperform the computer players playing at higher ply-depths, despite being handicapped by playing black and using minimax at ply-depth of two. In addition, the playing ability of the population progressed from novice, to intermediate, and then to master's level. Individual neural networks discovered various game-playing strategies, starting with positional and later mobility. These results show that neural networks can be evolved as evaluation functions, despite the general difficulties associated with this approach. Success in this case was due to a simple spatial preprocessing layer in the neural network that captured spatial information, self-adaptation of every weight and bias of the neural network, and a selection method that allowed a diverse population of neural networks to be carried forward from one generation to the next.

9.
This study explores how distributing the controls of a video game among multiple players affects the sociality and engagement experienced in game play. A video game was developed in which the distribution of game controls among the players could be varied, thereby affecting the abilities of the individual players to control the game. An experiment was set up in which eight groups of three players were asked to play the video game while the distribution of the game controls was increased in three steps. After each playing session, the players’ experiences of sociality and engagement were assessed using questionnaires. The results showed that distributing game control among the players increased the level of experienced sociality and reduced the level of experienced control. The game in which the controls were partly distributed led to the highest levels of experienced engagement, because the game allowed social play while still giving the players a sense of autonomy. The implications for interaction design are discussed.

10.
We conduct evolutionary programming experiments to evolve artificial neural networks for forecast combination. Using stock price volatility forecast data, we find evolved networks compare favorably with a naive average combination, a least squares method, and a kernel method on out-of-sample forecasting ability: the best evolved network showed strong superiority in statistical tests of encompassing. Further, we find that the result is not sensitive to the nature of the randomness inherent in the evolutionary optimization process.
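Two of the benchmark combiners mentioned above — the naive average and the least-squares weights — are one-liners. A sketch follows; the column layout of the forecast matrix (one column per forecaster, one row per period) is an assumption for illustration.

```python
import numpy as np

def average_combine(forecasts):
    """Naive combination: equal-weight mean across the individual forecasts."""
    return forecasts.mean(axis=1)

def ls_weights(forecasts, realized):
    """Least-squares combination: weights minimizing ||F w - y||."""
    w, *_ = np.linalg.lstsq(forecasts, realized, rcond=None)
    return w

# Toy in-sample fit: the target is an exact 30/70 blend of two forecasters.
F = np.array([[1.0, 2.0], [2.0, 1.0], [3.0, 3.0], [0.5, 1.5]])
y = F @ np.array([0.3, 0.7])
print(ls_weights(F, y))  # recovers approximately [0.3, 0.7]
```

The evolved networks play the same role as these combiners but can weight the inputs nonlinearly.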

11.
A new competitive coevolutionary team-based particle swarm optimiser (CCPSO(t)) algorithm is developed to train multi-agent teams from zero knowledge. The CCPSO(t) algorithm is applied to train a team of agents to play simple soccer. The algorithm uses the charged particle swarm optimiser in a competitive and cooperative coevolutionary training environment to train neural network controllers for the players. The CCPSO(t) algorithm makes use of the FIFA league ranking relative fitness function to gather detailed performance metrics from each game played. The training performance and convergence behaviour of the particle swarm are analysed. A hypothesis is presented that explains the lack of convergence in the particle swarms. After applying a clustering algorithm on the particle positions, a detailed visual and quantitative analysis of the player strategies is presented. The final results show that the CCPSO(t) algorithm is capable of evolving complex gameplay strategies for a complex non-deterministic game.

12.
Fullerton  T. 《Computer》2006,39(6):36-42
Major universities around the world now offer degree programs in game design in response to student demand; more than 80 such programs exist in North America alone. Recognizing the overwhelming interest in guidelines for teaching game design, the International Game Developers Association established a committee to help educators craft a curriculum that reflects the real-world creative process of professional game designers. The interactive media program at the USC School of Cinema-Television combines a broad liberal arts education with the technical expertise needed to create games that provide players with a narratively rich and emotionally immersive experience.

13.
Serious digital games may be an effective tool for prosocial message dissemination because they offer technology and experiences that encourage players to share them with others and spread virally. But little is known about the factors that predict players’ willingness to share games with others in their social network. This panel study explores how several factors, including sharing technology use, emotional responses, and game enjoyment, contribute to players’ decision to share the game Darfur is Dying with others. College students played the game and completed questionnaires that assessed whether they had shared the games at two different time points: during game play and after game play. Positive emotions predicted sharing while students played the game, but negative emotions predicted whether the game was shared after initial game play. Game enjoyment predicted players’ intentions to share the game, but it did not predict actual sharing behavior. Neither players’ general use of sharing technologies nor their satisfaction related to sharing digital content predicted sharing intentions or behavior. These findings have implications for the study of viral social marketing campaigns, and serious game design and theory.

14.
The N-player iterated prisoner's dilemma (NIPD) game has been widely used to study the evolution of cooperation in social, economic and biological systems. This paper studies the impact of different payoff functions and local interactions on the NIPD game. The evolutionary approach is used to evolve game-playing strategies starting from a population of random strategies. The different payoff functions used in our study describe different behaviors of cooperation and defection among a group of players. Local interaction introduces neighborhoods into the NIPD game. A player does not play against every other player in a group any more. He only interacts with his neighbors. We investigate the impact of neighborhood size on the evolution of cooperation in the NIPD game and the generalization ability of evolved strategies. Received 18 August 1999 / Revised 27 February 2000 / Accepted 15 May 2000
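A common linear, public-goods-style payoff function for the NIPD is sketched below; it is one plausible instance of the family of payoff functions the paper compares, and the constants are illustrative, not the paper's.

```python
def nipd_payoff(cooperates, other_cooperators, n, b=3.0, c=1.0):
    """Linear NIPD payoff: every cooperator pays cost c, and the pooled
    benefit of b per cooperator is shared equally by all n players."""
    k = other_cooperators + (1 if cooperates else 0)  # total cooperators
    return b * k / n - (c if cooperates else 0.0)

# The dilemma: with b/n < c, defection dominates whatever the others do...
print(nipd_payoff(False, 3, n=5) > nipd_payoff(True, 3, n=5))   # True
# ...yet mutual cooperation pays everyone more than mutual defection.
print(nipd_payoff(True, 4, n=5) > nipd_payoff(False, 0, n=5))   # True
```

Local interaction restricts `n` to a player's neighborhood rather than the whole population, which is the neighborhood-size effect the paper investigates.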

15.
Two learning methods for acquiring position evaluation for small Go boards are studied and compared. In each case the function to be learned is a position-weighted piece counter and only the learning method differs. The methods studied are temporal difference learning (TDL) using the self-play gradient-descent method and coevolutionary learning, using an evolution strategy. The two approaches are compared with the hope of gaining a greater insight into the problem of searching for "optimal" zero-sum game strategies. Using tuned standard setups for each algorithm, it was found that the temporal-difference method learned faster, and in most cases also achieved a higher level of play than coevolution, provided that the gradient-descent step size was chosen suitably. The performance of the coevolution method was found to be sensitive to the design of the evolutionary algorithm in several respects. Given the right configuration, however, coevolution achieved a higher level of play than TDL. Self-play results in optimal play against a copy of itself. A self-play player will prefer moves from which it is unlikely to lose even when it occasionally makes random exploratory moves. An evolutionary player forced to perform exploratory moves in the same way can achieve superior strategies to those acquired through self-play alone. The reason for this is that the evolutionary player is exposed to more varied game-play, because it plays against a diverse population of players.
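The evaluation function both learners tune — a position-weighted piece counter — is just an elementwise product of the board with a weight grid, summed. A sketch follows; the board encoding (+1 own stones, -1 opponent's, 0 empty) is an assumption for illustration.

```python
import numpy as np

def evaluate(board, weights):
    """Position-weighted piece counter for a small Go board: each occupied
    point contributes its learned weight, signed by who occupies it."""
    return float(np.sum(board * weights))

# 3x3 toy board with corners weighted higher than the centre.
weights = np.array([[1.0, 0.5, 1.0],
                    [0.5, 0.2, 0.5],
                    [1.0, 0.5, 1.0]])
board = np.array([[1, 0, 0],
                  [0, 1, 0],
                  [0, 0, -1]])
print(evaluate(board, weights))  # corner + centre - corner, i.e. about 0.2
```

TDL adjusts `weights` by gradient descent on the TD error during self-play; the evolution strategy instead mutates whole weight grids and selects on game results.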

16.
17.
This study applies social capital theory to investigate how a player’s network centrality in an online gaming community (i.e., a guild) affects his/her attitude and continuance intention toward a Massive Multiplayer Online Game (MMOG). Analysis of 347 usable responses shows that players’ network centrality has a negative impact on their ties to players who belong to other guilds (i.e., non-guild interaction), but a positive effect on players’ access to resources. However, players’ network centrality fails to increase their perceived game enjoyment directly. Players’ resource accessibility and perceived game enjoyment play mediating roles in the relationship between network centrality and attitude toward playing an MMOG, which in turn influences game continuance intention. The results also show that although players’ non-guild interaction is negatively related to their resource accessibility from the networks, it is positively associated with perceived game enjoyment. The article concludes with implications and limitations of the study.

18.
This article focuses on the techniques of evolutionary computation for generating players performing tasks cooperatively. However, in using evolutionary computation for generating players performing tasks cooperatively, one faces fundamental and difficult decisions, including the one regarding the so-called credit assignment problem. We believe that there are some correlations among design decisions, and therefore a comprehensive evaluation of them is essential. We first list three fundamental decisions and possible options in each decision in designing methods for evolving a cooperative team. We find that there are 18 typical combinations available. Then we describe the ultimately simplified soccer game played on a one-dimensional field as a testbed for a comprehensive evaluation for these 18 candidate methods. It has been shown that some methods perform well, while there are complex correlations among design decisions. Also, further analysis has shown that cooperative behavior can be evolved, and is a necessary requirement for the teams to perform well even in such a simple game. This work was presented in part at the 10th International Symposium on Artificial Life and Robotics, Oita, Japan, February 4–6, 2005

19.
In real-time strategy (RTS) games, player behavior is difficult to predict because of the game's enormous decision space. To address this problem, a method is proposed that predicts player behavior by analyzing RTS game replay records and building neural network models with five different structures. The models take into account the influence of game states at different time slices on decision behavior: neural networks with single-time-slice and two-time-slice inputs were designed and compared against a model based on dynamic Bayesian networks. Experimental results show that the neural network model with single-time-slice input completes training faster while achieving satisfactory prediction accuracy.

20.
