Similar Literature
 Found 18 similar articles (search time: 203 ms)
1.
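Monte Carlo tree search (MCTS) is a commonly used reinforcement learning algorithm, but the exponential growth of the dynamic state space during game play limits its learning efficiency. This work optimizes MCTS with a parallel approach and proposes a parallel MCTS algorithm based on passing win-rate estimates. The improved parallel game-search framework consists of one main process and multiple child processes: the child processes explore, and the main process makes decisions from the win-rate estimates that the child processes pass back. Experiments on the multi-agent game platform Pommerman show that, compared with traditional MCTS, the parallel algorithm effectively improves resource utilization, win rate, and decision efficiency.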

2.
雷捷维  王嘉旸  任航  闫天伟  黄伟 《计算机工程》2021,47(3):304-310,320
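Mahjong, a typical imperfect-information game, is mainly played by programs using the traditional Expectimax search algorithm, whose pruning strategy and evaluation function are designed from hand-crafted prior knowledge and suffer from problems such as unreasonable assumptions. This paper proposes an imperfect-information game algorithm that combines Expectimax search with the Double DQN reinforcement learning algorithm. During expansion of the Expectimax search tree, the evaluation function is built from the value estimates output by Double DQN, and branch values are obtained within a limited search depth; a pruning strategy that ranks discard actions and expands only some of them prunes the search tree. During training of the Double DQN model, Mahjong information is encoded as feature data and fed into the neural network to obtain value estimates, and Expectimax search is used to find the best action in order to improve the exploration policy. Experimental results show that, compared with the Expectimax search algorithm, the Double DQN algorithm, and other baseline algorithms, the proposed algorithm achieves a higher win rate and score in Mahjong and plays more strongly overall.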

3.
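This paper studies a new evolutionary algorithm, harmony search (HS). To address its tendency to become trapped in local optima and its low convergence accuracy on complex function optimization problems, an improved harmony search algorithm is proposed. While retaining the search mechanism of harmony search, the algorithm introduces the local search strategy of the shuffled frog leaping algorithm, which maintains the diversity of the harmony memory and thereby improves search efficiency on complex problems. Compared with similar algorithms, the proposed algorithm has strong global search ability and fast convergence; numerical experiments verify its effectiveness and robustness.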

4.
杜秀全  程家兴 《微机发展》2007,17(1):216-218
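Computer game playing, which concerns games of strategy, is one of the main research areas of artificial intelligence; it involves search methods, reasoning techniques, and decision planning. Most current research focuses on searching deterministic, two-player, zero-sum, perfect-information games. Through the design of an Othello (Reversi) program, this paper combines the evaluation of generated game-tree nodes with the game-tree search itself and, using traditional Alpha-Beta pruning and the minimax principle, presents the core of game program design, covering both game-tree search and the evaluation function. An improvement to the original algorithm is proposed that increases search speed. Experimental results verify the algorithm's effectiveness.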
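
As a rough illustration of the Alpha-Beta pruning and minimax principle that the abstract above refers to, here is a minimal Python sketch. It is not the paper's Othello program: the toy game tree, the `alphabeta` function, and the `counter` argument are all hypothetical, and leaf integers stand in for the output of an evaluation function.

```python
def alphabeta(node, alpha, beta, maximizing, counter=None):
    """Minimax value of a toy game tree with Alpha-Beta pruning.

    node: an int (static evaluation of a leaf) or a list of child nodes.
    counter: optional one-element list that counts evaluated leaves.
    """
    if isinstance(node, int):
        if counter is not None:
            counter[0] += 1
        return node
    if maximizing:
        value = float("-inf")
        for child in node:
            value = max(value, alphabeta(child, alpha, beta, False, counter))
            alpha = max(alpha, value)
            if alpha >= beta:  # the minimizer above will never allow this line
                break
        return value
    value = float("inf")
    for child in node:
        value = min(value, alphabeta(child, alpha, beta, True, counter))
        beta = min(beta, value)
        if alpha >= beta:  # the maximizer above already has a better option
            break
    return value


# Hypothetical two-ply tree: three moves for the maximizer, two replies each.
tree = [[3, 5], [2, 9], [0, 1]]
visited = [0]
best = alphabeta(tree, float("-inf"), float("inf"), True, visited)
```

On this toy tree the search returns the same value as plain minimax while evaluating fewer leaves, which is the kind of speed-up the abstract attributes to pruning.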

5.
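Application of Game-Playing Algorithms in Othello   Total citations: 1 (self-citations: 0; citations by others: 1)
Computer game playing, which concerns games of strategy, is one of the main research areas of artificial intelligence; it involves search methods, reasoning techniques, and decision planning. Most current research focuses on searching deterministic, two-player, zero-sum, perfect-information games. Through the design of an Othello (Reversi) program, this paper combines the evaluation of generated game-tree nodes with the game-tree search itself and, using traditional Alpha-Beta pruning and the minimax principle, presents the core of game program design, covering both game-tree search and the evaluation function. An improvement to the original algorithm is proposed that increases search speed. Experimental results verify the algorithm's effectiveness.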

6.
张越  芦东昕 《微机发展》2007,17(3):102-105
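Game playing is an important branch of artificial intelligence research; it involves reasoning techniques, search methods, and decision planning, and the search strategy is the key to game-playing problems. To address the drop in search efficiency caused by a huge search space, and drawing on the characteristics of Gomoku (five-in-a-row), this paper discusses solution strategies for the game and proposes a search strategy that combines the PVS (principal variation search) algorithm, a static move-ordering heuristic, and the history heuristic. Experimental results show that the algorithm maintains playing strength while achieving good search efficiency.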

7.
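Game playing is an important branch of artificial intelligence research; it involves reasoning techniques, search methods, and decision planning, and the search strategy is the key to game-playing problems. To address the drop in search efficiency caused by a huge search space, and drawing on the characteristics of Gomoku (five-in-a-row), this paper discusses solution strategies for the game and proposes a search strategy that combines PVS (principal variation search), a static move-ordering heuristic, and the history heuristic. Experimental results show that the algorithm maintains playing strength while achieving good search efficiency.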

8.
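Game-tree search is crucial to computer game playing. A good search algorithm finds the best line while searching fewer nodes, which raises the program's playing strength. Taking computer Chinese chess as the setting and building on the basic alpha-beta search algorithm, this paper explains the principle of the transposition-table heuristic and the problem of hash collisions in detail, and introduces the concept of a two-tier transposition table together with its replacement strategy, which improves the engine's search efficiency. Experimental results demonstrate the effectiveness of the algorithm.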
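
The abstract does not spell out its replacement strategy, so the following Python sketch only illustrates one common two-tier scheme, chosen here as an assumption: a depth-preferred slot plus an always-replace slot per index. The class and method names (`TwoTierTable`, `store`, `probe`) are hypothetical, and plain integers stand in for real position hashes such as Zobrist keys.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Entry:
    key: int    # full hash key, kept so index collisions can be detected
    depth: int  # search depth at which the score was computed
    score: int


class TwoTierTable:
    """Two slots per index: depth-preferred and always-replace (assumed scheme)."""

    def __init__(self, size: int) -> None:
        self.size = size
        self.deep: List[Optional[Entry]] = [None] * size
        self.recent: List[Optional[Entry]] = [None] * size

    def store(self, key: int, depth: int, score: int) -> None:
        i = key % self.size
        new = Entry(key, depth, score)
        old = self.deep[i]
        if old is None or depth >= old.depth:
            self.deep[i] = new    # deeper results are the most expensive to redo
        else:
            self.recent[i] = new  # shallow results always overwrite this slot

    def probe(self, key: int, depth: int) -> Optional[int]:
        i = key % self.size
        for entry in (self.deep[i], self.recent[i]):
            # Comparing the full key rejects a different position that
            # happens to map to the same index (a hash collision).
            if entry is not None and entry.key == key and entry.depth >= depth:
                return entry.score
        return None


table = TwoTierTable(8)
table.store(key=1, depth=5, score=10)
table.store(key=9, depth=2, score=-4)  # same index as key 1, but shallower
```

Here both positions survive even though they collide on the same index, which a single-slot table could not guarantee.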

9.
姜秉序  宿翀  刘存志  陈捷 《自动化学报》2020,46(6):1240-1254
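Traditional sequential decision-making methods model the decision process and decision steps in order to find the optimal decision sequence. However, such modeling demands a highly deterministic objective function, and sequence search algorithms mostly rely on exhaustive traversal such as depth-first or breadth-first search, seldom accounting for randomness in the search process. Monte Carlo tree search (MCTS) is well suited to stochastic sequence search problems, but it has so far been applied only to game-type search; search strategies for knowledge-constrained sequential decisions that require expert participation have rarely been explored. In addition, traditional MCTS often suffers from an overly large search range and slow convergence. To address this, an MCTS sequential decision method with hybrid knowledge constraints is proposed, fusing experiential knowledge from group decision making with partially determined decision-sequence fragments, and a detailed solution procedure is given. Finally, the method is applied to the problem of ordering acupuncture points for post-stroke dysphagia: a procedure for setting the acupoint ordering scheme that fuses hybrid knowledge with MCTS is given and compared with other methods, verifying the feasibility and effectiveness of the proposed method and laying a methodological foundation for standardized training of young physicians in formulating acupuncture plans.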

10.
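To address the low convergence accuracy on complex problems and the limited iteration step size of the traditional cuckoo search (CS) algorithm, a dynamic cuckoo search strategy based on rider optimization (rider optimization cuckoo search, ROCS) is proposed. Drawing on the idea of the rider optimization algorithm (ROA), multiple populations search with multiple strategies within a single cycle, and the best strategy is dynamically applied for intensified search, improving convergence efficiency on complex problems. The parameters of the Lévy flight are also adjusted dynamically to improve performance in the early and late stages of the search. Simulation results show that the improved algorithm outperforms the compared algorithms on complex problems and that its optimization efficiency is significantly improved.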

11.
Partial information games are excellent examples of decision making under uncertainty. In particular, some games have such an immense state space and high degree of uncertainty that traditional algorithms and methods struggle to play them effectively. Monte Carlo tree search (MCTS) has brought significant improvements to the level of computer programs in games such as Go, and it has been used to play partial information games as well. However, there are certain games with particularly large trees and reduced information in which a naive MCTS approach is insufficient: in particular, this is the case for games with long matches, dynamic information, and complex victory conditions. In this paper we explore the application of MCTS to a wargame-like board game, Kriegspiel. We describe and study three MCTS-based methods, starting from a very simple implementation and moving to more refined versions for playing the game with little specific knowledge. We compare these MCTS-based programs to the strongest known minimax-based Kriegspiel program, obtaining significantly better experimental results with less domain-specific knowledge.

12.
欧俊臣  沙玲  杨淞文 《软件》2020,(4):160-164
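With the arrival of AlphaGo, human-versus-computer play and artificial intelligence have once again become research hot spots. Although traditional MCTS (Monte Carlo tree search) works well for iterative search, the search space of Gomoku is large, so the algorithm easily falls into local optima and is very time-consuming. We design a policy system built on MCTS and a convolutional neural network and train it through self-play with MCTS, so that the Gomoku policy system can upgrade itself within a given time and then continue training from the improved version. The resulting policy system is both more responsive than traditional MCTS and a stronger player.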

13.
Deep neural networks have achieved remarkable results in a variety of games. In recent years, convolutional neural networks have attracted great attention because of their unique unit structure and have frequently been used in game AI agents such as AlphaGo and Lengpudashi (冷扑大师). "Fighting the Landlord" (Dou Dizhu) is a typical cooperative game with incomplete information. In this paper, a 7-layer convolutional neural network, DDZ-CNN, is designed and trained on nearly 300,000 samples generated by Monte Carlo tree search self-play to learn a "Fighting the Landlord" strategy. During training, the data are down-sampled with a weight-based method to overcome their uneven distribution, which lets the network converge quickly. Finally, the trained model is pitted against an intelligent MCTS model and human players and obtains a good win rate, which verifies the effectiveness and feasibility of the proposed algorithm.

14.
《Artificial Intelligence》2002,134(1-2):277-311
In this article we present an overview of the state of the art in games solved in the domain of two-person zero-sum games with perfect information. The results are summarized and some predictions for the near future are given. The aim of the article is to determine which game characteristics are predominant when the solution of a game is the main target. First, it is concluded that decision complexity is more important than state-space complexity as a determining factor. Second, we conclude that there is a trade-off between knowledge-based methods and brute-force methods. It is shown that knowledge-based methods are more appropriate for solving games with a low decision complexity, while brute-force methods are more appropriate for solving games with a low state-space complexity. Third, we found that there is a clear correlation between the first-player's initiative and the necessary effort to solve a game. In particular, threat-space-based search methods are sometimes able to exploit the initiative to prove a win. Finally, the most important results of the research involved, the development of new intelligent search methods, are described.

15.
This paper presents the results of applying single-player Monte Carlo tree search (SP-MCTS), a variant of MCTS, to a practical reentrant scheduling problem addressed in our previous work. In particular, a method for evaluating workers' IF-THEN knowledge with SP-MCTS is proposed. SP-MCTS can be regarded as a meta-heuristic algorithm for NP-complete problems, because it is applicable to any type of problem in which the state changes deterministically at each discrete time step. SP-MCTS may therefore be useful not only for perfect-information one-player games but also for other problems, as long as the search space can be described as a tree structure and the search is guaranteed to terminate. The authors consider SP-MCTS useful for obtaining and evaluating knowledge. This paper first describes the basic idea of SP-MCTS and then details the scheduling problem, including its formulation. It then examines the applicability of SP-MCTS to a practical problem. In particular, the potential of SP-MCTS for knowledge evaluation is discussed on the basis of the experimental results.

16.
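MonteCloPi is an anytime subgroup discovery algorithm based on Monte Carlo tree search (MCTS). It uses the MCTS strategy to build an asymmetric best-first search tree for discovering a high-quality, diverse pattern set, but it restricts the target to a binary variable. Taking the characteristics of numerical targets into account, this paper extends MonteCloPi to numerical targets by choosing a suitable value of C in the upper confidence bound (UCB) formula, dynamically adjusting the expansion weight of each sample and pruning the search tree, and using an adaptive top-k mean update strategy. Experiments on UCI datasets and the National Health and Nutrition Examination Survey (NHANES) hearing-test dataset show that the algorithm discovers higher-quality diverse pattern sets than other algorithms and that the best subgroups are also more interpretable.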
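
To show why the choice of C in the UCB formula matters, the sketch below uses the widely used UCB1 form, mean reward plus C·sqrt(ln N / n). Whether MonteCloPi or the extension described above uses exactly this form is an assumption; the function names `ucb1` and `select_child` and the statistics are hypothetical.

```python
import math


def ucb1(total_reward, visits, parent_visits, c):
    """UCB1 score: average reward plus an exploration bonus scaled by c."""
    if visits == 0:
        return float("inf")  # unvisited children are always tried first
    exploit = total_reward / visits
    explore = c * math.sqrt(math.log(parent_visits) / visits)
    return exploit + explore


def select_child(stats, c):
    """Index of the child with the highest UCB1 score.

    stats: list of (total_reward, visits) pairs, one per child.
    """
    parent_visits = sum(visits for _, visits in stats)
    scores = [ucb1(reward, visits, parent_visits, c) for reward, visits in stats]
    return max(range(len(stats)), key=scores.__getitem__)


# Hypothetical statistics: a well-explored strong child and a rarely tried one.
stats = [(9.0, 10), (1.0, 2)]
greedy_choice = select_child(stats, c=0.0)     # pure exploitation
exploring_choice = select_child(stats, c=2.0)  # the bonus favors the rare child
```

With rewards on a larger numerical scale, the same C would make the bonus negligible, which is one plausible reason a numerical-target extension needs to re-tune it.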

17.
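Computer Go is one of the important branches of computer game playing, and its enormous game space poses a great challenge to researchers. Current computer Go programs mostly use static-evaluation search and Monte Carlo tree search; this work therefore introduces the temporal-difference algorithm into a 9×9 Go system and proposes a Go game-playing system model based on temporal-difference learning. The system has a degree of self-learning ability and gradually improves its playing strength through continued play. Actual games against a system using α-β search demonstrate the feasibility of the method.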
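
The abstract does not say which temporal-difference variant the system uses. As a minimal sketch, assuming the simplest tabular TD(0) rule, the update of a position's value estimate could look like the following; the function name `td0_update`, the string state labels, and the parameter values are hypothetical.

```python
def td0_update(values, state, next_state, reward, alpha=0.1, gamma=1.0):
    """One tabular TD(0) step: V(s) += alpha * (r + gamma * V(s') - V(s)).

    values: dict mapping a state to its current value estimate;
    states not seen before are treated as having value 0.0.
    """
    target = reward + gamma * values.get(next_state, 0.0)
    old = values.get(state, 0.0)
    values[state] = old + alpha * (target - old)
    return values[state]


# Hypothetical usage: two identical transitions that end in a win (reward 1).
values = {}
first = td0_update(values, "position_a", "terminal", reward=1.0, alpha=0.5)
second = td0_update(values, "position_a", "terminal", reward=1.0, alpha=0.5)
```

Each repeated game moves the estimate a fraction `alpha` closer to the observed outcome, which matches the gradual improvement through play that the abstract describes.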

18.
冯浩  郭彩丽 《计算机工程》2022,48(1):135-141+148
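Video data can provide rich information for intelligent connected vehicles. To better extract video content and ensure that offloaded video contains more useful information, a content-driven computation offloading guidance scheme is designed under a latency constraint, and a computation offloading decision algorithm based on improved Monte Carlo tree search is proposed. On the vehicle side, key-frame extraction preprocesses the video content so that the importance of each video-understanding task can be analyzed effectively and more important tasks receive more computing resources. A reinforcement-learning-based heuristic search algorithm makes the offloading decision, and a deep neural network is introduced to pre-train the prior transition probabilities, which speeds up convergence and reduces computational complexity. Experimental results show that the algorithm effectively reduces energy consumption and improves video-understanding accuracy under the latency constraint; compared with algorithms based on Q-learning and on simulated annealing, it converges faster with lower computational complexity, and the total system utility reaches 37% under a 700 ms latency constraint.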
