Research on a Parallel Monte Carlo Tree Search Algorithm for Multi-Agent Games (面向多智能体博弈的并行蒙特卡洛树搜索算法研究)

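Cite this article:	GUAN Yan-xia, LIU Xun-yun, LIU Yun-tao, XIE Min, XU Xin-hai. Research on a parallel Monte Carlo tree search algorithm for multi-agent games [J]. Computer Engineering & Science (计算机工程与科学), 2022, 44(12): 2128-2133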

Authors:	GUAN Yan-xia (管延霞), LIU Xun-yun (刘逊韵), LIU Yun-tao (刘运韬), XIE Min (谢旻), XU Xin-hai (徐新海)

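Affiliations:	(1. College of Computer Science and Technology, National University of Defense Technology, Changsha, Hunan 410073; 2. War Research Institute, Academy of Military Sciences, Beijing 100091)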

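Abstract:	Monte Carlo tree search is a commonly used reinforcement learning algorithm, but the exponential growth of the dynamic space during a game limits its learning efficiency. This paper optimizes Monte Carlo tree search with a parallel approach and proposes a parallel Monte Carlo tree search algorithm based on passing win-rate estimates. The improved parallel game-search framework consists of one main process and multiple sub-processes: the sub-processes explore, and the main process makes decisions based on the win-rate estimates they pass back. Experiments on the multi-agent game platform Pommerman show that, compared with traditional Monte Carlo tree search, the parallel algorithm effectively improves resource utilization, win rate, and decision-making efficiency.
Keywords:	multi-agent game; Pommerman; multi-process parallelism; Monte Carlo tree search
Received:	2021-04-02
Revised:	2021-09-24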
A parallel Monte Carlo tree search algorithm for multi-agent game

GUAN Yan-xia, LIU Xun-yun, LIU Yun-tao, XIE Min, XU Xin-hai. A parallel Monte Carlo tree search algorithm for multi-agent game [J]. Computer Engineering & Science, 2022, 44(12): 2128-2133

Authors:	GUAN Yan-xia, LIU Xun-yun, LIU Yun-tao, XIE Min, XU Xin-hai

Affiliation:	(1. College of Computer Science and Technology, National University of Defense Technology, Changsha 410073; 2. War Research Institute, Academy of Military Sciences, Beijing 100091, China)

Abstract:	Monte Carlo tree search is a commonly used reinforcement learning algorithm, and the exponential growth of its dynamic space during a game restricts improvements in its learning efficiency. To optimize Monte Carlo tree search through parallelism, a parallel Monte Carlo tree search algorithm based on the transfer of win-rate estimates is proposed. The improved parallel game search framework consists of one main process and several sub-processes: the sub-processes perform exploration, and the main process makes decisions according to the win-rate estimates transmitted by the sub-processes. Experimental validation on the multi-agent game platform Pommerman shows that, compared with the traditional Monte Carlo tree search algorithm, the parallel algorithm improves resource utilization, game win rate, and decision-making efficiency.

Keywords:	multi-agent game; Pommerman; multi-process parallel; Monte Carlo tree search
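
The main-process/sub-process division described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, which the abstract does not detail. Everything named here is a hypothetical stand-in: `TRUE_WIN_PROB` and `playout` replace the Pommerman simulator with a toy environment, each worker's search is simplified to one-level UCB1 sampling over root actions rather than a full tree, and pooling by summing wins and visits is an assumed aggregation rule.

```python
import math
import random
from multiprocessing import Pool

# Hypothetical toy environment: each root action wins a random playout with a
# fixed hidden probability. A real setup would roll out the game simulator.
TRUE_WIN_PROB = {"up": 0.35, "down": 0.40, "left": 0.45, "right": 0.70}
ACTIONS = list(TRUE_WIN_PROB)


def playout(action, rng):
    """Simulate one game after `action`; return 1 for a win, 0 for a loss."""
    return 1 if rng.random() < TRUE_WIN_PROB[action] else 0


def explore(args):
    """Sub-process job: UCB1 sampling over root actions; returns (wins, visits)."""
    seed, n_playouts = args
    rng = random.Random(seed)  # independent, reproducible stream per worker
    wins = {a: 0 for a in ACTIONS}
    visits = {a: 0 for a in ACTIONS}
    for t in range(1, n_playouts + 1):
        untried = [a for a in ACTIONS if visits[a] == 0]
        if untried:
            action = untried[0]  # try every action once before using UCB1
        else:
            action = max(
                ACTIONS,
                key=lambda a: wins[a] / visits[a]
                + math.sqrt(2 * math.log(t) / visits[a]),
            )
        wins[action] += playout(action, rng)
        visits[action] += 1
    return wins, visits


def aggregate(results):
    """Main-process step: pool worker statistics into per-action win rates."""
    total_wins = {a: 0 for a in ACTIONS}
    total_visits = {a: 0 for a in ACTIONS}
    for wins, visits in results:
        for a in ACTIONS:
            total_wins[a] += wins[a]
            total_visits[a] += visits[a]
    return {a: total_wins[a] / total_visits[a] for a in ACTIONS if total_visits[a]}


def decide(n_workers=4, n_playouts=2000, base_seed=0):
    """Run the explorers and pick the action with the highest pooled win rate."""
    jobs = [(base_seed + i, n_playouts) for i in range(n_workers)]
    if n_workers == 1:
        results = [explore(jobs[0])]  # single-process baseline, no pool
    else:
        with Pool(processes=n_workers) as pool:
            results = pool.map(explore, jobs)
    win_rates = aggregate(results)
    return max(win_rates, key=win_rates.get), win_rates
```

Calling `decide(n_workers=4)` from a script should sit under an `if __name__ == "__main__":` guard so that worker processes can import the module safely. With `n_workers=1` the same function serves as a single-process baseline for comparison.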
