Similar Literature
Found 20 similar documents (search time: 218 ms)
1.
Multi-Agent Learning with Genetic Algorithms in Robot Soccer   Cited by: 5 (self-citations: 0, others: 5)
Through an introduction to research on simulated robot soccer, this paper describes the use of genetic algorithms for multi-agent machine learning. Each player is treated as an agent that learns continually through evolutionary training, so that it can take the optimal action in the current state. Using the FIRA simulated robot soccer competition as an example, the paper discusses online learning of tactical actions.
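As a rough, hypothetical illustration of the idea (the abstract gives no implementation details), the following Python sketch evolves the weights of an assumed action-scoring function with a plain genetic algorithm; the fitness function is a placeholder for what would be simulated match results:

```python
import random

GENOME_LEN = 6      # one weight per tactical feature (assumed)
POP_SIZE = 20
GENERATIONS = 50

def evaluate(genome):
    # Placeholder fitness: in the paper's setting this would come from
    # simulated matches (goals scored, possession, etc.).
    return -sum((w - 0.5) ** 2 for w in genome)

def crossover(a, b):
    cut = random.randrange(1, GENOME_LEN)   # one-point crossover
    return a[:cut] + b[cut:]

def mutate(genome, rate=0.1):
    return [w + random.gauss(0, 0.1) if random.random() < rate else w
            for w in genome]

population = [[random.random() for _ in range(GENOME_LEN)]
              for _ in range(POP_SIZE)]
for _ in range(GENERATIONS):
    population.sort(key=evaluate, reverse=True)
    parents = population[:POP_SIZE // 2]        # truncation selection
    children = [mutate(crossover(random.choice(parents),
                                 random.choice(parents)))
                for _ in range(POP_SIZE - len(parents))]
    population = parents + children

best = max(population, key=evaluate)
```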

2.
顾国昌  仲宇  张汝波 《机器人》2003,25(4):344-348
In a multi-robot system, the quality of one robot's behavior often depends on the behaviors of the other robots, so joint actions must be used to achieve multi-robot cooperation; however, reinforcement learning algorithms over joint actions converge extremely slowly because the learning space is enormous. The new method proposed in this paper reduces the dimensionality of the learning space by predicting the probability of each robot's actions, and it is applied to multi-robot cooperation tasks. Experimental results show that the prediction-based accelerated reinforcement learning algorithm obtains multi-robot cooperation policies faster than the original algorithm.
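A minimal sketch of the dimensionality-reduction idea, with all interfaces assumed: the learner counts the other robot's actions per state, predicts its most likely action, and learns Q over (state, predicted action, own action) rather than over all joint actions:

```python
import random
from collections import defaultdict

ACTIONS = range(4)
alpha, gamma, eps = 0.1, 0.9, 0.1

Q = defaultdict(float)                              # Q[(s, a_pred, a_self)]
counts = defaultdict(lambda: [1] * len(ACTIONS))    # Laplace-smoothed counts

def predict_other(s):
    c = counts[s]
    return max(ACTIONS, key=lambda a: c[a])         # most likely action

def choose(s, a_pred):
    if random.random() < eps:
        return random.choice(list(ACTIONS))
    return max(ACTIONS, key=lambda a: Q[(s, a_pred, a)])

def step_update(s, a_other, a_self, r, s2):
    counts[s][a_other] += 1                         # refine the predictor
    a_pred, a_pred2 = predict_other(s), predict_other(s2)
    best_next = max(Q[(s2, a_pred2, a)] for a in ACTIONS)
    key = (s, a_pred, a_self)
    Q[key] += alpha * (r + gamma * best_next - Q[key])
```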

3.
李鹏 《计算机与现代化》2009,(8):123-125,129
The decision-making system of small-size soccer robots is a multi-agent coordinated control system, consisting mainly of visual information processing, coordination strategy, role assignment, and action execution. This paper studies the role-assignment mechanism, proposes a role-assignment algorithm that minimizes the combined path cost, and implements the overall design of dynamic role assignment under the Play strategy. Tests on a simulation platform show a considerable improvement in the robots' overall teamwork.
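The paper's exact cost model is not given; the sketch below assumes the cost is the Euclidean distance from each robot to each role's target point and solves the resulting assignment problem with the Hungarian algorithm:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

robots = np.array([[0.0, 0.0], [1.0, 2.0], [3.0, 1.0]])      # positions
role_targets = np.array([[2.0, 2.0], [0.0, 1.0], [3.0, 0.0]])

cost = np.linalg.norm(robots[:, None, :] - role_targets[None, :, :],
                      axis=2)               # cost[i][j]: robot i -> role j
rows, cols = linear_sum_assignment(cost)    # Hungarian algorithm
assignment = dict(zip(rows, cols))          # robot index -> role index
total_cost = cost[rows, cols].sum()         # minimal combined path cost
```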

4.
Local-Cooperation-Based Multi-Agent Q-Learning for RoboCup   Cited by: 2 (self-citations: 0, others: 2)
刘亮  李龙澍 《计算机工程》2009,35(9):11-13,1
To address the exponential growth of joint actions in multi-agent Q-learning, a locally cooperative Q-learning method is adopted: joint actions are examined only when agents are cooperating; otherwise, each agent performs simple individual Q-learning, reducing the number of state-action pairs that must be examined during learning. Experiments on the RoboCup 2D soccer simulation platform show that this method is more efficient than commonly used multi-agent reinforcement learning techniques.
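A minimal sketch of the local-cooperation idea (the environment interface and cooperation test are assumptions): a joint-action Q-table is consulted only when two agents are judged to be cooperating, otherwise each agent performs plain individual Q-learning:

```python
from collections import defaultdict

ACTIONS = range(4)
alpha, gamma = 0.1, 0.9
Q_ind = defaultdict(float)     # Q_ind[(s, a)]           individual table
Q_joint = defaultdict(float)   # Q_joint[(s, a1, a2)]    cooperative table

def cooperating(s):
    # Placeholder cooperation test; the paper's criterion is task-specific.
    return s[0] == "near_ball"

def update(s, a1, a2, r, s2):
    if cooperating(s):
        best = max(Q_joint[(s2, b1, b2)]
                   for b1 in ACTIONS for b2 in ACTIONS)
        Q_joint[(s, a1, a2)] += alpha * (
            r + gamma * best - Q_joint[(s, a1, a2)])
    else:                       # fall back to individual Q-learning
        best = max(Q_ind[(s2, b)] for b in ACTIONS)
        Q_ind[(s, a1)] += alpha * (r + gamma * best - Q_ind[(s, a1)])
```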

5.
Application of Reinforcement Learning in Robot Soccer Competition   Cited by: 8 (self-citations: 1, others: 8)
Robot soccer is an interesting and complex emerging field of artificial intelligence research and a typical multi-agent system. Reinforcement learning is used to study the action-selection problem of soccer robots in robot soccer competition; the single-agent reinforcement learning method is extended, a multi-agent reinforcement learning method is proposed, and experimental results are given.

6.
Multi-Agent Robot Reinforcement Learning Based on an Adaptive Fuzzy RBF Neural Network   Cited by: 3 (self-citations: 0, others: 3)
Learning in multi-robot environments is difficult because the robots operate in continuous state and action spaces and multiple robots are involved, so the learning space is huge and directly applying the Q-learning algorithm rarely yields satisfactory results. Addressing the learning problem of multi-agent robot systems, this paper proposes an adaptive fuzzy RBF neural network reinforcement learning algorithm. The network itself has fuzzy inference capability, strong function-approximation ability, and good generalization; it therefore combines human expert knowledge with machine learning methods, reduces the complexity of the learning problem, and enables policy learning over continuous state and action spaces.
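As a simplified sketch of the function-approximation core (the paper's network also has fuzzy-inference layers and adapts its structure, which is omitted here), a fixed-centre Gaussian RBF network can approximate Q over a continuous state with a TD(0) weight update:

```python
import numpy as np

centers = np.random.uniform(-1, 1, size=(25, 2))   # RBF centres in state space
sigma = 0.3
n_actions = 4
W = np.zeros((n_actions, len(centers)))            # one weight row per action

def phi(s):
    d = np.linalg.norm(centers - s, axis=1)
    return np.exp(-d ** 2 / (2 * sigma ** 2))      # Gaussian activations

def q_values(s):
    return W @ phi(s)                              # Q(s, a) for every action

def td_update(s, a, r, s2, alpha=0.05, gamma=0.95):
    target = r + gamma * np.max(q_values(s2))
    delta = target - q_values(s)[a]
    W[a] += alpha * delta * phi(s)                 # gradient step on Q(s, a)
```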

7.
In intelligent adversarial game scenarios, multi-agent reinforcement learning algorithms suffer from the "non-stationarity" problem: an agent's policy depends not only on the environment but also on the opponents (other agents) in it. Predicting opponents' policies and intentions from their interactions with the environment, and adjusting the agent's own policy accordingly, is an effective way to mitigate this problem. This paper proposes an intelligent adversarial game algorithm based on opponent action prediction, which implicitly models the opponents in the environment. The algorithm learns the opponent's policy features through supervised learning and fuses them into the agent's reinforcement learning model, mitigating the opponent's impact on learning stability. Simulation experiments in a 1v1 soccer environment show that the proposed algorithm effectively predicts the opponent's actions, accelerates learning convergence, and improves the agent's competitive performance.
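A hedged sketch of the fusion idea, with all dimensions and names assumed: a softmax classifier is trained by supervised learning on logged opponent observation–action pairs, and its predicted action distribution is concatenated to the agent's own observation before being fed to any RL policy:

```python
import numpy as np

OBS_DIM, N_OPP_ACTIONS = 8, 5
W = np.zeros((N_OPP_ACTIONS, OBS_DIM))      # opponent-model weights

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def train_step(opp_obs, opp_action, lr=0.01):
    p = softmax(W @ opp_obs)                # cross-entropy gradient step
    grad = np.outer(p, opp_obs)
    grad[opp_action] -= opp_obs
    W -= lr * grad

def fused_observation(own_obs, opp_obs):
    opp_policy_features = softmax(W @ opp_obs)   # predicted distribution
    return np.concatenate([own_obs, opp_policy_features])
```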

8.
Research on Strategies for Multi-Agent Soccer Robots   Cited by: 1 (self-citations: 0, others: 1)
Strategy is the most fundamental element of robot soccer competition. By simulating a strategy used in actual simulated robot soccer matches on the FIRA 5-vs-5 simulation platform, multiple agent robots are made to cooperate with one another to score goals. The implementation of several strategies is analyzed, and the cooperative relationships among agent robots in different positions under different strategies are summarized. Simulation results show that the proposed multi-agent simulated-soccer strategy performs better.

9.
潘娅  王牛  许威 《计算机测量与控制》2006,14(10):1368-1370
Robot soccer is a standard international research platform for multi-agent action and behavior, and basic motion actions are the foundation of agent behavior. Based on an analysis of the FIRA semi-autonomous robot soccer system (MiroSot), this paper designs five basic actions in three categories: open-loop actions, simple-control actions, and move-to-a-point-then-turn-to-a-given-angle actions. The implementation of the control algorithms is described, and the motion performance of the actions is obtained through experiments. By combining and coordinating these actions in time and space, complex behaviors can be realized; their effectiveness has been confirmed in numerous domestic competitions.
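As one possible form of the "move to a point, then turn to a given angle" primitive (gains and the kinematic interface are assumptions, not taken from the paper), a standard differential-drive controller looks like this:

```python
import math

def goto_point_then_turn(pose, target, final_heading,
                         k_rho=0.8, k_alpha=2.0, k_theta=1.5):
    x, y, theta = pose
    tx, ty = target
    rho = math.hypot(tx - x, ty - y)              # distance to target
    if rho > 0.02:                                # phase 1: drive to point
        alpha = math.atan2(ty - y, tx - x) - theta
        alpha = math.atan2(math.sin(alpha), math.cos(alpha))  # wrap angle
        v = k_rho * rho * math.cos(alpha)
        w = k_alpha * alpha
    else:                                         # phase 2: turn in place
        err = final_heading - theta
        err = math.atan2(math.sin(err), math.cos(err))
        v, w = 0.0, k_theta * err
    return v, w                                   # linear, angular commands
```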

10.
Reinforcement learning studies how an agent makes good decisions based on its environment; its core is learning a policy. In traditional policy models, action selection depends mainly on state perception, historical memory, and model parameters, so the agent's behavior is hard to control. When human agents perform tasks, however, they usually choose behaviors according to their own wishes or motivations. Inspired by human decision-making, and to make action selection in reinforcement learning controllable so that the agent can choose actions according to an intention, this paper adds an intention variable to the policy model and proposes an intention-controlled policy-learning method for reinforcement learning. Specifically, maximizing the mutual information between the intention variable and the action makes the two highly correlated, so the policy can select actions conditioned on a given intention variable, thereby controlling the agent. Experiments on complex robot-control simulation tasks in MuJoCo verify that the proposed method can effectively control a robot's movement speed and heading through the intention variable.
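A heavily simplified sketch in the spirit of discriminator-based mutual-information lower bounds (e.g., DIAYN-style skill discovery); the linear discriminator, dimensions, and names are all assumptions. A classifier q(intention | action) is trained on sampled pairs, and log q(i|a) − log p(i) serves as an intrinsic reward that makes actions informative about the intention:

```python
import numpy as np

N_INTENTIONS, ACT_DIM = 4, 6
W = np.zeros((N_INTENTIONS, ACT_DIM))            # discriminator weights
log_p_i = np.log(1.0 / N_INTENTIONS)             # uniform intention prior

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def discriminator_update(action, intention, lr=0.01):
    p = softmax(W @ action)                      # cross-entropy step
    grad = np.outer(p, action)
    grad[intention] -= action
    W -= lr * grad

def intrinsic_reward(action, intention):
    logq = np.log(softmax(W @ action)[intention] + 1e-8)
    return logq - log_p_i      # MI lower-bound bonus added to the RL reward
```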

11.
In this paper, a multi-agent reinforcement learning method based on action prediction of other agents is proposed. In a multi-agent system, the action selection of a learning agent is unavoidably affected by the actions of other agents; therefore, joint states and joint actions are involved in a multi-agent reinforcement learning system. A novel agent action prediction method based on the probabilistic neural network (PNN) is proposed, in which the PNN is used to predict the actions of other agents. Furthermore, a policy-sharing mechanism is used to exchange the learned policies of multiple agents, the aim of which is to speed up learning. Finally, the application of the presented method to robot soccer is studied. Through learning, robot players can master the mapping from state information to the action space, and the coordination and cooperation of multiple robots are well realized.
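A probabilistic neural network is essentially a Parzen-window classifier; the following minimal sketch (state encoding and smoothing width assumed) predicts another agent's action as the class whose kernel-density estimate at the query state is largest:

```python
import numpy as np

class PNN:
    def __init__(self, sigma=0.5):
        self.sigma = sigma
        self.samples = {}                 # action -> list of state vectors

    def train(self, states, actions):
        for s, a in zip(states, actions):
            self.samples.setdefault(a, []).append(np.asarray(s))

    def predict(self, state):
        state = np.asarray(state)
        scores = {}
        for a, pts in self.samples.items():
            d2 = [np.sum((state - p) ** 2) for p in pts]
            # class-conditional density: mean of Gaussian kernels
            scores[a] = np.mean(np.exp(-np.array(d2) /
                                       (2 * self.sigma ** 2)))
        return max(scores, key=scores.get)

pnn = PNN()
pnn.train([[0.1, 0.2], [0.9, 0.8], [0.2, 0.1]], ["pass", "shoot", "pass"])
print(pnn.predict([0.15, 0.15]))          # -> "pass"
```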

12.
In this paper, we first discuss the meaning of physical embodiment and the complexity of the environment in the context of multi-agent learning. We then propose a vision-based reinforcement learning method that acquires cooperative behaviors in a dynamic environment. We use the robot soccer game initiated by RoboCup (Kitano et al., 1997) to illustrate the effectiveness of our method. Each agent works with other team members to achieve a common goal against opponents. Our method estimates the relationships between a learner's behaviors and those of other agents in the environment through interactions (observations and actions) using a technique from system identification. In order to identify the model of each agent, Akaike's Information Criterion is applied to the results of Canonical Variate Analysis to clarify the relationship between the observed data in terms of actions and future observations. Next, reinforcement learning based on the estimated state vectors is performed to obtain the optimal behavior policy. The proposed method is applied to a soccer playing situation. The method successfully models a rolling ball and other moving agents and acquires the learner's behaviors. Computer simulations and real experiments are shown and a discussion is given.

13.
The robot soccer game has been proposed as a benchmark problem for artificial intelligence and robotics research. The decision-making system is the most important part of the robot soccer system. As the environment is dynamic and complex, a reinforcement learning (RL) method named FNN-RL is employed to learn the decision-making strategy. The FNN-RL system consists of a fuzzy neural network (FNN) and RL: RL is used for structure identification and parameter tuning of the FNN, while the curse-of-dimensionality problem of RL is alleviated by the function-approximation capability of the FNN. Furthermore, the residual algorithm is used to calculate the gradient of the FNN-RL method in order to guarantee the convergence and rapidity of learning. The complex decision-making task is divided into multiple learning subtasks, including dynamic role assignment, action selection, and action implementation, which constitute a hierarchical learning system. We apply the proposed FNN-RL method to soccer agents that attempt to learn each subtask at the various layers. The effectiveness of the proposed method is demonstrated by simulations and real experiments.
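A sketch of the residual-algorithm update with linear features standing in for the FNN (Baird's residual gradient): the full gradient of the squared TD error also descends through the next-state value, which is what underpins the convergence guarantee:

```python
import numpy as np

n_features, n_actions = 16, 4
W = np.zeros((n_actions, n_features))

def phi(s):
    # Placeholder feature map; the paper uses FNN membership activations.
    v = np.zeros(n_features)
    v[hash(s) % n_features] = 1.0
    return v

def residual_update(s, a, r, s2, a2, alpha=0.05, gamma=0.9):
    delta = r + gamma * W[a2] @ phi(s2) - W[a] @ phi(s)   # TD error
    # Descend on delta^2 with respect to BOTH value estimates:
    W[a]  += alpha * delta * phi(s)
    W[a2] -= alpha * delta * gamma * phi(s2)
```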

14.
This paper proposes an intelligent task-planning and action-selection mechanism for a mobile robot in a robot soccer system through a fuzzy neural network approach. The proposed fuzzy neural network system is developed through two-dimensional fuzzification of the soccer field. A five-layer fuzzy neural network is trained through the error back-propagation learning algorithm to impart strategy-based action selection. The action selection depends on the field configuration, and the emergence of a particular field configuration results from the game dynamics. The strategy of the robot changes when the configuration of the objects in the field changes. The proposed fuzzy neural network structure is flexible enough to accommodate all possible field configurations. Simulation results indicate that the proposed approach is simple and capable of coordinating the multi-agent system through the selection of sensible actions.

15.
Multi-agent reinforcement learning methods suffer from several deficiencies that are rooted in the large state space of multi-agent environments. This paper tackles two of these deficiencies: the slow learning rate, and low-quality decision-making in the early stages of learning. The proposed methods are applied in a grid-world soccer game. In the proposed approach, modular reinforcement learning is applied to reduce the state space of the learning agents from exponential to linear in the number of agents. The modular model proposed here includes two new modules, a partial-module and a single-module, which are effective for increasing the speed of learning in a soccer game. We also apply instance-based learning concepts to choose proper actions in states that have not been experienced adequately during learning; the key idea is to use neighbouring states that have been explored sufficiently during the learning phase. The results of experiments in a grid-soccer game environment show that our proposed methods produce a higher average reward than the modular structure without them.
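A sketch of the two ideas with assumed interfaces: one small Q-table per module (e.g., per other agent) whose values are summed at decision time, plus a nearest-neighbour fallback for states a module has not visited:

```python
import numpy as np
from collections import defaultdict

ACTIONS = range(4)

class Module:
    def __init__(self):
        self.Q = defaultdict(lambda: np.zeros(len(ACTIONS)))

    def values(self, s):
        if s in self.Q:
            return self.Q[s]
        if not self.Q:
            return np.zeros(len(ACTIONS))
        # instance-based fallback: nearest visited state
        # (Euclidean distance over numeric state tuples is an assumption)
        near = min(self.Q, key=lambda v: np.sum((np.array(v) -
                                                 np.array(s)) ** 2))
        return self.Q[near]

    def update(self, s, a, r, s2, alpha=0.1, gamma=0.9):
        target = r + gamma * self.values(s2).max()
        self.Q[s][a] += alpha * (target - self.Q[s][a])

modules = [Module() for _ in range(3)]     # e.g., one per other agent

def select_action(module_states):
    total = sum(m.values(s) for m, s in zip(modules, module_states))
    return int(np.argmax(total))           # greedy over summed module values
```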

16.
A New Multi-Agent Q-Learning Algorithm   Cited by: 2 (self-citations: 0, others: 2)
郭锐  吴敏  彭军  彭姣  曹卫华 《自动化学报》2007,33(4):367-372
For multi-agent systems in non-deterministic Markov environments, a new multi-agent Q-learning algorithm is proposed. The algorithm learns the behavior policies of the other agents from statistics of their joint actions, and uses the full probability distribution of the agents' policy vectors to guarantee selection of the jointly optimal action. The convergence and learning performance of the algorithm are also analyzed. Its application in the multi-agent RoboCup system further demonstrates the algorithm's effectiveness and generalization ability.
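In the spirit of joint-action learners, selection against the full probability distribution might look like the following sketch (interfaces assumed): empirical action frequencies per state give the other agent's policy vector, and each own action is scored by its expected joint-action value:

```python
import numpy as np
from collections import defaultdict

N = 4                                             # actions per agent
Q = defaultdict(lambda: np.zeros((N, N)))         # Q[s][a_self, a_other]
freq = defaultdict(lambda: np.ones(N))            # Laplace-smoothed counts

def other_policy(s):
    c = freq[s]
    return c / c.sum()                            # full probability vector

def expected_values(s):
    return Q[s] @ other_policy(s)                 # E[Q(s, a, .)] per own a

def update(s, a_self, a_other, r, s2, alpha=0.1, gamma=0.9):
    freq[s][a_other] += 1                         # joint-action statistics
    target = r + gamma * expected_values(s2).max()
    Q[s][a_self, a_other] += alpha * (target - Q[s][a_self, a_other])
```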

17.
A Multi-Agent Cooperative Learning Algorithm Based on Quantum Computing   Cited by: 1 (self-citations: 0, others: 1)
To address the curse of dimensionality in actions and states in multi-agent cooperative reinforcement learning, and the fact that action selection admits multiple equilibria, so that converging to the best equilibrium requires searching the policy space and coordinating policy choices, a novel multi-agent cooperative learning algorithm based on quantum theory is proposed. Drawing on quantum computing theory, the new algorithm represents the agents' action and state spaces as quantum superposition states, uses quantum entangled states to coordinate policy selection, represents action-selection probabilities by probability amplitudes, and uses a quantum search algorithm to accelerate the agents' learning. Simulation results demonstrate the effectiveness of the new algorithm.
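The following is a loose classical simulation of the amplitude idea, not the paper's actual algorithm: selection probabilities are squared amplitudes (Born rule), and a reward-scaled amplification of the chosen amplitude followed by renormalisation stands in for the Grover-style update:

```python
import numpy as np

rng = np.random.default_rng(0)
n_actions = 4
amp = np.full(n_actions, 1.0 / np.sqrt(n_actions))   # uniform superposition

def select_action():
    probs = amp ** 2                                  # Born rule
    return rng.choice(n_actions, p=probs / probs.sum())

def reinforce(action, reward, k=0.1):
    amp[action] *= np.exp(k * reward)                 # amplify (or damp)
    amp[:] = amp / np.linalg.norm(amp)                # renormalise
```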

18.
This paper proposes a model-free learning scheme for the developmental acquisition of robot kinematic control and dexterous manipulation skills. The approach is based on a nested-hierarchical multi-agent architecture that intuitively encapsulates the topology of robot kinematic chains, where the activity of each independent degree-of-freedom (DOF) is finally mapped onto a distinct agent. Each one of those agents progressively evolves a local kinematic control strategy in a game-theoretic sense, that is, based on a partial (local) view of the whole system topology, which is incrementally updated through a recursive communication process according to the nested-hierarchical topology. Learning is thus approached not through demonstration and training but through an autonomous self-exploration process. A fuzzy reinforcement learning scheme is employed within each agent to enable efficient exploration in a continuous state–action domain. This paper constitutes in fact a proof of concept, demonstrating that global dexterous manipulation skills can indeed evolve through such a distributed iterative learning of local agent sensorimotor mappings. The main motivation behind the development of such an incremental multi-agent topology is to enhance system modularity, to facilitate extensibility to more complex problem domains and to improve robustness with respect to structural variations including unpredictable internal failures. These attributes of the proposed system are assessed in this paper through numerical experiments in different robot manipulation task scenarios, involving both single and multi-robot kinematic chains. The generalisation capacity of the learning scheme is experimentally assessed and robustness properties of the multi-agent system are also evaluated with respect to unpredictable variations in the kinematic topology. Furthermore, these numerical experiments demonstrate the scalability properties of the proposed nested-hierarchical architecture, where new agents can be recursively added in the hierarchy to encapsulate individual active DOFs. The results presented in this paper demonstrate the feasibility of such a distributed multi-agent control framework, showing that the solutions which emerge are plausible and near-optimal. Numerical efficiency and computational cost issues are also discussed.

19.
A distributed autonomous robotic system has the advantages of robustness and adaptability to dynamic environments; however, the system requires mutually cooperative behavior to achieve optimality. Acquiring actions through reinforcement learning is one known approach when multiple robots must cooperate on a complex task. This paper deals with a transport problem for multiple robots using the Q-learning algorithm. When a robot carries luggage, we regard it as leaving a volatile trace along its own path; another robot can then use the trace information to help the robot carrying the luggage. To address the resulting multi-agent reinforcement learning problems, a learning control method using a stress-antibody allotment reward is used. Moreover, we propose using the robots' trace information to encourage cooperative behavior in carrying luggage to a destination. The effectiveness of the proposed method is shown by simulation. This work was presented in part at the 13th International Symposium on Artificial Life and Robotics, Oita, Japan, January 31–February 2, 2008.
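A grid-based sketch of the volatile trace (all parameters assumed): a carrying robot deposits trace along its path, every cell evaporates a little each step, and helper robots read the local trace as an extra observation or shaping signal:

```python
import numpy as np

H, W = 10, 10
trace = np.zeros((H, W))
DEPOSIT, EVAPORATION = 1.0, 0.95

def deposit(pos):
    trace[pos] += DEPOSIT                 # called while carrying luggage

def evaporate():
    trace[:] *= EVAPORATION               # volatility: trace decays each step

def local_trace(pos):
    r, c = pos
    window = trace[max(0, r - 1):r + 2, max(0, c - 1):c + 2]
    return window.sum()                   # extra state feature for helpers
```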

20.
Building on the traditional Q-learning algorithm, a multi-agent system is introduced and a multi-agent joint Q-learning algorithm is proposed. The algorithm performs multi-agent learning under a single shared evaluation function, and the learning process takes into account the learning results of all cooperating agents. In RoboCup 2D soccer simulation matches, a field-state decomposition method is introduced to reduce the number of state components, and the optimal states obtained by joint learning are used as the optimal action set for multi-agent cooperation, effectively solving the passing strategy and cooperation problems among the simulated agents. Simulation and experimental results demonstrate the effectiveness and reliability of the algorithm.
