Similar Literature
20 similar documents retrieved.
1.
Autonomous navigation is a key technology for mobile robots. This paper combines reinforcement learning with fuzzy logic to achieve navigation control of an autonomous mobile robot in unknown environments. The paper first introduces the principles of reinforcement learning and then designs a robot navigation framework for unknown environments. The framework consists of a collision-avoidance module, a goal-seeking module, and a behavior-selection module. For this framework, a learning and planning algorithm based on reinforcement learning and fuzzy logic is proposed: after the collision-avoidance and goal-seeking behaviors are learned independently, environmental information obtained from ultrasonic sensors is used to select behaviors, so that the robot reaches the goal point while successfully avoiding collisions. Extensive simulation experiments demonstrate the effectiveness of the algorithm.
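The behavior-selection step described above can be approximated with a small fuzzy rule over sonar readings. The sketch below is a minimal illustration, not the authors' implementation; the membership function, thresholds, and behavior names are assumptions.

```python
# Hypothetical sketch: fuzzy arbitration between an obstacle-avoidance
# behavior and a goal-seeking behavior, in the spirit of the framework
# the abstract outlines. All names and parameters are illustrative.

def mu_near(d, d_safe=0.5, d_far=2.0):
    """Fuzzy membership of 'obstacle is near' for a sonar distance d (meters)."""
    if d <= d_safe:
        return 1.0
    if d >= d_far:
        return 0.0
    return (d_far - d) / (d_far - d_safe)

def select_behavior(sonar_readings):
    """Pick the active behavior from the most threatening sonar reading."""
    danger = max(mu_near(d) for d in sonar_readings)
    return "avoid" if danger > 0.5 else "seek_goal"
```

In a full system, the learned Q-values of each behavior would replace the fixed 0.5 threshold.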

2.
Because a single micro robot is limited in its own capabilities, multiple robots must work together to accomplish a given task, so cooperation among robots is especially important in the micromanipulation domain. This paper applies reinforcement learning to give micro mobile robots a degree of learning ability, enhancing their adaptability to uncertain environments, and adopts a behavior-based cooperation architecture for a group of autonomous micro mobile robots, applied to fault removal. Simulation results demonstrate the effectiveness of this architecture.

3.
A learning controller design method for two-wheel-drive mobile robots   Cited by: 1 (self-citations: 0, others: 1)
A reinforcement-learning-based path-following control method for two-wheel-drive mobile robots is proposed. The optimal design of the robot's motion controller is modeled as a Markov decision process, and the kernel-based least-squares policy iteration (KLSPI) algorithm is used to optimize the controller parameters through self-learning. Unlike traditional tabular and neural-network-based reinforcement learning methods, KLSPI applies kernel methods for feature selection and value-function approximation in policy evaluation, improving generalization performance and learning efficiency. Simulation results show that the method obtains an optimized path-following control policy within a small number of iterations, which favors its adoption in practical applications.

4.
A multi-agent reinforcement learning algorithm with a fuzzy policy is addressed in this paper. The algorithm is used to deal with control problems in cooperative multi-robot systems. Specifically, a leader-follower robotic system and a flocking system are investigated. In the leader-follower system, the leader robot tries to track a desired trajectory, while the follower robot tries to follow the leader to keep a formation. Two different fuzzy policies are developed for the leader and the follower, respectively. In the flocking system, multiple robots adopt the same fuzzy policy to flock. Initial fuzzy policies are manually crafted for these cooperative behaviors. The proposed learning algorithm fine-tunes the parameters of the fuzzy policies through the policy gradient approach to improve control performance. Simulation results demonstrate that control performance improves after learning.
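To illustrate the policy-gradient fine-tuning idea, the sketch below adjusts a two-parameter policy with a finite-difference gradient estimate of the episode return. The quadratic stand-in for `rollout_return` and all hyperparameters are assumptions, not the paper's leader-follower simulation.

```python
# Minimal sketch of policy-gradient tuning of policy parameters.
# Assumption: a toy "return" with a known optimum at theta = [1.0, -2.0]
# stands in for a simulated leader-follower episode.

def rollout_return(theta):
    return -((theta[0] - 1.0) ** 2 + (theta[1] + 2.0) ** 2)

def fd_policy_gradient_step(theta, eps=1e-3, lr=0.1):
    # Central finite-difference estimate of the return gradient,
    # followed by one ascent step on the policy parameters.
    grad = []
    for i in range(len(theta)):
        up, dn = list(theta), list(theta)
        up[i] += eps
        dn[i] -= eps
        grad.append((rollout_return(up) - rollout_return(dn)) / (2 * eps))
    return [t + lr * g for t, g in zip(theta, grad)]

theta = [0.0, 0.0]
for _ in range(200):
    theta = fd_policy_gradient_step(theta)
```

In the paper's setting, theta would be the tunable parameters of the fuzzy policy and each return evaluation a full simulation rollout.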

5.
The distributed autonomous robotic system has the advantages of robustness and adaptability to dynamic environments; however, mutually cooperative behavior is required for the system to perform optimally. Acquiring actions through reinforcement learning is one known approach when multiple robots must cooperate on a complex task. This paper deals with a transportation problem for multiple robots using the Q-learning algorithm. When a robot carries luggage, it leaves a volatile trace along its own path, and another robot can then use this trace information to assist the carrying robot. To address the resulting multi-agent reinforcement learning problems, a learning control method using a stress-antibody allotment reward is used. Moreover, we propose using the robots' trace information to encourage cooperative behavior in carrying luggage to a destination. The effectiveness of the proposed method is shown by simulation. This work was presented in part at the 13th International Symposium on Artificial Life and Robotics, Oita, Japan, January 31–February 2, 2008.
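The volatile-trace mechanism can be sketched as a grid of decaying deposits: a carrying robot leaves trace values along its path, and helpers move toward the strongest nearby trace. Grid size, deposit amount, and evaporation rate below are assumptions; this is only an illustration of the idea, not the paper's method.

```python
# Toy volatile trace field: deposits decay each time step, so helper
# robots are drawn only toward recent routes of the carrying robot.

class TraceField:
    def __init__(self, width, height, deposit=1.0, evaporation=0.1):
        self.t = [[0.0] * width for _ in range(height)]
        self.deposit = deposit
        self.evaporation = evaporation

    def leave_trace(self, x, y):
        self.t[y][x] += self.deposit

    def step(self):
        # Volatility: every cell decays toward zero each time step.
        for row in self.t:
            for x in range(len(row)):
                row[x] *= (1.0 - self.evaporation)

    def strongest_neighbor(self, x, y):
        # A helper robot moves to the adjacent cell with the strongest trace.
        candidates = [(x + dx, y + dy)
                      for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1))
                      if 0 <= x + dx < len(self.t[0]) and 0 <= y + dy < len(self.t)]
        return max(candidates, key=lambda c: self.t[c[1]][c[0]])
```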

6.
Cooperative learning for autonomous micro mobile robots is a major research direction in multi-agent robotic systems theory. Because a single micro mobile robot has limited capabilities, cooperation among robots is very important in certain key basic industries and in biomedical applications. This paper introduces several methods for cooperative learning, compares their advantages and disadvantages, and finally gives a brief overview of related research work.

7.
In this paper, we propose fuzzy logic-based cooperative reinforcement learning for sharing knowledge among autonomous robots. The ultimate goal of this paper is to entice bio-insects towards desired goal areas using artificial robots without any human aid. To achieve this goal, we found an interaction mechanism using a specific odor source and performed simulations and experiments [1]. For efficient learning without human aid, we employ cooperative reinforcement learning in a multi-agent domain. Additionally, we design a fuzzy logic-based expertise measurement system to enhance the learning ability. This structure enables the artificial robots to share knowledge while evaluating and measuring the performance of each robot. Through numerous experiments, the performance of the proposed learning algorithms is evaluated.

8.
A classifier system for the reinforcement learning control of autonomous mobile robots is proposed. The classifier system contains action selection, rule reproduction, and credit assignment mechanisms. An important feature of the classifier system is that it operates with continuous sensor and action spaces. The system is applied to the control of mobile robots. The local controllers use independent classifiers specified at the wheel level. The controllers work autonomously and, with respect to each other, represent dynamic systems connected through the external environment. The feasibility of the proposed system is tested in an experiment with a Khepera robot. It is shown that some patterns of global behavior can emerge from locally organized classifiers. This work was presented, in part, at the Third International Symposium on Artificial Life and Robotics, Oita, Japan, January 19–21, 1998.

9.
Based on multi-agent systems theory, this paper studies the principles, methods, and techniques for implementing task-oriented coordinated control of multiple robots in deterministic environments, as well as methods for control integration. An experimental platform for coordinated robot control was developed; the key technologies of planning, control, sensing, communication, coordination, and cooperation were implemented and integrated, achieving real-time tracking between two ground robots and coordinated motion control of three robots. This experimental work orients research on distributed multi-robot coordination directly toward practical applications.

10.
Multi-agent reinforcement learning and its application to role assignment in robot soccer   Cited by: 2 (self-citations: 0, others: 2)
A robot soccer system is a typical multi-agent system: each robot player's choice of action depends not only on its own state but also on the other players, so implementing decision policies for soccer robots through reinforcement learning requires joint states and joint actions. This paper studies a multi-agent reinforcement learning algorithm based on predicting other agents' actions, using a naive Bayes classifier for the prediction. A policy-sharing mechanism is introduced to exchange the policies learned by the agents and thereby speed up multi-agent reinforcement learning. Finally, the application of the proposed method to dynamic role assignment in robot soccer is studied, achieving division of labor and cooperation among multiple robots.
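The action-prediction step can be illustrated with a counting-based naive Bayes classifier over discrete state features. The sketch below uses Laplace smoothing; the feature values and action names are hypothetical, and this is not the authors' code.

```python
# Illustrative naive Bayes predictor for a teammate's next action,
# built from Laplace-smoothed observation counts.
from collections import defaultdict

class NaiveBayesActionPredictor:
    def __init__(self):
        self.action_counts = defaultdict(int)
        self.feature_counts = defaultdict(int)  # (action, slot, value) -> count

    def observe(self, features, action):
        self.action_counts[action] += 1
        for i, v in enumerate(features):
            self.feature_counts[(action, i, v)] += 1

    def predict(self, features):
        total = sum(self.action_counts.values())
        best, best_score = None, float("-inf")
        for a, c in self.action_counts.items():
            # Smoothed prior times smoothed per-feature likelihoods.
            score = (c + 1) / (total + len(self.action_counts))
            for i, v in enumerate(features):
                score *= (self.feature_counts[(a, i, v)] + 1) / (c + 2)
            if score > best_score:
                best, best_score = a, score
        return best
```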

11.
Patrolling indoor infrastructures with a team of cooperative mobile robots is a challenging task, which requires effective multi-agent coordination. Deterministic patrol circuits for multiple mobile robots have become popular due to their strong performance. However, their predefined nature does not allow the system to react to changes in the system's conditions or adapt to unexpected situations such as robot failures, thus requiring recovery behaviors in such cases. In this article, a probabilistic multi-robot patrolling strategy is proposed. A team of concurrent learning agents adapt their moves to the current state of the system, using Bayesian decision rules and distributed intelligence. When patrolling a given site, each agent evaluates the context and adopts a reward-based learning technique that influences future moves. Extensive results obtained in simulation and in real-world experiments in a large indoor environment show the potential of the approach, which outperforms several state-of-the-art strategies.

12.
Autonomous navigation control of mobile robots based on emotion and environment cognition   Cited by: 2 (self-citations: 0, others: 2)
A learning and decision-making model based on emotion and cognition is introduced into a behavior-based mobile robot control architecture, and a new autonomous navigation control system is designed. The dynamical systems approach is used to design the basic behaviors, and an ART2 neural network classifies the continuous environmental perception states; the classification results serve as the cognitive environment states in the learning and decision-making algorithm. Through online learning of emotion and environment cognition, a reasonable behavior coordination mechanism is formed. Simulations show that emotion and environment cognition clearly improve the efficiency of learning and decision-making and enhance the autonomous navigation ability of behavior-based mobile robots in unknown environments.

13.
The Behavior Based Locomotion Controller (BBLC) extends the applicability of the behavior based control (BBC) architecture to redundant systems with multiple task-space motions. A set of control behaviors are attributed to each task-space motion individually and a reinforcement learning algorithm is used to select the combination of behaviors which can achieve the control objective. The resulting behavior combination is an emergent control behavior robust to unknown environments due to the added learning capability. Hence, the BBLC is applicable to complex redundant systems operating in unknown environments, where the emergent control behaviors can satisfy higher level control objectives such as balance in locomotion. The balance control problem of two robotic systems, a bipedal robot walker and a mobile manipulator, are used to study the performance of this controller. Results show that the BBLC strategy can generate emergent balancing strategies capable of adapting to new unknown disturbances from the environment, using only a small fixed library of balancing behaviors.

14.
To achieve efficient and objective search tasks in an unknown environment, a cooperative search strategy for distributed autonomous mobile robots is developed using a behavior‐based control framework with individual and group behaviors. The sensing information of each mobile robot activates the individual behaviors to facilitate autonomous search tasks to avoid obstacles. An 802.15.4 ZigBee wireless sensor network then activates the group behaviors that enable cooperative search among the mobile robots. An unknown environment is dynamically divided into several sub‐areas according to the locations and sensing data of the autonomous mobile robots. The group behaviors then enable the distributed autonomous mobile robots to scatter and move in the search environment. The developed cooperative search strategy successfully reduces the search time within the test environments by 22.67% (simulation results) and 31.15% (experimental results).

15.
Robust motion control is fundamental to autonomous mobile robots. In the past few years, reinforcement learning (RL) has attracted considerable attention in the feedback control of wheeled mobile robots. However, it is still difficult for RL to solve problems with large or continuous state spaces, which are common in robotics. To improve the generalization ability of RL, this paper presents a novel hierarchical RL approach for optimal path tracking of wheeled mobile robots. In the proposed approach, a graph Laplacian-based hierarchical approximate policy iteration (GHAPI) algorithm is developed, in which the basis functions are constructed automatically using the graph Laplacian operator. In GHAPI, the state space of a Markov decision process is divided into several subspaces and approximate policy iteration is carried out on each subspace. Then, a near-optimal path-tracking control strategy can be obtained by GHAPI combined with proportional-derivative (PD) control. The performance of the proposed approach is evaluated by using a P3-AT wheeled mobile robot. It is demonstrated that the GHAPI-based PD control can obtain better near-optimal control policies than previous approaches.
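The core of the graph-Laplacian construction can be illustrated as follows: build the combinatorial Laplacian L = D - W of a state-connectivity graph and keep the eigenvectors with the smallest eigenvalues as smooth basis functions for value-function approximation. This is only a sketch of the idea behind GHAPI, not the authors' code; the 4-state chain graph is an assumption.

```python
# Graph-Laplacian basis functions on a small state graph.
import numpy as np

def laplacian_basis(adjacency, k):
    W = np.asarray(adjacency, dtype=float)
    D = np.diag(W.sum(axis=1))
    L = D - W                              # combinatorial Laplacian
    eigvals, eigvecs = np.linalg.eigh(L)   # eigenvalues in ascending order
    return eigvecs[:, :k]                  # k smoothest eigenvectors as bases

# Toy 4-state chain graph: 0 - 1 - 2 - 3
A = [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]]
phi = laplacian_basis(A, 2)
```

Each row of `phi` is the feature vector of one state; a value function is then fit as a linear combination of these columns.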

16.
Reinforcement learning (RL) is a popular method for solving the path planning problem of autonomous mobile robots in unknown environments. However, the primary difficulty faced by learning robots using RL is that they learn too slowly in obstacle-dense environments. To solve the path planning problem more efficiently in such environments, this paper presents a novel approach in which the robot's learning process is divided into two phases. The first phase accelerates learning of an optimal policy by extending the well-known Dyna-Q algorithm, training the robot to learn obstacle-avoiding actions while following the vector direction toward the goal. In this phase, the robot's position is represented on a uniform grid; at each time step the robot moves to one of its eight adjacent cells, so the path obtained from the optimal policy may be longer than the true shortest path. The second phase trains the robot to learn a collision-free smooth path that reduces the number of heading changes. Simulation results show that the proposed approach is efficient for path planning of autonomous mobile robots in unknown environments with dense obstacles.
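The Dyna-Q component of the first phase can be sketched in tabular form: each real transition performs a Q-learning update and is stored in a deterministic model, which is then replayed for n planning updates. The hyperparameters and the toy 1-D corridor environment below are assumptions, not the paper's grid world.

```python
# Tabular Dyna-Q sketch: direct RL updates plus model-based planning replays.
import random

def dyna_q(step_fn, states, actions, goal, episodes=200, n_planning=10,
           alpha=0.5, gamma=0.95, epsilon=0.1, start=0):
    Q = {(s, a): 0.0 for s in states for a in actions}
    model = {}                                   # (s, a) -> (reward, s')
    for _ in range(episodes):
        s = start
        while s != goal:
            a = (random.choice(actions) if random.random() < epsilon
                 else max(actions, key=lambda act: Q[(s, act)]))
            r, s2 = step_fn(s, a)
            # Direct RL update from the real transition.
            Q[(s, a)] += alpha * (r + gamma * max(Q[(s2, b)] for b in actions) - Q[(s, a)])
            model[(s, a)] = (r, s2)
            # Planning: replay n simulated transitions from the learned model.
            for _ in range(n_planning):
                ps, pa = random.choice(list(model))
                pr, ps2 = model[(ps, pa)]
                Q[(ps, pa)] += alpha * (pr + gamma * max(Q[(ps2, b)] for b in actions) - Q[(ps, pa)])
            s = s2
    return Q

def corridor(s, a):
    # Toy 1-D corridor over cells 0..4; reaching cell 4 yields reward 1.
    s2 = min(4, max(0, s + a))
    return (1.0 if s2 == 4 else 0.0), s2

random.seed(0)
Q = dyna_q(corridor, states=range(5), actions=[-1, 1], goal=4)
```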

17.
This paper introduces a nonlinear oscillator scheme to control autonomous mobile robots. The method is based on observations of a successful control mechanism used in nature, the Central Pattern Generator. Simulations were used to assess the performance of the oscillator controller when implementing several behaviors in an autonomous robot operating in a closed arena. A sequence of basic behaviors (random wandering, obstacle avoidance, and light following) was coordinated in the robot to produce the higher-level behavior of foraging for light. The controller is explored in simulations and in tests on physical robots. It is shown that the oscillator-based controller outperforms a reactive controller in the tasks of exploring an arena with irregular walls and searching for light.
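A CPG-style controller can be built from limit-cycle oscillators. The sketch below integrates a Hopf oscillator, whose state converges to a circle of radius √mu regardless of the starting point and could be mapped to wheel speeds; it is an illustration of the oscillator idea under assumed parameters, not the paper's controller.

```python
# Hopf oscillator integrated with explicit Euler steps; the (x, y) state
# settles onto a stable limit cycle of radius sqrt(mu).
import math

def hopf_step(x, y, mu=1.0, omega=1.0, dt=0.01):
    r2 = x * x + y * y
    dx = (mu - r2) * x - omega * y
    dy = (mu - r2) * y + omega * x
    return x + dt * dx, y + dt * dy

x, y = 0.1, 0.0               # start well inside the limit cycle
for _ in range(5000):
    x, y = hopf_step(x, y)
radius = math.hypot(x, y)
```

In a robot, x and y (or their phase) would drive the left and right wheel speed commands, and sensor input would perturb mu or omega to switch behaviors.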

18.
Research on reinforcement learning algorithms for dynamic multi-robot formation   Cited by: 8 (self-citations: 0, others: 8)
In artificial intelligence, reinforcement learning has attracted wide attention for its self-learning and adaptive properties. With the continuing development of multi-agent theory in distributed artificial intelligence, distributed reinforcement learning algorithms have gradually become a research focus. This paper first reviews the state of reinforcement learning research and then, taking dynamic multi-robot formation as the model problem, describes how distributed reinforcement learning can implement multi-robot behavior control. A SOM neural network partitions the state space autonomously to speed up learning; a BP neural network implements the reinforcement learning to strengthen the system's generalization ability; and internal and external reinforcement signals are combined to balance each robot's individual interest against the interest of the whole team. To make the control task explicit, the system uses blackboard communication for hierarchical control. Simulation experiments demonstrate the effectiveness of the method.
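The SOM-based state partition can be sketched with a small one-dimensional map: the index of the winning node becomes the discrete state fed to the reinforcement learner. Map size, learning rate, and the crude neighborhood kernel below are assumptions, not the paper's network.

```python
# Toy 1-D self-organizing map that clusters 2-D state vectors.
import random

def train_som(samples, n_nodes=4, epochs=50, lr=0.3, radius=1):
    random.seed(1)                       # deterministic toy initialization
    w = [list(random.choice(samples)) for _ in range(n_nodes)]
    for _ in range(epochs):
        for s in samples:
            win = min(range(n_nodes),
                      key=lambda i: sum((w[i][d] - s[d]) ** 2 for d in range(len(s))))
            for i in range(n_nodes):
                if abs(i - win) <= radius:
                    h = 1.0 if i == win else 0.5   # crude neighborhood kernel
                    for d in range(len(s)):
                        w[i][d] += lr * h * (s[d] - w[i][d])
    return w

def discrete_state(w, s):
    # The winning node index is the discrete state for the RL module.
    return min(range(len(w)),
               key=lambda i: sum((w[i][d] - s[d]) ** 2 for d in range(len(s))))
```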

19.
In this paper, we propose a distributed dynamic correlation matrix based multi-Q (D-DCM-Multi-Q) learning method for multi-robot systems. First, a dynamic correlation matrix is proposed for multi-agent reinforcement learning, which not only considers each individual robot’s Q-value, but also the correlated Q-values of neighboring robots. Then, the theoretical analysis of the system convergence for this D-DCM-Multi-Q method is provided. Various simulations for multi-robot foraging as well as a proof-of-concept experiment with a physical multi-robot system have been conducted to evaluate the proposed D-DCM-Multi-Q method. The extensive simulation/experimental results show the effectiveness, robustness, and stability of the proposed method.

20.
The aim is to study the cooperation process among robots in a heterogeneous multi-robot system. Based on a robot soccer scenario, the tasks of the heterogeneous multi-robot system are divided into several jobs such as finding the ball, following, and kicking. Taking a humanoid robot and a wheeled robot as the research objects and assigning them different functions, the robots' capabilities are modeled. How to assign tasks to the executing robots in an optimized way is discussed, and a reference model is established. Finally, the robots' behavior control is illustrated with flowcharts. Practice shows that robots with different capabilities cooperating together can complete tasks more effectively.


Copyright © Beijing Qinyun Technology Development Co., Ltd.  京ICP备09084417号