Similar Literature
20 similar documents found.
1.
Reinforcement learning is an effective way to improve the efficiency with which robots complete tasks. Currently popular methods generally use the cumulative discounted return, but in some respects the average return is better suited to multi-robot cooperation: the discounted-return criterion improves performance at the level of individual robot actions, yet does not yield good cooperation at the level of multi-robot tasks, whereas the average-return criterion does. This paper applies average-reward Monte Carlo learning to multi-robot cooperation with good learning results; experiments on real robots show that the average-reward method outperforms the cumulative discounted-return method.
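The contrast this abstract draws can be made concrete with a tiny Monte Carlo sketch; the episode rewards and the discount factor below are illustrative assumptions, not data from the paper.

```python
# Monte Carlo value estimates under the two return criteria contrasted above.
# The episode rewards and the discount factor are illustrative assumptions.

def discounted_return(rewards, gamma=0.9):
    """Cumulative discounted return: G = r0 + gamma*r1 + gamma^2*r2 + ..."""
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
    return g

def average_return(rewards):
    """Average reward per step over the episode."""
    return sum(rewards) / len(rewards)

# One sampled episode of per-step rewards for a cooperating robot.
episode = [0, 0, 1, 0, 5]
print(discounted_return(episode))
print(average_return(episode))     # 1.2
```

Under the discounted criterion the large final reward is heavily attenuated, while the average criterion weights every step equally, which is the property the paper exploits for task-level cooperation.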

2.
Research on a Multi-Robot Formation System with Environmental Adaptability
张汝波  王兢  孙世良 《机器人》2004,26(1):69-73
This paper studies the architecture of multi-robot systems. A method combining a space-time table with a time controller is used to solve the problem of coordination and cooperation among multiple robots. For the specific characteristics of the formation problem, an environment-based memory learning method is proposed, giving the multi-robot formation system strong environmental adaptability. Finally, the complete multi-robot system is implemented in simulation, further verifying the feasibility and effectiveness of each algorithm.

3.
聂仙丽  蒋平  陈辉堂 《机器人》2003,25(4):308-312
Building on robots that already possess basic motion skills [1], this paper adopts an instruction-based teaching method: the robot is taught to complete abstract tasks through natural language, and the learned knowledge is stored in the form of program bodies, i.e., program flows are generated automatically from natural-language dialogue. The feasibility of the proposed natural-language programming method is verified by having the robot complete navigation and other tasks.

4.
Application of Reinforcement Learning in Robot Soccer
Robot soccer is an interesting and complex emerging field of artificial-intelligence research, and a typical multi-agent system. This paper applies reinforcement learning to the action-selection problem of soccer robots, extending the single-agent reinforcement learning method to a multi-agent reinforcement learning method, and presents experimental results.
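As an illustration of the multi-agent extension this abstract describes, each robot can keep its own independent Q-table and select actions epsilon-greedily; the states, action names and parameters below are hypothetical, not taken from the paper.

```python
import random

# Each soccer robot runs its own Q-learning agent. The action set and the
# state labels are illustrative assumptions for this sketch.

ACTIONS = ["shoot", "pass", "dribble", "defend"]

class QAgent:
    def __init__(self, epsilon=0.1, alpha=0.5, gamma=0.9):
        self.q = {}                      # (state, action) -> value
        self.epsilon, self.alpha, self.gamma = epsilon, alpha, gamma

    def select(self, state):
        if random.random() < self.epsilon:
            return random.choice(ACTIONS)         # explore
        return max(ACTIONS, key=lambda a: self.q.get((state, a), 0.0))

    def update(self, s, a, r, s_next):
        best_next = max(self.q.get((s_next, b), 0.0) for b in ACTIONS)
        old = self.q.get((s, a), 0.0)
        self.q[(s, a)] = old + self.alpha * (r + self.gamma * best_next - old)

# Three teammates, each learning independently from its own experience.
team = [QAgent() for _ in range(3)]
for agent in team:
    agent.update("near_goal", "shoot", 1.0, "kickoff")
```

Richer multi-agent schemes share rewards or observations between agents, but even this independent-learner form already captures the per-robot action-selection problem the paper studies.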

5.
A Survey of Key Technologies for Hazardous-Operation Robots
With the development of society, people increasingly hope to have robots take over dull, heavy and dangerous work. Robots can be divided into industrial robots and special-purpose robots: industrial robots are multi-joint manipulators or multi-degree-of-freedom robots for industrial applications, while special-purpose robots include service robots, educational robots, entertainment robots, military robots, agricultural robots and others. Robots do not simply replace human labour; they also take on work that is unsuitable for people to perform directly or that people cannot do well. For example, a robot can enter a patient's body to perform examination and treatment, …

6.
For the characteristics of mine-disaster environments, a wheel-leg hybrid robot was designed using 3D modelling software. The robot adopts an integrated wheel-leg structure, combining the locomotion advantages of legged and wheeled robots. The travel modes (i.e., gaits) the robot uses in different environments are analysed, strengthening its environmental adaptability, and a motion control system based on multi-sensor information is designed. The system can perform environment exploration, information acquisition and gait control in a disaster-stricken mine, providing important information for mine rescue work.

7.
Fuzzy-Supervised Neural Network Controller Technology for Excavation Robots
龚向东  王建治 《机器人》1996,18(5):316-320
This paper introduces the application of a rule-based self-learning neural network controller to an excavation robot. Based on the results of real-time execution, a multi-step learning, fuzzy-supervised learning method is used to correct the neural network's teacher signal, which simplifies the control algorithm, improves computational real-time performance and speeds up learning. Experiments verify the results obtained with this method.

8.
A Distributed Predictive Graphical Simulation System for Multi-Robot Teleoperation
In teleoperated robot systems, communication delays can destabilize the control system and thereby reduce the efficiency and safety of teleoperation; predictive simulation is currently the usual remedy. A multi-robot teleoperation system must not only overcome delay but also control the robots so that they complete teleoperation tasks in coordination. We developed a distributed predictive graphical simulation system for multi-operator, multi-robot teleoperation that realizes predictive simulation of the multi-robot teleoperation system: several operators can each teleoperate their own robot through a human-machine interface and coordinate with one another to complete the task. Preliminary experiments show that the system overcomes the effects of delay and achieves coordinated multi-operator, multi-robot teleoperation, which is of theoretical reference value for research on space-station robot science experiments, multi-vehicle docking and related topics.

9.
In cooperative multi-robot transport, traditional reinforcement learning algorithms rely solely on numerical analysis and neglect reasoning. This paper combines the independent reinforcement learning of multiple robots with the Belief-Desire-Intention (BDI) model, giving the multi-robot system logical reasoning ability. By a nearest-distance rule, the robot closest to an obstacle is chosen as the master robot and directs the motion of the slave robots. An evaluation function that varies with the positions of the multi-robot system and the nearest obstacle is proposed and combined with reinforcement-learning-based behaviour weights; as the robots continually interact with the environment, the behaviour weights gradually approach the optimum. Simulation results show that the method is feasible and successfully accomplishes the cooperative transport task.
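Two of the mechanisms this abstract names (choosing the robot nearest the obstacle as master, and nudging behaviour weights with a reinforcement signal) can be sketched as follows; the positions, behaviour names and update rule are illustrative assumptions, not the paper's formulas.

```python
import math

# Sketch of master selection and reinforcement-driven behaviour weighting.
# All coordinates, behaviour names and the learning rate are assumptions.

def choose_master(robot_positions, obstacle):
    """Return the index of the robot closest to the obstacle (the master)."""
    return min(range(len(robot_positions)),
               key=lambda i: math.dist(robot_positions[i], obstacle))

def update_weights(weights, chosen, reward, lr=0.1):
    """Increase the weight of the chosen behaviour in proportion to reward."""
    weights = dict(weights)
    weights[chosen] += lr * reward
    return weights

robots = [(0.0, 0.0), (2.0, 2.0), (5.0, 1.0)]
master = choose_master(robots, obstacle=(4.0, 1.0))   # robot 2 is closest
weights = update_weights({"push": 0.5, "avoid": 0.5}, "avoid", reward=1.0)
```

In the paper the BDI layer supplies the reasoning about which behaviour to intend; the numeric weight update above only sketches the reinforcement side of that combination.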

10.
Research and Implementation of Automatic Evolution of Gait Parameters for a Quadruped Robot
Using an evolutionary algorithm and a fitness-evaluation method based on autonomous vision, on-line automatic evolution of a quadruped robot's walking gait was achieved on-site at a RoboCup robot soccer competition. We introduce interpolation as the crossover method, use a PC base station for the evolutionary computation and overall process control, and adopt several strategies to shorten learning time. The evolutionary learning is made continuous and extensible, so that the learning process completes within 40-60 min; the ERS-7 quadruped robot can therefore re-learn its walk at the competition venue, improving the adaptability of walking control. The final result of the algorithm raised the ERS-7's walking speed from 27 cm/s to 43 cm/s.
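The interpolation crossover mentioned in the abstract can be sketched inside a simple evolutionary loop; the fitness function below is a stand-in assumption (on the real robot, fitness came from vision-based speed measurement), and the population sizes are invented for illustration.

```python
import random

# Minimal evolutionary loop for gait parameters with interpolation crossover.
# The fitness function and all sizes are illustrative assumptions.

def interpolate_crossover(p1, p2):
    """Child parameters lie on the line segment between the two parents."""
    t = random.random()
    return [a + t * (b - a) for a, b in zip(p1, p2)]

def evolve(fitness, dim=4, pop_size=10, generations=20):
    pop = [[random.uniform(0, 1) for _ in range(dim)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]               # keep the fastest gaits
        children = [interpolate_crossover(random.choice(parents),
                                          random.choice(parents))
                    for _ in range(pop_size - len(parents))]
        pop = parents + children
    return max(pop, key=fitness)

# Toy fitness: walking speed peaks when every parameter is near 0.8.
best = evolve(lambda p: -sum((x - 0.8) ** 2 for x in p))
```

Because interpolation keeps children inside the convex hull of the parents, each generation's gait parameters stay within the region already sampled, which suits an on-robot setting where wild parameter jumps could damage hardware.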

11.
Reinforcement learning (RL) is a popular method for solving the path planning problem of autonomous mobile robots in unknown environments. However, the primary difficulty faced by learning robots using the RL method is that they learn too slowly in obstacle-dense environments. To solve the path planning problem more efficiently in such environments, this paper presents a novel approach in which the robot's learning process is divided into two phases. The first phase accelerates learning of an optimal policy by extending the well-known Dyna-Q algorithm, training the robot to learn obstacle-avoiding actions while following the vector direction. In this phase, the robot's position is represented on a uniform grid; at each time step the robot moves to one of its eight adjacent cells, so the path obtained from the optimal policy may be longer than the true shortest path. The second phase trains the robot to learn a collision-free smooth path that decreases the number of heading changes. Simulation results show that the proposed approach is efficient for the path planning problem of autonomous mobile robots in unknown environments with dense obstacles.
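The Dyna-Q component of the first phase can be sketched as a single learning step: one direct Q-update from the real transition, then several simulated updates replayed from a learned model. The grid cell, action names and hyper-parameters below are illustrative assumptions, not values from the paper.

```python
import random

# One Dyna-Q step on a uniform grid: direct RL update, model learning,
# then n_plan simulated planning updates from stored transitions.

def dyna_q_step(q, model, s, a, r, s2, actions,
                alpha=0.1, gamma=0.95, n_plan=10):
    def best(state):
        return max(q.get((state, b), 0.0) for b in actions)
    # Direct RL update from the real transition (s, a, r, s2).
    q[(s, a)] = q.get((s, a), 0.0) + alpha * (r + gamma * best(s2) - q.get((s, a), 0.0))
    # Learn the deterministic model, then plan with simulated replays.
    model[(s, a)] = (r, s2)
    for _ in range(n_plan):
        (ps, pa), (pr, ps2) = random.choice(list(model.items()))
        q[(ps, pa)] = q.get((ps, pa), 0.0) + alpha * (pr + gamma * best(ps2) - q.get((ps, pa), 0.0))

q, model = {}, {}
# Moving east from cell (0, 0) costs a step penalty of -1.
dyna_q_step(q, model, s=(0, 0), a="E", r=-1.0, s2=(0, 1), actions="NESW")
```

The planning replays are what give Dyna-Q its speed-up over plain Q-learning: each real step is amortized across many cheap simulated updates, which is exactly the acceleration the paper needs in obstacle-dense grids.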

12.
Considering the wide range of possible behaviours to be acquired for domestic robots, applying a single learning method is clearly insufficient. In this paper, we propose a new strategy for behaviour acquisition for domestic robots where the behaviours are acquired using multiple differing learning methods that are subsequently incorporated into a common behaviour selection system, enabling them to be performed in appropriate situations. An example of the implementation of this strategy applied to the entertainment humanoid robot QRIO is introduced and the results are discussed.

13.
In this paper, we tackle the problem of multimodal learning for autonomous robots. Autonomous robots interacting with humans in an evolving environment need the ability to acquire knowledge from their multiple perceptual channels in an unsupervised way. Most of the approaches in the literature exploit engineered methods to process each perceptual modality. In contrast, robots should be able to acquire their own features from the raw sensors, leveraging the information elicited by interaction with their environment: learning from their sensorimotor experience would result in a more efficient strategy in a life-long perspective. To this end, we propose an architecture based on deep networks, which is used by the humanoid robot iCub to learn a task from multiple perceptual modalities (proprioception, vision, audition). By structuring high-dimensional, multimodal information into a set of distinct sub-manifolds in a fully unsupervised way, it performs a substantial dimensionality reduction by providing both a symbolic representation of data and a fine discrimination between two similar stimuli. Moreover, the proposed network is able to exploit multimodal correlations to improve the representation of each modality alone.

14.
Because a single micro-robot is limited in its own capabilities, multiple robots must work together to complete a given task, so cooperation among robots is especially important in the field of micro-manipulation. This paper uses reinforcement learning to give micro-robots a degree of learning ability, improving their adaptability to uncertain environments, and adopts a behaviour-based cooperative architecture for a group of autonomous micro mobile robots, applied to fault clearing. Simulation results demonstrate the effectiveness of this architecture.

15.
Reinforcement learning is an area of machine learning that does not require detailed teaching signals from a human, and it is expected to be applied to real robots. For application to real robots, the learning process must finish within a short learning period. Model-free reinforcement learning methods converge quickly on tasks such as Sutton's maze problem, where the aim is to reach the target state in minimum time; however, they have difficulty learning tasks whose goal is to keep a stable state for as long as possible. In this study, we improve the reward allocation method for stabilizing control tasks, using the semi-Markov decision process as the environment model. The validity of our method is demonstrated through simulation of the stabilizing control of an inverted pendulum.

16.
《Advanced Robotics》2013,27(1):21-39
This paper explores a fail-safe design for multiple space robots, which enables robots to complete given tasks even when they can no longer be controlled due to a communication accident or negotiation problem. As the first step towards this goal, we propose new reinforcement learning methods that help robots avoid deadlock situations in addition to improving the degree of task completion without communications via ground stations or negotiations with other robots. Through intensive simulations on a truss construction task, we found that our reinforcement learning methods have great potential to contribute towards fail-safe design for multiple space robots in the above case. Furthermore, the simulations revealed the following detailed implications: (i) the first several planned behaviors must not be reinforced with negative rewards even in deadlock situations in order to derive cooperation among multiple robots, (ii) a certain amount of positive rewards added into negative rewards in deadlock situations contributes to reducing the computational cost of finding behavior plans for task completion, and (iii) an appropriate balance between positive and negative rewards in deadlock situations is indispensable for finding good behavior plans at a small computational cost.

17.
A Self-Learning Coordination System for Multiple Mobile Robots Based on Fuzzy Neural Networks
许海平  孙茂相  尹朝万 《机器人》1999,21(4):260-265
This paper studies the motion planning problem of multiple mobile robots. To cope with unknown or imprecise robot models and dynamic changes in the environment, a self-learning fuzzy logic controller (FLC) is proposed for accurate velocity tracking. The FLC is first constructed through neural-network training, and a reinforcement learning algorithm then adjusts the FLC output on-line to correct the robots' motion states and achieve safe, coordinated collision avoidance.
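A minimal sketch of the fuzzy-controller idea in this abstract, assuming triangular membership functions over the velocity error and a reinforcement-adjusted output gain; the rule base and every number are invented for illustration, not taken from the paper.

```python
# Toy fuzzy speed controller: triangular memberships map the velocity error
# to a correction via centre-of-gravity defuzzification; an online reward
# signal scales the output gain. Rules and numbers are assumptions.

def tri(x, a, b, c):
    """Triangular membership function rising from a, peaking at b, falling to c."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def fuzzy_speed_correction(error, gain=1.0):
    """Weighted average of rule outputs (centre-of-gravity defuzzification)."""
    rules = [  # (membership of the error, rule output)
        (tri(error, -2.0, -1.0, 0.0), -0.5),   # error negative -> slow down
        (tri(error, -1.0,  0.0, 1.0),  0.0),   # error near zero -> hold speed
        (tri(error,  0.0,  1.0, 2.0),  0.5),   # error positive -> speed up
    ]
    num = sum(m * out for m, out in rules)
    den = sum(m for m, _ in rules) or 1.0
    return gain * num / den

# The reinforcement signal nudges the gain online, as in the abstract.
gain = 1.0 + 0.1 * 0.5   # e.g. reward 0.5 with learning rate 0.1
correction = fuzzy_speed_correction(0.5, gain)
```

In the paper the FLC's parameters come from neural-network training rather than hand-written tables like this one; the sketch only shows the shape of the inference and the online adjustment loop.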

18.
In this paper we present a method for two robot manipulators to learn cooperative tasks. If a single robot is unable to grasp an object in a certain orientation, it can only continue with the help of other robots. The grasping can be realized by a sequence of cooperative operations that re-orient the object. Several sequences are needed to handle the different situations in which an object is not graspable for the robot. It is shown that a distributed learning method based on a Markov decision process is able to learn the sequences for the involved robots: a master robot that needs to grasp and a helping robot that supports it with the re-orientation. A novel state-action graph is used to store the reinforcement values of the learning process. Furthermore, an example of aggregate assembly shows the generality of this approach.

19.
Progress in Coordination Technology for Heterogeneous Multiple Mobile Robots
As the fields and scope of mobile-robot applications keep expanding, multiple mobile robots have attracted ever more attention because of advantages that no single robot can match. This paper surveys multi-mobile-robot coordination technology across several important topics: architecture, cooperation and coordination, cooperative environment perception and localization, reconfiguration, and machine learning, with particular emphasis on how various techniques handle and accommodate heterogeneity within a team. Difficult research problems in this area are analysed, and finally the prospects and development trends of heterogeneous multi-mobile-robot research are discussed.

20.
任燚  陈宗海 《控制与决策》2006,21(4):430-434
In multi-robot systems, as the number of robots increases, conflicts in the system grow exponentially and deadlock may even occur. This paper proposes a reinforcement learning algorithm based on process rewards and prioritized sweeping as a conflict-resolution strategy for multi-robot systems. For a typical multi-robot recognizable group foraging task, computer simulations were carried out with the number of collected targets as the system performance index and the number of learning episodes until convergence as the learning-speed index, and the proposed algorithm was compared with nine others, including algorithms based on global rewards and Q-learning. The results show that the proposed algorithm significantly reduces conflicts, avoids deadlock and improves overall system performance.
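Prioritized sweeping, the planning component named in this abstract, can be sketched on a toy chain of states; the transitions, rewards, threshold and the `predecessors` table below are illustrative assumptions. The variant here applies updates to the predecessors of each popped state.

```python
import heapq

# Prioritized sweeping sketch: states are queued by the magnitude of their
# TD error, so value changes propagate backwards through the model first
# where they matter most. All data and parameters are assumptions.

def prioritized_sweeping(q, model, predecessors, actions,
                         start, alpha=0.5, gamma=0.9, theta=1e-3, n_sweeps=50):
    def best(s):
        return max(q.get((s, a), 0.0) for a in actions)
    pq = [(-1.0, start)]                       # states queued by |TD error|
    for _ in range(n_sweeps):
        if not pq:
            break
        _, s = heapq.heappop(pq)
        for (sp, a) in predecessors.get(s, []):
            r, s2 = model[(sp, a)]
            delta = r + gamma * best(s2) - q.get((sp, a), 0.0)
            q[(sp, a)] = q.get((sp, a), 0.0) + alpha * delta
            if abs(delta) > theta:             # big change: requeue its sources
                heapq.heappush(pq, (-abs(delta), sp))
    return q

# Toy chain: s0 --a--> s1 --a--> goal (reward 1 on the final step).
model = {("s0", "a"): (0.0, "s1"), ("s1", "a"): (1.0, "goal")}
preds = {"goal": [("s1", "a")], "s1": [("s0", "a")]}
q = prioritized_sweeping({}, model, preds, actions=["a"], start="goal")
```

The backward sweep is what lets a single discovered reward update many upstream states in one planning pass, which is how the paper's algorithm speeds convergence relative to plain Q-learning.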

