1.

李智也《计算机工程》2006,32(1):189-192

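To address obstacle discovery and learning, and path planning, for a robot in an unknown environment, a new solution is proposed: the robot additionally extracts environment information and builds a map model. By testing whether the line segment connecting the current position to the goal position conflicts with the obstacles in the map model built so far, the robot's behavior is divided into two classes: walking along an obstacle boundary, or moving toward the goal. A new method for computing and selecting the direction of motion when the robot faces an obstacle is also proposed, together with a simplified method for learning obstacle information and building the map model.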
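The segment-versus-obstacle test described in this abstract can be illustrated with a minimal sketch. It assumes that known obstacles are stored as polygons (lists of vertices) and uses a standard orientation-based segment intersection test; the names `choose_behavior`, `segments_intersect`, and the polygon representation are hypothetical and are not taken from the paper.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def _orient(a: Point, b: Point, c: Point) -> float:
    """Twice the signed area of triangle abc (positive if counter-clockwise)."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def _in_box(a: Point, b: Point, p: Point) -> bool:
    """True if p lies inside the bounding box of segment ab."""
    return (min(a[0], b[0]) <= p[0] <= max(a[0], b[0])
            and min(a[1], b[1]) <= p[1] <= max(a[1], b[1]))


def segments_intersect(p1: Point, p2: Point, q1: Point, q2: Point) -> bool:
    """Standard orientation test for segments p1p2 and q1q2."""
    d1, d2 = _orient(q1, q2, p1), _orient(q1, q2, p2)
    d3, d4 = _orient(p1, p2, q1), _orient(p1, p2, q2)
    if d1 * d2 < 0 and d3 * d4 < 0:
        return True  # proper crossing
    # Collinear or touching cases
    if d1 == 0 and _in_box(q1, q2, p1):
        return True
    if d2 == 0 and _in_box(q1, q2, p2):
        return True
    if d3 == 0 and _in_box(p1, p2, q1):
        return True
    if d4 == 0 and _in_box(p1, p2, q2):
        return True
    return False


def choose_behavior(position: Point, goal: Point,
                    obstacles: List[List[Point]]) -> str:
    """Return 'follow_boundary' if the position-goal segment hits any edge
    of a known obstacle polygon, otherwise 'move_to_goal'."""
    for polygon in obstacles:
        for i in range(len(polygon)):
            edge_start = polygon[i]
            edge_end = polygon[(i + 1) % len(polygon)]
            if segments_intersect(position, goal, edge_start, edge_end):
                return "follow_boundary"
    return "move_to_goal"
```

In this sketch the map model is simply the list of polygons learned so far; how the robot picks a direction along the boundary is left out because the abstract does not specify it.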

2.

Automatic trajectory control for cooperative operation of multiple handling robots in an unmanned warehouse

李诣坤韦思亮《计算机测量与控制》2023,31(2):115-121

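Because the cooperative operation routes of multiple handling robots in an unmanned warehouse are complex, trajectory control of the cooperative operation becomes harder. To ensure that the robots carry out handling tasks along their planned routes, an automatic trajectory control method for cooperative operation of multiple handling robots in an unmanned warehouse is proposed. A grid-map modeling method, combined with the actual distribution of shelves in the warehouse, is used to build the warehouse environment scene. A mathematical model of the handling robot is built from three aspects: structure, kinematics, and dynamics. Handling tasks are assigned to the robots on a nearest-first principle, the cooperative trajectories are planned, control quantities are computed from automatically detected real-time robot poses, and an automatic trajectory controller is installed and run to complete the control task. Experimental results show that, with this method, the robots' trajectories in the warehouse are essentially the same as the planned trajectories; the computed average position control error and attitude-angle control error are 2.27 cm and 0.05°, respectively, and the number of collisions is kept within the specified range, so the method performs well in practice.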
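As a rough illustration of nearest-first task assignment on a grid map, the sketch below pairs robots and tasks greedily by Manhattan distance between grid cells. The greedy pairing rule, the distance metric, and the names `assign_nearest` and `manhattan` are assumptions made for the example; the abstract states only that tasks are assigned on a nearest-first principle.

```python
from typing import Dict, Tuple

Cell = Tuple[int, int]  # (row, column) of a grid cell


def manhattan(a: Cell, b: Cell) -> int:
    """Grid distance ignoring shelves (an assumed simplification)."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def assign_nearest(robots: Dict[str, Cell], tasks: Dict[str, Cell]) -> Dict[str, str]:
    """Greedily pair the closest unassigned robot and task until one side runs out."""
    # Sort all (distance, robot, task) triples; names break ties deterministically.
    pairs = sorted(
        (manhattan(robot_cell, task_cell), robot, task)
        for robot, robot_cell in robots.items()
        for task, task_cell in tasks.items()
    )
    assignment: Dict[str, str] = {}
    taken_tasks = set()
    for _, robot, task in pairs:
        if robot not in assignment and task not in taken_tasks:
            assignment[robot] = task
            taken_tasks.add(task)
    return assignment
```

For example, with robots at (0, 0) and (5, 5) and tasks at (4, 5) and (0, 2), each robot receives the task one or two cells away rather than the one across the grid.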

3.

Research on a dynamic path planning method for soccer robots

刘雪飘李孝安《计算机测量与控制》2006,14(8):1103-1105

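Given the highly real-time and uncertain nature of robot soccer systems, a path planning method based on statistical prediction is proposed. The method accounts for uncertainty in the speed and direction of obstacles and models obstacle motion statistically. While moving, the robot uses the environment information it obtains to set up a prediction window and an obstacle-avoidance window within its field of view. In the prediction window, the robot builds a predicted region for each obstacle from the obstacle information; in the avoidance window, the robot plans its path with either the tangent method or the rolling-window method, depending on its own position and the obstacles' predicted regions. This is a local path planning method: the robot must keep updating environment information as it moves in order to avoid obstacles.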

4.

Design of a reactive navigation control system for a dual-arm handling robot

孙辉高剑潘之腾李建梅臧汝静《计算机测量与控制》2022,30(12):149-153

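To strengthen the obstacle-avoidance ability of a dual-arm handling robot while it travels during a task, and to keep its motion under continuous and effective control, a reactive navigation control system for the dual-arm handling robot is designed. Based on how the microcontroller connects to the motor circuit, a suitable ARM microprocessor and PIC microcontroller structure are selected; these are combined with the HN-9 mobile platform, an intelligent navigation platform, and the ROS operating platform to improve the reactive navigation submodule, completing the hardware design of the control system. The absolute and relative pose vectors are computed and used as independent-variable coefficients to determine the velocity Jacobian index, from which a recursive dynamics expression is derived to achieve coordinated control of the robot; together with the hardware, this completes the design of the reactive navigation control system. Comparative experiments show that the system lets the robot avoid obstacles in its path accurately without impairing its ability to complete the task, meeting the practical requirement of continuous and effective control of the robot's handling behavior.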

5.

A structure-interaction-driven deep reinforcement learning control method for robots

余超董银昭郭宪冯旸赫卓汉逵张强《软件学报》2023,34(4):1749-1764

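To address the low training efficiency and uninterpretable policies of deep reinforcement learning in high-dimensional robot behavior control, a structure-motivated interactive deep reinforcement learning method (SMILE) is proposed. First, structural decomposition converts the high-dimensional single-robot control problem into a low-dimensional problem of cooperative learning among multiple joint controllers, easing the curse of dimensionality in continuous motion control. Second, two coordination graph models (ATTENTION and PODT) dynamically infer the relationships between controllers, enabling information exchange and cooperative learning among the robot's internal joints. Finally, to balance the computational complexity and information redundancy of the ATTENTION and PODT coordination graph models, two further coordination-graph update methods, APDODT and PATTENTION, are proposed, which adaptively adjust the long-term and short-term relationships between controllers. Experimental results show that the structure-driven robot reinforcement learning method significantly improves the learning efficiency of robot control policies. In addition, relational inference and the coordination mechanism based on the coordination graph models provide a more intuitive and effective explanation of the final learned policy.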

6.

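Multi-robot path planning and collision avoidance based on two-layer fuzzy logic (total citations: 1; self-citations: 0; citations by others: 1)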

高翔苏青《计算机技术与发展》2014,(11):79-82

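For path planning of a multi-robot system without communication in an unknown dynamic environment, a multi-robot path planning and dynamic collision avoidance system based on two-layer fuzzy logic is designed. The direction fuzzy controller takes full account of obstacle distance and goal angle, converts them into the likelihood of collision between robot and obstacle, and outputs a steering angle for dynamic obstacle avoidance. The speed fuzzy controller takes obstacle distance as input and outputs a speed factor, improving the efficiency and robustness of the system. The feasibility of the system was verified on physical Pioneer3-DX robots.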

7.

Cooperative path planning for multiple underwater robots based on a master-slave structure

李东正郝燕玲张振兴《计算机仿真》2015,32(1)

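Cooperative path planning for multiple underwater robots is an important topic in their cooperative control, and a typical combinatorial optimization problem with multiple constraints. Because the many constraints make algorithms for multi-robot cooperative path planning complex, time-consuming, and hard to solve, a parallel cooperative path planning algorithm with a master-slave structure is proposed. In each generation of the evolutionary process, the lower (slave) layer applies a parallel particle swarm algorithm to generate the current best path for each robot, while the master layer applies differential evolution to give, in real time, the combination of paths with the shortest total system running time, taking into account collision avoidance between robots and obstacles and between robots. This structure distributes the constraints across layers, effectively reducing the long computation times and implementation difficulty that a single-layer structure suffers under too many constraints. Simulation results show that the algorithm generates feasible, optimized path combinations in static environments and also performs well in dynamic environments where obstacles move randomly over time, offering an efficient solution to the problem.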

8.

Dynamic obstacle tracking for mobile robots based on lidar

赵利军常清青《计算机测量与控制》2012,20(3):816-819

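Tracking moving objects in dynamic environments is one of the difficult problems in mobile robot research. This article proposes an autonomous method for detecting and tracking dynamic obstacles based on lidar. The method first clusters the environment data into separate obstacles with nearest-neighbor clustering, then associates obstacles across two consecutive frames with a nearest-neighbor feature matching algorithm. Finally, it proposes a new algorithm that classifies obstacles as dynamic or static by analyzing their spatio-temporal correlation, and estimates the position and velocity of dynamic obstacles with an α-β filter. The method was validated on a robot platform, and the experimental results show that it is effective.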
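The α-β filter named in this abstract has a compact textbook form, sketched below for one coordinate of an obstacle's position. The gain values, the fixed sampling interval `dt`, and the function name `alpha_beta_filter` are assumptions chosen for illustration; the abstract does not give the parameters the authors used.

```python
from typing import List, Tuple


def alpha_beta_filter(measurements: List[float], dt: float,
                      alpha: float = 0.85, beta: float = 0.005,
                      x0: float = 0.0, v0: float = 0.0) -> List[Tuple[float, float]]:
    """Estimate position and velocity from noisy position measurements.

    Each step predicts with a constant-velocity model, then corrects the
    position by alpha * residual and the velocity by (beta / dt) * residual.
    """
    x, v = x0, v0
    estimates = []
    for z in measurements:
        x_pred = x + v * dt           # constant-velocity prediction
        residual = z - x_pred         # innovation: measurement minus prediction
        x = x_pred + alpha * residual
        v = v + (beta / dt) * residual
        estimates.append((x, v))
    return estimates
```

For a two-dimensional obstacle track, one such filter could be run per axis; a larger `alpha` follows the measurements more closely, while a larger `beta` lets the velocity estimate react faster at the cost of more noise.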

9.

Design of a tracked handling robot based on a fuzzy obstacle-avoidance algorithm

蔡青松吴强杜康熙谢自强王肖锋《计算机测量与控制》2018,26(9):62-66

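A small tracked handling robot that carries goods and avoids obstacles automatically is designed, using an HC-SR04 ultrasonic sensor module to sense the robot's surroundings, a chain-type forklift as the actuator for carrying goods, and an STM32F103 microcontroller as the robot controller. Because multiple obstacles in a local static environment complicate obstacle avoidance, a fuzzy control algorithm is introduced: the distance between the robot and obstacles is fuzzified and fuzzy rules are established to control obstacle avoidance. To improve the robot's stability and the reliability of obstacle avoidance, a mathematical model of the DC motor is built and simulated with an integral-separated PID algorithm; the experimental results show that this algorithm improves the control performance of the DC motor.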
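Integral-separated PID, as commonly described in textbooks, applies the integral term only when the error is within a threshold, which limits overshoot caused by integral wind-up during large setpoint changes. The sketch below is one such variant, in which accumulation is also frozen while the error is large; the class name, gains, and threshold are hypothetical and do not come from the paper.

```python
class IntegralSeparatedPID:
    """PID controller whose integral action is switched off for large errors."""

    def __init__(self, kp: float, ki: float, kd: float,
                 threshold: float, dt: float) -> None:
        self.kp, self.ki, self.kd = kp, ki, kd
        self.threshold = threshold  # |error| above this disables the integral term
        self.dt = dt                # fixed sampling period in seconds
        self.integral = 0.0
        self.prev_error = 0.0

    def update(self, setpoint: float, measured: float) -> float:
        error = setpoint - measured
        use_integral = abs(error) <= self.threshold
        if use_integral:
            # Accumulate only near the setpoint (assumed variant: freeze otherwise).
            self.integral += error * self.dt
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        integral_term = self.ki * self.integral if use_integral else 0.0
        return self.kp * error + integral_term + self.kd * derivative
```

With `kp=1`, `ki=1`, `kd=0`, and a threshold of 1, a first error of 10 produces a purely proportional output, and the integral term starts contributing only once the error has dropped below the threshold.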

10.

Coordinated control of multi-target search by swarm robots based on a simplified virtual force model

《机器人》2016,(6)

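For multi-target search by swarm robots in environments with unknown convex and non-convex obstacles and dynamic obstacles, an obstacle-following and collision-avoidance method based on a simplified virtual force analysis model (SRSMT-SVF) is proposed. The multi-target search behavior of swarm robots in complex environments is decomposed, and a simplified virtual force analysis model is abstracted from it. Based on this force model, motion control strategies are designed for individual robots in the cooperative-search and roaming states, so that robots avoid collisions in real time while searching for targets. Simulation experiments on systems of different swarm sizes show that the control method lets individual robots maintain good collision avoidance throughout the search and effectively reduces collisions between the system and the environment and among individuals within the system. Compared with the extended particle swarm optimization algorithm (EPSO), the method reduces search time and system energy consumption by at least 13.78% and 11.96%, respectively, and numerical simulation results confirm its effectiveness.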

11.

A cooperative behavior learning control of multi-robot using trace information

Tomofumi Ohshita Ji-Sun Shin Michio Miyazaki Hee-Hyol Lee 《Artificial Life and Robotics》2008,13(1):144-147

The distributed autonomous robotic system is robust and adaptable to dynamic environments; however, the system requires mutually cooperative behavior to be optimal. Acquiring actions by reinforcement learning is one known approach when multiple robots cooperate on a complex task. This paper deals with the multi-robot transport problem using the Q-learning algorithm in reinforcement learning. When a robot carries luggage, we regard the robot as leaving a trace along its own path of movement; the trace is volatile, and another robot can use the trace information to help the robot that is carrying luggage. To solve these problems in multi-agent reinforcement learning, a learning control method using stress antibody allotment reward is used. Moreover, we propose the robot's trace information to encourage cooperative behavior among the robots in carrying luggage to a destination. The effectiveness of the proposed method is shown by simulation. This work was presented in part at the 13th International Symposium on Artificial Life and Robotics, Oita, Japan, January 31–February 2, 2008.
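For readers unfamiliar with the Q-learning algorithm mentioned above, the standard tabular update is Q(s, a) ← Q(s, a) + α [r + γ max Q(s', a') − Q(s, a)]. The sketch below shows only this generic update; the state and action names, the learning rate, and the discount factor are placeholders, and the paper's trace information and stress antibody allotment reward are not modeled.

```python
from collections import defaultdict
from typing import DefaultDict, Hashable, Sequence, Tuple

QTable = DefaultDict[Tuple[Hashable, Hashable], float]


def q_update(q: QTable, state: Hashable, action: Hashable, reward: float,
             next_state: Hashable, actions: Sequence[Hashable],
             alpha: float = 0.1, gamma: float = 0.9) -> float:
    """Apply one tabular Q-learning update and return the new Q(state, action)."""
    # Value of the best action available in the next state (0.0 for unseen pairs).
    best_next = max(q[(next_state, a)] for a in actions)
    td_target = reward + gamma * best_next
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


# Usage: a table that returns 0.0 for state-action pairs not yet visited.
q_table: QTable = defaultdict(float)
```

In a multi-robot setting such as the one described, each robot could keep its own table of this kind, with the state extended by whatever shared information (for example, a trace left by another robot) the designer chooses to include.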

12.

A Reinforcement Learning Algorithm in Cooperative Multi-Robot Domains

Fernando Fernández Daniel Borrajo Lynne E. Parker 《Journal of Intelligent and Robotic Systems》2005,43(2-4):161-174

Reinforcement learning has been widely applied to solve a diverse set of learning tasks, from board games to robot behaviours. In some of them, results have been very successful, but some tasks present several characteristics that make the application of reinforcement learning harder to define. One of these areas is multi-robot learning, which has two important problems. The first is credit assignment, or how to define the reinforcement signal to each robot belonging to a cooperative team depending on the results achieved by the whole team. The second one is working with large domains, where the amount of data can be large and different in each moment of a learning step. This paper studies both issues in a multi-robot environment, showing that domain knowledge and machine learning algorithms can be combined to achieve successful cooperative behaviours.

13.

Application of XCSG to multi-robot reinforcement learning

邵杰杜丽娟杨静宇《计算机科学》2013,40(8):249-251,292

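XCS classifiers have shown considerable ability in robot reinforcement learning, but in the multi-robot domain they are limited to MDP environments and can only solve learning problems with small environment spaces. XCSG is proposed to solve multi-robot reinforcement learning problems. XCSG builds a low-dimensional approximation function; a gradient descent technique uses online knowledge to build a stable approximation function, keeping the Q-table in a stable, low-dimensional state. The approximation function Q not only needs less storage, but also lets robots generalize the knowledge they have acquired online. Simulation experiments show that the XCSG algorithm handles the large learning space, slow learning, and uncertain learning outcomes of multi-robot learning well.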

14.

Analysis and solution of a predator–protector–prey multi-robot system by a high-level reinforcement learning architecture and the adaptive systems theory

José Antonio Martín H. Javier de Lope Darío Maravall 《Robotics and Autonomous Systems》2010,58(12):1266-1272

The area of competitive robotic systems usually leads to highly complicated strategies that must be achieved by complex learning architectures since analytic solutions are impractical or completely infeasible. In this work we design an experiment in order to study and validate a model about the complex phenomena of adaptation. In particular, we study a reinforcement learning problem that comprises a complex predator–protector–prey system composed of three different robots: a pure bio-mimetic reactive (in Brooks' sense, i.e. without reasoning and representation) predator-like robot, a protector-like robot with reinforcement learning capabilities and a pure bio-mimetic reactive prey-like robot. From the high-level point of view, we are interested in studying whether the Law of Adaptation is useful enough to model and explain the whole learning process occurring in this multi-robot system. From a low-level point of view, our interest is in the design of a learning system capable of solving such a complex competitive predator–protector–prey system optimally. We show how this learning problem can be addressed and solved effectively by means of a reinforcement learning setup that uses abstract actions to select a goal or target towards which a pure bio-mimetic reactive robot must navigate. The experimental results clearly show how the Law of Adaptation fits this complex learning system and that the proposed Reinforcement Learning setup is able to find an optimal policy to control the defender robot in its role of protecting the prey against the predator robot.

15.

A multi-robot reinforcement learning cooperative confrontation strategy based on an active risk defense mechanism

孙辉辉胡春鹤张军国《控制与决策》2023,38(5):1420-1429

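Deep reinforcement learning has become a research focus in the multi-robot field because of its strong performance in multi-robot systems. However, in unstructured scenarios that are continuously time-varying and carry unknown risks, traditional methods show poor risk defense and fragile system safety; unknown risks intrude nonlinearly on the robots' state space in the form of adversarial attacks. To address this, a multi-robot reinforcement learning method based on an active risk defense mechanism (APMARL) is proposed. First, based on a partially observable Markov game model, a risk discrimination mechanism with a memory pool shared among robots is established; it constructs a risk state index to predict the safety of the current behavior in advance and adaptively executes the risk-handling mode that matches the prediction. In particular, for unsafe states in which risk has intruded, an Actor-Critic active defense network architecture based on an enhanced attention mechanism is proposed, which strengthens key information by level and defends effectively against dangerous information. Finally, extensive experiments on multi-robot cooperative confrontation tasks show that the reinforcement learning strategy with active risk defense effectively reduces the risk of intrusion by hostile information, improves execution efficiency in multi-robot cooperative confrontation tasks, and makes the strategy more stable and safe.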

16.

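A new multi-agent reinforcement learning algorithm and its application to multi-robot cooperative tasks (total citations: 1; self-citations: 0; citations by others: 1)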

顾国昌仲宇张汝波《机器人》2003,25(4):344-348

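In a multi-robot system, how good one robot's behavior is often depends on the behavior of the other robots, so joint actions must be used to achieve cooperation; but reinforcement learning algorithms that use joint actions converge extremely slowly because the learning space is enormous. The new method proposed here reduces the dimensionality of the learning space by predicting the probability that each robot will execute each action, and is applied to multi-robot cooperative tasks. Experimental results show that the prediction-based accelerated reinforcement learning algorithm obtains a multi-robot cooperation policy faster than the original algorithm.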

17.

Fuzzy Policy Reinforcement Learning in Cooperative Multi-robot Systems

Dongbing Gu Erfu Yang 《Journal of Intelligent and Robotic Systems》2007,48(1):7-22

A multi-agent reinforcement learning algorithm with fuzzy policy is addressed in this paper. This algorithm is used to deal with some control problems in cooperative multi-robot systems. Specifically, a leader-follower robotic system and a flocking system are investigated. In the leader-follower robotic system, the leader robot tries to track a desired trajectory, while the follower robot tries to follow the leader to keep a formation. Two different fuzzy policies are developed for the leader and follower, respectively. In the flocking system, multiple robots adopt the same fuzzy policy to flock. Initial fuzzy policies are manually crafted for these cooperative behaviors. The proposed learning algorithm finely tunes the parameters of the fuzzy policies through the policy gradient approach to improve control performance. Our simulation results demonstrate that the control performance can be improved after the learning.

18.

Distributed Reinforcement Learning for Coordinate Multi-Robot Foraging

Hongliang Guo Yan Meng 《Journal of Intelligent and Robotic Systems》2010,60(3-4):531-551

In this paper, we propose a distributed dynamic correlation matrix based multi-Q (D-DCM-Multi-Q) learning method for multi-robot systems. First, a dynamic correlation matrix is proposed for multi-agent reinforcement learning, which not only considers each individual robot’s Q-value, but also the correlated Q-values of neighboring robots. Then, the theoretical analysis of the system convergence for this D-DCM-Multi-Q method is provided. Various simulations for multi-robot foraging as well as a proof-of-concept experiment with a physical multi-robot system have been conducted to evaluate the proposed D-DCM-Multi-Q method. The extensive simulation/experimental results show the effectiveness, robustness, and stability of the proposed method.

19.

Research progress in robot motion control based on deep reinforcement learning

董豪杨静李少波王军段仲静《控制与决策》2022,37(2):278-292

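Intelligent perception and automatic control in complex, unknown environments is currently one of the research focuses in robot control, and the new generation of artificial intelligence makes intelligent automation possible. In recent years, emerging methods that apply deep reinforcement learning to robot motion control in high-dimensional continuous state-action spaces have drawn researchers' attention. First, the rise and development of deep reinforcement learning are reviewed; the deep reinforcement learning algorithms used for robot motion control are divided into two classes, value-function-based and policy-gradient-based, and typical algorithms of each class and their characteristics are described in detail. Second, for the learning process that precedes sim-to-real transfer, five simulation platforms commonly used for deep reinforcement learning in robot motion control are briefly introduced. Then, by type of research, progress in deep-reinforcement-learning-based robot motion control is surveyed in five areas: autonomous navigation, object grasping, gait control, human-robot collaboration, and swarm cooperation. Finally, future challenges and development trends are summarized and discussed.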

20.

Learning cooperative grasping with the graph representation of a state-action space

Markus Jianwei 《Robotics and Autonomous Systems》2002,38(3-4):183-195

In this paper we present a method for two robot manipulators to learn cooperative tasks. If a single robot is unable to grasp an object in a certain orientation, it can only continue with the help of other robots. The grasping can be realized by a sequence of cooperative operations that re-orient the object. Several sequences are needed to handle the different situations in which an object is not graspable for the robot. It is shown that a distributed learning method based on a Markov decision process is able to learn the sequences for the involved robots, a master robot that needs to grasp and a helping robot that supports it with the re-orientation. A novel state-action graph is used to store the reinforcement values of the learning process. Furthermore, an example of aggregate assembly shows the generality of this approach.