Similar literature (20 results)
1.
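于建均, 姚红柯, 左国玉, 阮晓钢, 安硕 《智能系统学报》2019,14(5):1026-1034
To address the failure to converge to the target point and the poor generalization of motion imitation in current robot imitation learning, an imitation learning method based on a dynamical system (DS) is introduced. The method models the demonstrated motion data as a nonlinear dynamical system using a Gaussian mixture model (GMM); it takes a sufficient condition for global stability of the DS as a constraint, ensuring that all trajectories generated by the DS converge to the target point; and it converts the problem of learning the DS model parameters into a constrained optimization problem, whose solution gives the model parameters. Simulation and robot experiments were conducted on a 7bot robotic arm. The results show that all trajectories generated by the learned DS model from different starting points converge to the target point, that the trajectories are smooth, and that the model generalizes well.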

2.
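刘环, 钱堃, 桂博兴, 马旭东 《机器人》2019,41(5):574-582
To address task generalization and motion-trajectory generalization in robot learning from demonstration, a method is proposed that combines task-parameterized learning of multi-demonstration motion trajectories with action sequence reasoning. For multi-demonstration trajectory samples of general-purpose action primitives, dynamic movement primitives are used to encode the trajectories and build a task-parameterized model, and Gaussian process regression is used to learn the mapping between external parameters and model parameters. For a new task instance, the Planning Domain Definition Language is used to infer the missing action sequence; the task-parameterized model generalizes the target trajectory of each action from the new external parameters and corrects the trajectory error. Experiments on a UR5 robot show that, across different task instances and environmental changes, the method flexibly generates action sequences and adjusts the generalization target. The multi-demonstration task-parameterized model generalizes smooth target trajectories for given external parameters, outperforms generalization from a single demonstration trajectory, and improves the robot's task generalization ability.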

3.
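Dual-arm robot systems are a current research hotspot in robotics. In particular, as the limitations of single-arm robots in manipulation capability, control, and other respects have become increasingly apparent, recent research has focused on redundant dual-arm robots capable of coordinated manipulation. Dual-arm manipulation is first classified and then analyzed from five aspects: dual-arm coordinated motion modes, dual-arm coordinated control problems, perception sensors, imitation learning, and human-robot interaction. Starting from the current state of kinematics and dynamics, the survey analyzes dual-arm coordinated control and single-arm control...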

4.
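A new artificial-life animation method, imitation learning, is proposed. Imitation is a highly effective way of acquiring motor skills. A motor skill is the set of countless related motion sequences. By imitating representative motion sequences and generalizing the local motor skills they contain, a complete motor skill can be obtained. Imitation learning is built around motion similarity matching and a simple-to-complex behavior methodology, with evolutionary computation as the optimization method. It reduces the dependence of evolutionary computation on traditional evaluation functions, shortens the time spent designing evaluation functions, and improves the ability to optimize complex objectives, thereby improving production efficiency. Using the PhysX simulation platform, the paper validates the method on the landing behavior of an artificial cat, with good results.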

5.
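Learning ability is a hallmark of the intelligence of higher animals. To clarify the mechanism by which quadrupeds learn motor skills, this paper studies the gait learning task for quadruped robots and reproduces the rhythmic gait learning process of quadrupeds. In recent years, the proximal policy optimization (PPO) algorithm, a typical representative of deep reinforcement learning, has been widely used for quadruped gait learning; it performs well in experiments and needs few hyperparameters. However, in settings with multi-dimensional inputs and outputs it tends to converge to a local optimum, which shows up as disordered gait rhythm signals and severe oscillation of the center of gravity. To solve this problem, and drawing on meta-learning's strength in capturing high-dimensional abstract representations of the learning process, this paper proposes a meta proximal policy optimization (MPPO) algorithm that combines meta-learning with PPO and lets a quadruped robot evolve toward a better gait. Simulation experiments on the PyBullet platform show that the proposed algorithm enables the quadruped robot to learn to walk, and comparison experiments against soft actor-critic (SAC) and PPO show that MPPO yields more regular gait rhythm signals and faster walking speed.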

6.
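Application of Iterative Learning Neural Network Control to Robot Learning from Demonstration   Cited by: 3 (self-citations: 0, citations by others: 3)
Learning from demonstration is an efficient means for robots to acquire motor skills. When a camera is used to record the demonstrated trajectory, learning from demonstration involves the problem of obtaining the unknown robot-camera model through repeated trials. This paper addresses the learning of repeatable uncertainties in the repetitive operation of nonlinear systems and proposes an iterative learning neural network control scheme; the controller guarantees that the system's maximum tracking error stays within the effective approximation region of the neural network. To this end, a distributed neural network structure suited to repetitive operation is proposed. The network consists of a series of local neural networks distributed along the desired trajectory; each local network approximates the neighborhood of its corresponding desired trajectory point and is trained through repeated operation. Because the local networks are mutually independent, a full trajectory can be learned by segment-wise training, from the initial segment to the final segment, achieving accurate tracking of the desired trajectory segment by segment. The method is validated in trajectory demonstration imitation with an unknown robot-camera model, showing that it is an efficient training method with uniform error-bounding capability.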

7.
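马旭淼, 徐德 《控制与决策》2024,39(5):1409-1423
Robot application scenarios are constantly being renewed, and data volumes are growing by the day. Traditional machine learning methods struggle to adapt to dynamic environments, whereas incremental learning can mimic the human learning process, allowing a robot to use old knowledge to speed up learning of new tasks and to learn new skills without forgetting old ones. Research on robot incremental learning is still scarce, so this paper reviews progress in the area. First, incremental learning is briefly introduced. Second, from the perspective of parameters and models, current mainstream robot incremental learning methods are divided into three classes: variable-parameter methods, variable-model methods, and hybrid methods; each class is discussed, with application examples of the corresponding techniques in robotics. Then, commonly used datasets and evaluation metrics for robot incremental learning are introduced. Finally, future trends in incremental learning are discussed.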

8.
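Singularity-Free Indirect Iterative Learning Control and Its Application to Robot Motion Imitation   Cited by: 4 (self-citations: 0, citations by others: 4)
For the finite-time trajectory tracking problem of a fairly broad class of nonlinear systems, an indirect iterative learning scheme is proposed. A least-squares algorithm identifies a linearized model of the nonlinear system from the history of repeated tracking. A segment-wise learning scheme ensures that learning control always takes place within the region where the linear approximation is valid. The paper examines how to avoid control singularity during learning and proposes an efficient parameter correction method that guarantees the determinant of the estimated input coupling matrix is nonzero. The control scheme is applied to robot motion imitation with unknown robot and camera models, without encountering any singularity problem. This constitutes a new robot programming method that uses a camera in place of traditional program writing.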

9.
《机器人》2014,(3)
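A new control graph model is proposed for the representation and execution of robot imitation learning based on non-contact observation. A human-robot relationship that permits imitation learning is established, and it is shown that the precondition for imitation learning is to take the differential motion of the system's end-effector as the basic behavior element. The structure of the control graph model and a model learning method based on visual observation sequences are proposed. A method for segmenting observation sequences and generating the graph structure based on cumulative and instantaneous correlation functions is proposed, together with a method for learning behavior-element targets based on an RBF (radial basis function) network. Imitation learning experiments on brush painting and object grasping with robots of different structures and degrees of freedom demonstrate that, under visual observation, the proposed model can represent and execute behaviors of different levels and types, and has good generalization ability, generality, and practicality.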

10.
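To address the heavy operator workload and complex operation involved in real-time control of flapping-wing flying robots, and to realize their distributed intelligent control, a behavior planning method for flapping-wing flying robots based on cluster analysis and a motion description language is proposed. The results of cluster analysis of the robot's flight data are used to classify its motion behaviors reasonably. While preserving the primitive relations of the motion description language, the behavior features of the flapping-wing flying robot are extracted, and four classes of motion primitives are defined for its pole-circling task. An experimental system is built from a flapping-wing flying robot and an onboard gyroscope. Physical and simulation experiments are carried out with a direct control method and with the motion-description-language-based behavior planning method, and the results verify the feasibility and effectiveness of the proposed method.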

11.
In recent years, research on movement primitives has gained increasing popularity. The original goals of movement primitives are based on the desire to have a sufficiently rich and abstract representation for movement generation, which allows for efficient teaching, trial-and-error learning, and generalization of motor skills (Schaal 1999). Thus, motor skills in robots should be acquired in a natural dialog with humans, e.g., by imitation learning and shaping, while skill refinement and generalization should be accomplished autonomously by the robot. Such a scenario resembles the way we teach children and connects to the bigger question of how the human brain accomplishes skill learning. In this paper, we review how a particular computational approach to movement primitives, called dynamic movement primitives, can contribute to learning motor skills. We will address imitation learning, generalization, trial-and-error learning by reinforcement learning, movement recognition, and control based on movement primitives. But we also want to go beyond the standard goals of movement primitives. The stereotypical movement generation with movement primitives entails predicting sensory events in the environment. Indeed, all the sensory events associated with a movement primitive form an associative skill memory that has the potential of forming a most powerful representation of a complete motor skill.

12.
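Imitation learning with symbolic representation is a convenient and feasible way for coexisting-cooperative-cognitive robots to become more intelligent, and it offers a practical solution to learning complex, multi-step tasks. Automatically segmenting demonstrated trajectories and extracting their basic actions is a prerequisite for applying this kind of learning successfully. Accordingly, this paper first introduces symbolically represented imitation learning and analyzes the specific requirements it places on automatic segmentation methods. It then divides the methods into two broad classes according to whether prior knowledge of the demonstrated task is available, and describes the typical segmentation methods in each class in detail. Finally, it compares and summarizes these trajectory segmentation methods and discusses future trends in automatic segmentation of demonstrated trajectories.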

13.
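Imitation learning has long been a research hotspot in artificial intelligence. It is a method for reconstructing a desired policy from expert demonstrations. In recent years, theoretical work combining it with methods such as reinforcement learning has produced important results; in practical applications, especially in complex environments for robots and other agents, imitation learning has performed well. This paper focuses on the research and application of imitation learning in robotics. It introduces the theoretical background of imitation learning; examines its two main classes of methods, behavior cloning and inverse reinforcement learning; summarizes successful applications of imitation learning; and finally sets out current problems and challenges and looks ahead to future trends.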

14.
An interactive loop between motion recognition and motion generation is a fundamental mechanism for humans and humanoid robots. We have been developing an intelligent framework for motion recognition and generation based on symbolizing motion primitives. The motion primitives are encoded into Hidden Markov Models (HMMs), which we call “motion symbols”. However, to determine the motion primitives to use as training data for the HMMs, this framework requires a manual segmentation of human motions. Essentially, a humanoid robot is expected to participate in daily life and must learn many motion symbols to adapt to various situations. For this use, manual segmentation is cumbersome and impractical for humanoid robots. In this study, we propose a novel approach to segmentation, the Real-time Unsupervised Segmentation (RUS) method, which comprises three phases. In the first phase, short human movements are encoded into feature HMMs. Seamless human motion can be converted to a sequence of these feature HMMs. In the second phase, the causality between the feature HMMs is extracted. The causality data make it possible to predict movement from observation. In the third phase, movements having a large prediction uncertainty are designated as the boundaries of motion primitives. In this way, human whole-body motion can be segmented into a sequence of motion primitives. This paper also describes an application of RUS to AUtonomous Symbolization of motion primitives (AUS). Each derived motion primitive is classified into an HMM for a motion symbol, and parameters of the HMMs are optimized by using the motion primitives as training data in competitive learning. The HMMs are gradually optimized in such a way that the HMMs can abstract similar motion primitives. We tested the RUS and AUS frameworks on captured human whole-body motions and demonstrated the validity of the proposed framework.

15.
Automated Derivation of Primitives for Movement Classification   Cited by: 6 (self-citations: 0, citations by others: 6)
We describe a new method for representing human movement compactly, in terms of a linear superposition of simpler movements termed primitives. This method is a part of a larger research project aimed at modeling motor control and imitation using the notion of perceptuo-motor primitives, a basis set of coupled perceptual and motor routines. In our model, the perceptual system is biased by the set of motor behaviors the agent can execute. Thus, an agent can automatically classify observed movements into its executable repertoire. In this paper, we describe a method for automatically deriving a set of primitives directly from human movement data. We used movement data gathered from a psychophysical experiment on human imitation to derive the primitives. The data were first filtered, then segmented, and principal component analysis was applied to the segments. The eigenvectors corresponding to a few of the highest eigenvalues provide us with a basis set of primitives. These are used, through superposition and sequencing, to reconstruct the training movements as well as novel ones. The validation of the method was performed on a humanoid simulation with physical dynamics. The effectiveness of the motion reconstruction was measured through an error metric. We also explored and evaluated a technique of clustering in the space of primitives for generating controllers for executing frequently used movements.
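
The PCA step described in this abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: it assumes the filtering and segmentation have already produced equal-length segments, and the names `derive_primitives`, `reconstruct`, and `reconstruction_error`, as well as the RMS error metric, are hypothetical choices made for illustration.

```python
import numpy as np

def derive_primitives(segments, n_primitives):
    """Return (mean, basis); rows of basis are the top principal components.

    segments: array of shape (n_segments, segment_length), assumed to be
    already filtered and cut to equal length (an assumption of this sketch).
    """
    X = np.asarray(segments, dtype=float)
    mean = X.mean(axis=0)
    Xc = X - mean                                  # center the data
    cov = Xc.T @ Xc / (len(X) - 1)                 # sample covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)         # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_primitives]
    basis = eigvecs[:, order].T                    # eigenvectors of the largest eigenvalues
    return mean, basis

def reconstruct(segment, mean, basis):
    """Rebuild a segment as the mean plus a weighted sum (superposition) of primitives."""
    coeffs = basis @ (np.asarray(segment, dtype=float) - mean)
    return mean + coeffs @ basis

def reconstruction_error(segment, mean, basis):
    """Root-mean-square error between a segment and its reconstruction."""
    diff = np.asarray(segment, dtype=float) - reconstruct(segment, mean, basis)
    return float(np.sqrt(np.mean(diff ** 2)))
```

Keeping more primitives lowers the reconstruction error at the cost of a less compact representation; how many to keep is a design choice the abstract describes only as "a few".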

16.
Articulated movements are fundamental in many human and robotic tasks. While humans can learn and generalise arbitrarily long sequences of movements, and particularly can optimise them to fit the constraints and features of their body, robots are often programmed to execute point-to-point precise but fixed patterns. This study proposes a new approach to interpreting and reproducing articulated and complex trajectories as a set of known robot-based primitives. Instead of achieving accurate reproductions, the proposed approach aims at interpreting data in an agent-centred fashion, according to an agent's primitive movements. The method improves the accuracy of a reproduction with an incremental process that seeks first a rough approximation by capturing the most essential features of a demonstrated trajectory. Observing the discrepancy between the demonstrated and reproduced trajectories, the process then proceeds with incremental decompositions and new searches in sub-optimal parts of the trajectory. The aim is to achieve an agent-centred interpretation and progressive learning that fits in the first place the robot's capability, as opposed to a data-centred decomposition analysis. Tests on both geometric and human generated trajectories reveal that the use of own primitives results in remarkable robustness and generalisation properties of the method. In particular, because trajectories are understood and abstracted by means of agent-optimised primitives, the method has two main features: 1) Reproduced trajectories are general and represent an abstraction of the data. 2) The algorithm is capable of reconstructing highly noisy or corrupted data without pre-processing thanks to an implicit and emergent noise suppression and feature detection. This study suggests a novel bio-inspired approach to interpreting, learning and reproducing articulated movements and trajectories. Possible applications include drawing, writing, movement generation, object manipulation, and other tasks where the performance requires human-like interpretation and generalisation capabilities.

17.
It is assumed that future robots must coexist with human beings and behave as their companions. Consequently, the complexities of their tasks would increase. To cope with these complexities, scientists are inclined to adopt the anatomical functions of the brain for mapping and navigation in robotics. While acknowledging the ongoing work on improving the brain models and the cognitive mapping for robots’ navigation, we show, in this paper, that learning by imitation leads to a positive effect not only in human behavior but also in the behavior of a multi-robot system. We present the interest of a low-level imitation strategy at individual and social levels in the case of robots. Particularly, we show that adding a simple imitation capability to the brain model for building a cognitive map improves the ability of individual cognitive map building and boosts sharing information in an unknown environment. Taking into account the notion of imitative behavior, we also show that the individual discoveries (i.e. goals) could have an effect at the social level and therefore induce the learning of new behaviors at the individual level. To analyze and validate our hypothesis, a series of experiments has been performed with and without a low-level imitation strategy in the multi-robot system.

18.
Many motor skills in humanoid robotics can be learned using parametrized motor primitives. While successful applications to date have been achieved with imitation learning, most of the interesting motor learning problems are high-dimensional reinforcement learning problems. These problems are often beyond the reach of current reinforcement learning methods. In this paper, we study parametrized policy search methods and apply these to benchmark problems of motor primitive learning in robotics. We show that many well-known parametrized policy search methods can be derived from a general, common framework. This framework yields both policy gradient methods and expectation-maximization (EM) inspired algorithms. We introduce a novel EM-inspired algorithm for policy learning that is particularly well-suited for dynamical system motor primitives. We compare this algorithm, both in simulation and on a real robot, to several well-known parametrized policy search methods such as episodic REINFORCE, "Vanilla" Policy Gradients with optimal baselines, episodic Natural Actor Critic, and episodic Reward-Weighted Regression. We show that the proposed method outperforms them on an empirical benchmark of learning dynamical system motor primitives both in simulation and on a real robot. We apply it in the context of motor learning and show that it can learn a complex Ball-in-a-Cup task on a real Barrett WAM robot arm.

19.
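Imitation learning combines reinforcement learning and supervised learning; its goal is to learn the expert's policy by observing expert demonstrations, thereby accelerating reinforcement learning. By introducing additional task-related information, imitation learning can optimize a policy faster than reinforcement learning, offering a way to ease the problem of low sample efficiency. Imitation learning has become a popular framework for solving reinforcement learning problems, and a variety of algorithms and techniques that improve learning performance have emerged. Combined with recent advances in graphics and imaging, imitation learning has played an important role in game artificial intelligence (AI), robot control, autonomous driving, and other fields. This paper surveys the year's developments in imitation learning, examining behavior cloning, inverse reinforcement learning, adversarial imitation learning, imitation learning from observation, and cross-domain imitation learning; it presents the latest practical applications of imitation learning, compares the state of research in China and abroad, and looks ahead to future directions. It aims to give researchers and practitioners an up-to-date account of imitation learning as a reference for their work.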

20.
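A Survey of Imitation Learning Based on Generative Adversarial Networks   Cited by: 1 (self-citations: 0, citations by others: 1)
Imitation learning studies how to learn from an expert's decision data to obtain a decision model that approaches expert level. Reinforcement learning, which also learns how to make decisions, typically learns only from evaluative feedback from the environment; by comparison, imitation learning obtains more direct feedback from decision data. It can be divided into two classes of methods: behavior cloning and imitation learning based on inverse reinforcement learning. The latter decomposes imitation learning into two sub-processes, inverse reinforcement learning and reinforcement learning, and iterates between them. Inverse reinforcement learning derives a reward function consistent with the expert's decision data, and reinforcement learning learns a policy based on that reward function. Imitation learning methods based on generative adversarial networks grew out of imitation learning based on inverse reinforcement learning; the earliest and most representative is Generative Adversarial Imitation Learning (GAIL). A generative adversarial network consists of two competing neural networks, a discriminator and a generator. GAIL solves the imitation learning problem within the generative adversarial network framework: training the discriminator is analogous to learning the reward function, and training the generator is analogous to learning the policy. Compared with traditional imitation learning methods, GAIL offers better robustness, representational power, and computational efficiency, so it can handle complex large-scale problems and extend to practical applications. However, GAIL suffers from problems such as mode collapse and inefficient use of environment interaction samples. Recent work has applied generative adversarial network techniques and reinforcement learning techniques, respectively, to address these problems, and has extended GAIL in directions such as observation mechanisms and multi-agent systems. This paper first introduces the main idea of GAIL and its strengths and weaknesses, then classifies, analyzes, and compares improved GAIL algorithms, and finally summarizes the paper and discusses possible future trends.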
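
The analogy between discriminator training and reward learning can be made concrete with a small sketch. It is a toy illustration under stated assumptions, not the GAIL algorithm as published: the discriminator here is a plain logistic regression rather than a neural network, the function names are hypothetical, and the sign convention (discriminator output near 1 for expert pairs, reward `-log(1 - D)`) is one of several in use. The policy (generator) update, which a full method would alternate with this step using a reinforcement learning algorithm, is omitted.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_discriminator(expert_sa, policy_sa, steps=500, lr=0.1):
    """Fit a logistic discriminator D(s, a): ~1 for expert pairs, ~0 for policy pairs.

    expert_sa, policy_sa: arrays of shape (n, d) holding concatenated
    state-action features (an assumed input format for this sketch).
    """
    X = np.vstack([expert_sa, policy_sa])
    X = np.hstack([X, np.ones((len(X), 1))])       # append a bias feature
    y = np.concatenate([np.ones(len(expert_sa)), np.zeros(len(policy_sa))])
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        p = sigmoid(X @ w)
        w -= lr * X.T @ (p - y) / len(y)           # gradient step on the cross-entropy loss
    return w

def surrogate_reward(w, sa):
    """Reward handed to the policy learner: high where D believes the pair is expert-like."""
    X = np.hstack([sa, np.ones((len(sa), 1))])
    d = sigmoid(X @ w)
    return -np.log(1.0 - d + 1e-8)                 # small constant avoids log(0)
```

In this sketch the learned discriminator plays the role of the reward function: state-action pairs that resemble the expert's receive a higher surrogate reward, which a reinforcement learning step would then try to maximize.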
