期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

基于迭代学习的机械手操作空间力/位置混合控制算法 总被引：1，自引：0，他引：1

韦庆常文森张彭《自动化学报》1997,23(4):468-474

基于对常规机械手操作空间力/位置混合控制算法的简单回顾,及对该算法所遇到困难的分析,提出了一种基于迭代学习的机械手操作空间力/位置混合控制算法,来改善机械手同高刚度环境接触时,机械手力/位置混合控制的动态控制性能．给出了学习算法的收敛条件及其证明.实验表明该算法具有快速的收敛性,能达到很高的力/位置动态控制精度. 相似文献

2.

Purposive Behavior Acquisition for a Real Robot by Vision-Based Reinforcement Learning 总被引：1，自引：0，他引：1

Asada Minoru Noda Shoichi Tawaratsumida Sukoya Hosoda Koh 《Machine Learning》1996,23(2-3):279-303

This paper presents a method of vision-based reinforcement learning by which a robot learns to shoot a ball into a goal. We discuss several issues in applying the reinforcement learning method to a real robot with vision sensor by which the robot can obtain information about the changes in an environment. First, we construct a state space in terms of size, position, and orientation of a ball and a goal in an image, and an action space is designed in terms of the action commands to be sent to the left and right motors of a mobile robot. This causes a state-action deviation problem in constructing the state and action spaces that reflect the outputs from physical sensors and actuators, respectively. To deal with this issue, an action set is constructed in a way that one action consists of a series of the same action primitive which is successively executed until the current state changes. Next, to speed up the learning time, a mechanism of Learning from Easy Missions (or LEM) is implemented. LEM reduces the learning time from exponential to almost linear order in the size of the state space. The results of computer simulations and real robot experiments are given. 相似文献

3.

Reinforcement Learning and Robust Control for Robot Compliance Tasks

Cheng-Peng Kuan Kuu-young Young 《Journal of Intelligent and Robotic Systems》1998,23(2-4):165-182

The complexity in planning and control of robot compliance tasks mainly results from simultaneous control of both position and force and inevitable contact with environments. It is quite difficult to achieve accurate modeling of the interaction between the robot and the environment during contact. In addition, the interaction with the environment varies even for compliance tasks of the same kind. To deal with these phenomena, in this paper, we propose a reinforcement learning and robust control scheme for robot compliance tasks. A reinforcement learning mechanism is used to tackle variations among compliance tasks of the same kind. A robust compliance controller that guarantees system stability in the presence of modeling uncertainties and external disturbances is used to execute control commands sent from the reinforcement learning mechanism. Simulations based on deburring compliance tasks demonstrate the effectiveness of the proposed scheme. 相似文献

4.

基于迭代学习的机械手操作空间力/位置混合控制算法 总被引：4，自引：0，他引：4

韦庆常文森张彭《自动化学报》1997,(4)

基于对常规机械手操作空间力／位置混合控制算法的简单回顾，及对该算法所遇到困难的分析，提出了一种基于迭代学习的机械手操作空间力／位置混合控制算法，来改善机械手同高刚度环境接触时，机械手力／位置混合控制的动态控制性能．给出了学习算法的收敛条件及其证明．实验表明该算法具有快速的收敛性，能达到很高的力／位置动态控制精度．相似文献

5.

A neural network-based approach for an assembly cell control

Y. Touati Y. Amirat N. Saadia A. Ali-Chrif 《Applied Soft Computing》2008,8(4):1335-1343

In several robotics applications, the robot must interact with the workspace, and thus its motion is constrained by the task. In this case, pure position control will be ineffective since forces appearing during the contacts must also be controlled. However, simultaneous position and force control called hybrid control is then required. Moreover, the nonlinear plant dynamics, the complexity of the dynamic parameters determination and computation constraints makes more difficult the synthesis of control laws. In order to satisfy all these constraints, an effective hybrid force/position approach based on artificial neural networks for multi-inputs/multi-outputs systems is proposed. This approach realizes, simultaneously, an identification and control of systems, and it is implemented according to two phases: At first, a neural observer is trained off-line on the basis of the data acquired during contact motion, in order to realize a smooth transition from free to contact motion. Then, an online learning of the neural controller is implemented using neural observer parameters so that the closed-loop system maintains a good performance and compensates for uncertain/unknown dynamics of the robot and the environment. A typical example on which we shall focus is an assembly task. Experimental results on a C5 links parallel robot demonstrate that the robot's skill improves effectively and the force control performances are satisfactory, even if the dynamics of the robot and the environment change. 相似文献

6.

基于未知环境状态新定义及知识启发的机器人导航Q学习算法

童小龙姚明海张灿淋《计算机系统应用》2014,23(1):149-153

由于强大的自主学习能力, 强化学习方法逐渐成为机器人导航问题的研究热点, 但是复杂的未知环境对算法的运行效率和收敛速度提出了考验。提出一种新的机器人导航Q学习算法, 首先用三个离散的变量来定义环境状态空间, 然后分别设计了两部分奖赏函数, 结合对导航达到目标有利的知识来启发引导机器人的学习过程。实验在Simbad仿真平台上进行, 结果表明本文提出的算法很好地完成了机器人在未知环境中的导航任务, 收敛性能也有其优越性。相似文献

7.

Robot Control Optimization Using Reinforcement Learning

Kai-Tai Song Wen-Yu Sun 《Journal of Intelligent and Robotic Systems》1998,21(3):221-238

Conventional robot control schemes are basically model-based methods. However, exact modeling of robot dynamics poses considerable problems and faces various uncertainties in task execution. This paper proposes a reinforcement learning control approach for overcoming such drawbacks. An artificial neural network (ANN) serves as the learning structure, and an applied stochastic real-valued (SRV) unit as the learning method. Initially, force tracking control of a two-link robot arm is simulated to verify the control design. The simulation results confirm that even without information related to the robot dynamic model and environment states, operation rules for simultaneous controlling force and velocity are achievable by repetitive exploration. Hitherto, however, an acceptable performance has demanded many learning iterations and the learning speed proved too slow for practical applications. The approach herein, therefore, improves the tracking performance by combining a conventional controller with a reinforcement learning strategy. Experimental results demonstrate improved trajectory tracking performance of a two-link direct-drive robot manipulator using the proposed method. 相似文献

8.

改进深度强化学习的室内移动机器人路径规划

下载免费PDF全文

成怡郝密密《计算机工程与应用》2021,57(21):256-262

为了解决传统深度强化学习在室内未知环境下移动机器人路径规划中存在探索能力差和环境状态空间奖励稀疏的问题,提出了一种基于深度图像信息的改进深度强化学习算法。利用Kinect视觉传感器直接获取的深度图像信息和目标位置信息作为网络的输入,以机器人的线速度和角速度作为下一步动作指令的输出。设计了改进的奖惩函数,提高了算法的奖励值,优化了状态空间,在一定程度上缓解了奖励稀疏的问题。仿真结果表明,改进算法提高了机器人的探索能力,优化了路径轨迹,使机器人有效地避开了障碍物,规划出更短的路径,简单环境下比DQN算法的平均路径长度缩短了21.4%,复杂环境下平均路径长度缩短了11.3%。相似文献

9.

基于深度强化学习的机器人运动控制研究进展

董豪杨静李少波王军段仲静《控制与决策》2022,37(2):278-292

复杂未知环境下智能感知与自动控制是目前机器人在控制领域的研究热点之一,而新一代人工智能为其实现智能自动化赋予了可能.近年来,在高维连续状态-动作空间中,尝试运用深度强化学习进行机器人运动控制的新兴方法受到了相关研究人员的关注.首先,回顾了深度强化学习的兴起与发展,将用于机器人运动控制的深度强化学习算法分为基于值函数和策略梯度2类,并对各自典型算法及其特点进行了详细介绍;其次,针对仿真至现实之前的学习过程,简要介绍5种常用于深度强化学习的机器人运动控制仿真平台;然后,根据研究类型的不同,综述了目前基于深度强化学习的机器人运动控制方法在自主导航、物体抓取、步态控制、人机协作以及群体协同等5个方面的研究进展;最后,对其未来所面临的挑战以及发展趋势进行了总结与展望. 相似文献

10.

Data-efficient learning of robotic clothing assistance using Bayesian Gaussian process latent variable model

Nishanth Koganti Tomohiro Shibata Tomoya Tamei Kazushi Ikeda 《Advanced Robotics》2019,33(15-16):800-814

ABSTRACT

Motor-skill learning for complex robotic tasks is a challenging problem due to the high task variability. Robotic clothing assistance is one such challenging problem that can greatly improve the quality-of-life for the elderly and disabled. In this study, we propose a data-efficient representation to encode task-specific motor-skills of the robot using Bayesian nonparametric latent variable models. The effectivity of the proposed motor-skill representation is demonstrated in two ways: (1) through a real-time controller that can be used as a tool for learning from demonstration to impart novel skills to the robot and (2) by demonstrating that policy search reinforcement learning in such a task-specific latent space outperforms learning in the high-dimensional joint configuration space of the robot. We implement our proposed framework in a practical setting with a dual-arm robot performing clothing assistance tasks. 相似文献

11.

Derivative-free reinforcement learning: a review

Hong QIAN Yang YU 《Frontiers of Computer Science》2021,15(6):156336

Reinforcement learning is about learning agent models that make the best sequential decisions in unknown environments. In an unknown environment, the agent needs to explore the environment while exploiting the collected information, which usually forms a sophisticated problem to solve. Derivative-free optimization, meanwhile, is capable of solving sophisticated problems. It commonly uses a sampling-andupdating framework to iteratively improve the solution, where exploration and exploitation are also needed to be well balanced. Therefore, derivative-free optimization deals with a similar core issue as reinforcement learning, and has been introduced in reinforcement learning approaches, under the names of learning classifier systems and neuroevolution/evolutionary reinforcement learning. Although such methods have been developed for decades, recently, derivative-free reinforcement learning exhibits attracting increasing attention. However, recent survey on this topic is still lacking. In this article, we summarize methods of derivative-free reinforcement learning to date, and organize the methods in aspects including parameter updating, model selection, exploration, and parallel/distributed methods. Moreover, we discuss some current limitations and possible future directions, hoping that this article could bring more attentions to this topic and serve as a catalyst for developing novel and efficient approaches. 相似文献

12.

基于多源信息融合的协作机器人演示编程及优化方法

王斐齐欢周星群王建辉《机器人》2018,40(4):551-559

为解决现有机器人装配学习过程复杂且对编程技术要求高等问题,提出一种基于前臂表面肌电信号和惯性多源信息融合的隐式交互方式来实现机器人演示编程.在通过演示学习获得演示人的装配经验的基础上,为提高对装配对象和环境变化的自适应能力,提出了一种多工深度确定性策略梯度算法（M-DDPG）来修正装配参数,在演示编程的基础上,进行强化学习确保机器人稳定执行任务.在演示编程实验中,提出一种改进的PCNN（并行卷积神经网络）,称作1维PCNN（1D-PCNN）,即通过1维的卷积与池化过程自动提取惯性信息与肌电信息特征,增强了手势识别的泛化性和准确率;在演示再现实验中,采用高斯混合模型（GMM）对演示数据进行统计编码,利用高斯混合回归（GMR）方法实现机器人轨迹动作再现,消除噪声点.最后,基于Primesense Carmine摄像机采用帧差法与多特征图核相关滤波算法（MKCF）的融合跟踪算法分别获取X轴与Y轴方向的环境变化,采用2个相同的网络结构并行进行连续过程的深度强化学习.在轴孔相对位置变化的情况下,机械臂能根据强化学习得到的泛化策略模型自动对机械臂末端位置进行调整,实现轴孔装配的演示学习. 相似文献

13.

一种基于混合学习策略的移动机器人路径规划方法

郜园园阮晓钢宋洪军于建均《控制与决策》2012,27(12):1822-1827

针对未知环境下移动机器人路径规划问题,以操作条件反射学习机制为基础,根据模糊推理系统和学习自动机的原理,提出一种应用于移动机器人导航的混合学习策略.运用仿生的自组织学习方法,通过不断与外界未知环境交互从而使机器人具有自学习和自适应的功能.仿真结果表明,该方法能使机器人学会避障和目标导航任务,与传统的人工势场法相比,能有效地克服局部极小和振荡情况. 相似文献

14.

一种基于行为控制的两自由度机械臂智能控制器

叶剑乔俊飞李明爱阮晓钢《控制理论与应用》2007,24(3):440-444

基于行为的控制方法相对于传统的控制方法在解决未知环境中的机器人中有着更好的鲁棒性和实时性.本文提出了一种基于反应式行为控制的智能控制器,以强化学习作为智能控制器的学习算法.通过采用评价-控制模型,该智能控制器能够不依赖于系统模型,通过连续地在线学习得到机器人的行为.将该智能控制器应用到两自由度仿真机械臂的控制中,仿真结果表明该智能控制器可以实现对两自由度机械臂的连续控制,使其能够迅速达到目标位置. 相似文献

15.

基于Spark的分布式机器人强化学习训练框架

下载免费PDF全文

方伟黄增强徐建斌黄羿马新强《图学学报》2019,40(5):852

强化学习能够通过自主学习的方式对机器人难以利用控制方法实现的各种任务进行训练完成,有效避免了系统设计人员对系统建模或制定规则。然而,强化学习在机器人开发应用领域中训练成本高昂,需要花费大量时间成本、硬件成本实现学习训练,虽然基于仿真可以一定程度减少硬件成本,但对类似 Gazebo 这样的复杂机器人训练平台,仿真过程工作效率低,数据采样耗时长。为了有效解决这些问题,针对机器人仿真过程的平台易用性、兼容性等方面进行优化,提出一种基于 Spark 的分布式强化学习框架,为强化学习的训练与机器人仿真采样提供分布式支持,具有高兼容性、健壮性的特性。通过实验数据分析对比,表明本系统框架不仅可有效提高机器人的强化学习模型训练速度,缩短训练时间花费,且有助于节约硬件成本。相似文献

16.

面向轨迹规划的深度强化学习奖励函数设计

下载免费PDF全文

李跃邵振洲赵振东施智平关永《计算机工程与应用》2020,56(2):226-232

现有基于深度强化学习的机械臂轨迹规划方法在未知环境中学习效率偏低,规划策略鲁棒性差。为了解决上述问题,提出了一种基于新型方位奖励函数的机械臂轨迹规划方法A-DPPO,基于相对方向和相对位置设计了一种新型方位奖励函数,通过降低无效探索,提高学习效率。将分布式近似策略优化（DPPO）首次用于机械臂轨迹规划,提高了规划策略的鲁棒性。实验证明相比现有方法,A-DPPO有效地提升了学习效率和规划策略的鲁棒性。相似文献

17.

基于深度强化学习的移动机器人导航策略研究

下载免费PDF全文

江其洲曾碧《计算机测量与控制》2019,27(8):217-221

针对移动机器人在复杂动态变化的环境下导航的局限性,采用了一种将深度学习和强化学习结合起来的深度强化学习方法。研究以在OpenCV平台下搭建的仿真环境的图像作为输入数据,输入至TensorFlow创建的卷积神经网络模型中处理,提取其中的机器人的动作状态信息,结合强化学习的决策能力求出最佳导航策略。仿真实验结果表明：在经过深度强化学习的方法训练后,移动机器人在环境发生了部分场景变化时,依然能够实现随机起点到随机终点的高效准确的导航。相似文献

18.

一种基于强化学习的多指手位置控制方法

马志峰王从庆《计算机测量与控制》2006,14(7):896-899

为解决多指手模型复杂和模型参数不确定问题，针对一类不确定系统，给出了一种基于强化学习的控制方法，该方法将反馈控制与强化学习相结合；在多指手的控制中，反馈控制使关节和被抓持的物体运动跟踪期望轨迹，强化学习控制使关节和被抓持的物体运动逼近期望轨迹；仿真结果表明，该方法能克服多指手动力学模型的不确定性和黏性滑动摩擦力造成的影响，具有良好的鲁棒性，提高了控制性能。相似文献

19.

基于目标的域随机化方法在机器人操作方面的研究

张夏禹陈小平《计算机应用研究》2022,39(10)

使用强化学习解决机器人操作问题有着诸多优势,然而传统的强化学习算法面临着奖励稀疏的困难,且得到的策略难以直接应用到现实环境中。为了提高策略从仿真到现实迁移的成功率,提出了基于目标的域随机化方法：使用基于目标的强化学习算法对模型进行训练,可以有效地应对机器人操作任务奖励稀疏的情况,得到的策略可以在仿真环境下良好运行,与此同时在算法中还使用了目标驱动的域随机化方法,在提高策略泛用性以及克服仿真和现实环境之间的差距上有着良好的效果,仿真环境下的策略容易迁移到现实环境中并成功执行。结果表明,使用了基于目标的域随机化方法的强化学习算法有助于提高策略从仿真到现实迁移的成功率。相似文献

20.

一类基于启发式搜索的激励学习算法

唐中勇付强卓佳陈焕文《微机发展》2006,16(8):41-43

激励学习已被证明是在控制领域中一种可行的新方法。相比其他的方法,它能较好地处理未知环境问题,但它仍然不是一种有效的方法。幸运的是,在现实世界中,智能体总是会有一些环境的先验知识,这些能形成启发式信息。启发式搜索是一种常用的搜索方法,有很快的搜索速度,但需要精确的启发式信息,这在有些时候难以得到。文中分析比较了启发式搜索和激励学习的各自特点,提出一类新的基于启发式搜索的激励学习算法,初步的实验结果显示了较好的性能。相似文献