Found 20 similar documents (search time: 15 ms)
1.
This paper presents an innovative investigation into prototyping a digital twin (DT) as a platform for human-robot interactive welding and welder behavior analysis. This human-robot interaction (HRI) working style helps to enhance human users' operational productivity and comfort, while data-driven welder behavior analysis supports novice welder training. The HRI system includes three modules: 1) a human user who demonstrates the welding operations offsite, with her/his operations recorded by motion-tracked handles; 2) a robot that executes the demonstrated welding operations to complete the physical welding tasks onsite; 3) a DT system, developed based on virtual reality (VR), that serves as a digital replica of the physical human-robot interactive welding environment. The DT system bridges the human user and the robot through a bi-directional information flow: a) transmitting demonstrated welding operations in VR to the robot in the physical environment; b) displaying the physical welding scenes to human users in VR. Compared to existing DT systems reported in the literature, the developed one provides better capability in engaging human users in interacting with welding scenes through an augmented VR. To verify its effectiveness, six welders, three skilled with certain manual welding training and three unskilled without any training, tested the system by completing the same welding job; the three skilled welders produced satisfactory welded workpieces, while the three unskilled welders did not. A data-driven approach combining fast Fourier transform (FFT), principal component analysis (PCA), and support vector machine (SVM) is developed to analyze their behaviors. Given an operation sequence, i.e., the motion speed sequence of the welding torch, frequency features are first extracted by FFT, then reduced in dimension through PCA, and finally fed into the SVM for classification. The trained model demonstrates a 94.44% classification accuracy on the testing dataset. The successful pattern recognition in skilled welder operations should help accelerate novice welder training.
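The FFT, PCA, and SVM stages described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the torch-speed sequences are synthetic, the SVM is a linear one trained by plain subgradient descent on the hinge loss (the abstract does not state a kernel or solver), and all function names and parameter values are hypothetical.

```python
import numpy as np

def fft_features(sequences):
    """Magnitude spectrum of each mean-removed speed sequence (one row each)."""
    centered = sequences - sequences.mean(axis=1, keepdims=True)
    return np.abs(np.fft.rfft(centered, axis=1))

def pca_fit(features, n_components):
    """Return the feature mean and the top principal axes (computed via SVD)."""
    mean = features.mean(axis=0)
    _, _, vt = np.linalg.svd(features - mean, full_matrices=False)
    return mean, vt[:n_components]

def pca_transform(features, mean, axes):
    """Project features onto the principal axes found by pca_fit."""
    return (features - mean) @ axes.T

def train_linear_svm(x, y, lam=0.01, lr=0.1, epochs=500):
    """Batch subgradient descent on the L2-regularized hinge loss; y in {-1, +1}."""
    w, b = np.zeros(x.shape[1]), 0.0
    for _ in range(epochs):
        active = y * (x @ w + b) < 1  # samples that violate the margin
        w -= lr * (lam * w - (y[active, None] * x[active]).sum(axis=0) / len(y))
        b -= lr * (-y[active].sum() / len(y))
    return w, b

def predict(x, w, b):
    return np.where(x @ w + b >= 0, 1, -1)

def make_sequences(rng, n, tremor_amplitude, length=128):
    """Synthetic (assumed) torch-speed traces: slow drift plus a faster tremor."""
    t = np.arange(length) / length
    phase = rng.uniform(0.0, 2.0 * np.pi, size=(n, 1))
    drift = 5.0 + 0.5 * np.sin(2.0 * np.pi * 2.0 * t)
    tremor = tremor_amplitude * np.sin(2.0 * np.pi * 20.0 * t + phase)
    return drift + tremor + 0.05 * rng.standard_normal((n, length))

def make_dataset(rng, n_per_class):
    """Label +1 = 'skilled' (small tremor), -1 = 'unskilled' (large tremor)."""
    skilled = make_sequences(rng, n_per_class, tremor_amplitude=0.1)
    unskilled = make_sequences(rng, n_per_class, tremor_amplitude=1.0)
    x = np.vstack([skilled, unskilled])
    y = np.concatenate([np.ones(n_per_class), -np.ones(n_per_class)])
    return x, y

rng = np.random.default_rng(0)
x_train, y_train = make_dataset(rng, 40)
x_test, y_test = make_dataset(rng, 20)

# FFT -> PCA (fitted on training data only) -> linear SVM
mean, axes = pca_fit(fft_features(x_train), n_components=3)
w, b = train_linear_svm(pca_transform(fft_features(x_train), mean, axes), y_train)
pred = predict(pca_transform(fft_features(x_test), mean, axes), w, b)
accuracy = float((pred == y_test).mean())
print(f"test accuracy: {accuracy:.2f}")
```

The PCA axes are fitted on the training features only and reused for the test features, so no test information leaks into the dimensionality reduction.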
2.
This paper proposes a novel approach for physical human-robot interaction (pHRI), in which a robot provides guidance forces to a user based on the user's performance. The framework tunes the forces according to the behavior of each user in coping with different tasks, so that lower performance results in stronger intervention from the robot. This personalized physical human-robot interaction (p2HRI) method incorporates adaptive modeling of the interaction between the human and the robot, as well as learning from demonstration (LfD) techniques, to adapt to the user's performance. The approach is based on model predictive control, in which the system optimizes the rendered forces by predicting the performance of the user. Moreover, continuous learning of the user's behavior is added so that the models and personalized considerations are updated as the user's performance changes over time. Applying this framework to a field such as haptic guidance for skill improvement allows a more personalized learning experience, in which the interaction between the robot as the intelligent tutor and the student as the user is better adjusted to the skill level of the individual and their gradual improvement. The results suggest that the proposed method improves the precision of the interaction model, and that adding the considered personalized factors leads to a more adaptive strategy for rendering guidance forces.
3.
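This paper identifies issues of safety, professionalism, and adaptability that should be considered when designing the human-machine interface of a reconnaissance robot, and analyzes the functional requirements of the interface in remote perception, navigation control, reconnaissance control, and task planning. To meet the needs of developing a chemical-defense reconnaissance robot, a human-machine interactive operator control unit with a simulated command input panel and a monitoring display is designed. It offers convenient command input, a vivid and realistic simulated display, a good sense of presence, and strong interactivity.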
4.
Electromyography (EMG) signals have long been explored in human-centered robots for intuitive interaction. However, a gap remains between scientific research and real-life applications. Previous studies mainly focused on EMG decoding algorithms and seldom considered the dynamic relationship among the human, the robot, and the uncertain environment in real-life scenarios. To fill this gap, this paper presents a comprehensive review of EMG-based techniques in human-robot-environment interaction (HREI) systems. The general processing framework is summarized, and three interaction paradigms, namely direct control, sensory feedback, and partial autonomous control, are introduced. EMG-based intention decoding is treated as a module of the proposed paradigms. Five key issues in this field, involving precision, stability, user attention, compliance, and environmental awareness, are discussed. Several important directions, including EMG decomposition, robust algorithms, HREI datasets, proprioception feedback, reinforcement learning, and embodied intelligence, are proposed to pave the way for future research. To the best of our knowledge, this is the first review of EMG-based methods in HREI systems. It provides a novel and broader perspective for improving the practicability of current myoelectric interaction systems, one that considers human-robot interaction, robot-environment interaction, and state perception through human sensations, which previous studies have not done.
5.
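Inverse reinforcement learning (IRL), also known as inverse optimal control (IOC), is an important research method in reinforcement learning and imitation learning. It recovers a reward function from expert samples and then solves for the optimal policy under that reward function, so as to imitate the expert policy. In recent years, IRL has produced rich research results in imitation learning and has been widely applied to problems such as vehicle navigation, route recommendation, and optimal robot control. This paper first introduces the theoretical foundations of IRL. Then, starting from how the reward function is constructed, it discusses and analyzes IRL algorithms based on linear and nonlinear reward functions, including maximum margin IRL, maximum entropy IRL, maximum entropy deep IRL, and generative adversarial imitation learning. Next, it surveys frontier research directions in IRL and compares and analyzes representative algorithms, including IRL with incomplete state-action information, multi-agent IRL, IRL with suboptimal demonstrations, and guided IRL. Finally, it summarizes the key open problems and discusses future directions from both theoretical and application perspectives.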
6.
Human-robot interaction (HRI) is fundamental for human-centered robotics and has attracted intensive research for more than a decade. The series elastic actuator (SEA) provides inherent compliance, safety, and further benefits for HRI, but the introduced elastic element also brings control difficulties. In this paper, we address the stiffness rendering problem for a cable-driven SEA system, aiming to achieve either low stiffness for good transparency or high stiffness above the physical spring constant, and to assess the rendering accuracy with quantified metrics. By taking a velocity-sourced model of the motor, a cascaded velocity-torque-impedance control structure is established. To achieve high-fidelity torque control, a 2-DOF (degree of freedom) stabilizing control method together with a compensator is used to handle the competing requirements on tracking performance, noise and disturbance rejection, and energy optimization in the cable-driven SEA system. The conventional passivity requirement for HRI usually leads to a conservative design of the impedance controller, so that the rendered stiffness cannot exceed the physical spring constant. By adding a phase-lead compensator to the impedance controller, the stiffness rendering capability is augmented with guaranteed relaxed passivity. Extensive simulations and experiments have been performed, and the virtual stiffness has been rendered in the extended range of 0.1 to 2.0 times the physical spring constant with guaranteed relaxed passivity for physical human-robot interaction below 5 Hz. Quantified metrics also verify good rendering accuracy.
7.
Conventional robot control schemes are basically model-based methods. However, exact modeling of robot dynamics poses considerable problems and faces various uncertainties in task execution. This paper proposes a reinforcement learning control approach for overcoming such drawbacks. An artificial neural network (ANN) serves as the learning structure, and a stochastic real-valued (SRV) unit is applied as the learning method. Initially, force tracking control of a two-link robot arm is simulated to verify the control design. The simulation results confirm that, even without information about the robot dynamic model and environment states, operation rules for simultaneously controlling force and velocity are achievable by repetitive exploration. However, acceptable performance has so far demanded many learning iterations, and the learning speed has proved too slow for practical applications. The approach herein therefore improves the tracking performance by combining a conventional controller with a reinforcement learning strategy. Experimental results demonstrate improved trajectory tracking performance of a two-link direct-drive robot manipulator using the proposed method.
8.
To address the tracking problem of a class of discrete nonaffine nonlinear multi-input multi-output (MIMO) repetitive systems subject to separable and nonseparable disturbances, a novel data-driven iterative learning control (ILC) scheme based on zeroing neural networks (ZNNs) is proposed. First, the equivalent dynamic linearization data model, which exists theoretically in the iteration domain, is obtained by means of dynamic linearization technology. Then, an iterative extended state observer (IESO) is developed to estimate the disturbance and the coupling between systems, and the decoupled dynamic linearization model is obtained for the purpose of controller synthesis. To solve the zero-seeking tracking problem with inherent tolerance of noise, an ILC based on a noise-tolerant modified ZNN is proposed. The strict assumptions imposed on the initialization conditions of each iteration in existing ILC methods can be removed entirely with this method. In addition, theoretical analysis indicates that the modified ZNN can converge to the exact solution of the zero-seeking tracking problem. Finally, a generalized example and an application-oriented example are presented to verify the effectiveness and superiority of the proposed scheme.
9.
A facial expression emotion recognition based human-robot interaction (FEER-HRI) system is proposed, for which a four-layer system framework is designed. The FEER-HRI system enables robots not only to recognize human emotions, but also to generate facial expressions that adapt to human emotions. A facial emotion recognition method based on 2D-Gabor, the uniform local binary pattern (LBP) operator, and a multiclass extreme learning machine (ELM) classifier is presented and applied to real-time facial expression recognition for robots. The robots' facial expressions are represented by simple cartoon symbols and displayed on an LED screen mounted on the robots, which humans can easily understand. Four scenarios, i.e., guiding, entertainment, home service, and scene simulation, are performed in the human-robot interaction experiment, in which smooth communication is realized by facial expression recognition of humans and facial expression generation of robots within 2 seconds. As prospective applications, the FEER-HRI system can be applied in home service, smart home, safe driving, and so on.
10.
Reinforcement learning (RL) has roots in dynamic programming, and it is called adaptive/approximate dynamic programming (ADP) within the control community. This paper reviews recent developments in ADP along with RL and its applications to various advanced control fields. First, the background of the development of ADP is described, emphasizing the significance of regulation and tracking control problems. Some effective offline and online algorithms for ADP/adaptive critic control are presented, and the main results for discrete-time systems and continuous-time systems are surveyed, respectively. Then, the research progress on adaptive critic control based on the event-triggered framework and under uncertain environments is discussed, respectively, where event-based design, robust stabilization, and game design are reviewed. Moreover, the extensions of ADP for addressing control problems under complex environments, which have attracted enormous attention, are covered. The ADP architecture is revisited from the perspective of data-driven and RL frameworks, showing how they promote ADP formulation significantly. Finally, several typical control applications of RL and ADP are summarized, particularly in the fields of wastewater treatment processes and power systems, followed by some general prospects for future research. Overall, this comprehensive survey of ADP and RL for advanced control applications demonstrates their remarkable potential in the artificial intelligence era. In addition, they play a vital role in promoting environmental protection and industrial intelligence.
11.
In this paper, a data-driven conflict-aware safe reinforcement learning (CAS-RL) algorithm is presented for control of autonomous systems. Existing safe RL results with predefined performance functions and safe sets can only provide safety and performance guarantees for a single environment or circumstance. By contrast, the presented CAS-RL algorithm provides safety and performance guarantees across a variety of circumstances that the system might encounter. This is achieved by utilizing a bilevel learning control architecture: a higher metacognitive layer leverages a data-driven receding-horizon attentional controller (RHAC) to adapt relative attention to the system's different safety and performance requirements, and a lower-layer RL controller designs control actuation signals for the system. The presented RHAC makes its meta decisions based on the reaction curve of the lower-layer RL controller using a meta-model or knowledge. More specifically, it leverages a prediction meta-model (PMM) which spans the space of all future meta trajectories using a given finite number of past meta trajectories. The RHAC adapts the system's aspiration towards performance metrics (e.g., performance weights) as well as safety boundaries to resolve conflicts that arise as mission scenarios develop. This guarantees safety and feasibility (i.e., performance boundedness) of the lower-layer RL-based control solution. It is shown that the interplay between the RHAC and the lower-layer RL controller is a bilevel optimization problem for which the leader (RHAC) operates at a lower rate than the follower (RL-based controller), and its solution guarantees feasibility and safety of the control solution. The effectiveness of the proposed framework is verified through a simulation example.
12.
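Reinforcement learning (RL) has achieved great success in Go, video games, navigation, recommender systems, and other fields. However, many RL algorithms still cannot be transferred directly to real physical environments. In simulation, an agent can interact with the environment by trial and error to learn an optimal policy, but for safety reasons many real-world applications require that the agent's random exploration be restricted. Safety has therefore become a major challenge in moving RL from simulation to reality. In recent years, much research has been devoted to developing safe reinforcement learning (SRL) algorithms that satisfy safety constraints while ensuring system performance. This paper comprehensively surveys existing SRL algorithms and groups them into three categories: modifying the learning process, modifying the learning objective, and offline reinforcement learning. It also introduces five benchmark platforms: Safety Gym, safe-control-gym, SafeRL-Kit, D4RL, and NeoRL. Finally, it summarizes applications of SRL in autonomous driving, robot control, industrial process control, power system optimization, and healthcare, and gives conclusions and an outlook.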
13.
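Methods for solving combinatorial optimization problems (COPs) have permeated many fields, including artificial intelligence and operations research. As data scale keeps growing and problems are updated ever faster, traditional methods for solving COPs face serious challenges in speed, accuracy, and generalization ability. In recent years, reinforcement learning (RL) has been widely applied in areas such as autonomous driving and industrial automation, showing strong decision-making and learning capabilities, so many researchers have attempted to use RL to solve COPs,...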
14.
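A two-wheeled robot is a typical unstable, nonlinear, strongly coupled self-balancing system. With the system model unknown and no prior experience available, a reinforcement learning algorithm is combined with a fuzzy neural network, which ensures fast and convergent function approximation, achieves self-learning balance control of the two-wheeled robot, and addresses reinforcement learning over the robot's continuous state and action spaces. Simulations and experiments show that the method not only balances the two-wheeled robot within a very short time, but also maintains balance when the robot's parameters change considerably.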
15.
In this paper, we propose a set of algorithms to design signal timing plans via deep reinforcement learning. The core idea of this approach is to set up a deep neural network (DNN) to learn the Q-function of reinforcement learning from the sampled traffic state/control inputs and the corresponding traffic system performance output. Based on the obtained DNN, we can find appropriate signal timing policies by implicitly modeling the control actions and the change of system states. We explain the possible benefits and implementation tricks of this new approach. The relationships between this new approach and some existing approaches are also carefully discussed.
16.
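The decision-making of operational indices is key to achieving operational safety and production-index optimization in industrial processes. Considering the complexity of solving multi-operational-index decision problems and the uncertainty in production-index states caused by dynamic fluctuations in production conditions, a reinforcement learning algorithm with asynchronous policy updates is proposed to self-learn the operational indices, and a theoretical proof of its convergence is given. Within a stochastic adaptive dynamic programming framework, the algorithm uses sample means in place of computing the state transition probability matrix of the production indices, so this matrix need not be known. In addition, by introducing a clock and defining its threshold, centralized policy evaluation with asynchronous updates of multiple policies is adopted to simplify the multi-operational-index decision problem and improve learning efficiency. Using measurable data, the self-learned operational indices ensure that the production indices are optimized and stay within the specified range. Finally, simulations using actual data from a large mineral processing plant in western China demonstrate the effectiveness of the method.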
17.
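Control of complex process industries has long been a frontier problem in control applications. The thickener, a complex large-scale industrial device, is widely used in metallurgy, mining, and other fields. Because its operation is multivariable, nonlinear, and subject to long time delays, underflow concentration control of thickeners has been a difficult and active topic in both academia and industry. This paper proposes an online control algorithm for thickeners based on reinforcement learning. Building on the traditional heuristic dynamic programming (HDP) algorithm, it designs a dual-network structure that merges the critic network and the model network, and proposes a short-term experience replay method to improve the training accuracy of the critic network. The algorithm achieves stable control of the underflow concentration while keeping the control input within the set range. Finally, thickener simulation experiments verify the effectiveness of the algorithm; the results show that the proposed method outperforms other algorithms in time consumption and control accuracy.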
18.
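Natural human-machine interaction requires emotion models. For the Markov model of spontaneous emotional state transitions, a method for studying individual differences in artificial emotion, based on metric multidimensional scaling theory, is proposed to address questions of distinguishing and clustering individual emotions, such as whether adjusting the parameters affects individual emotional differences and how large the effect is. An inner-product matrix is computed from the dissimilarity matrix, and principal component factor analysis is then applied to obtain a reconstructed matrix of individual attributes, which displays individual emotional differences in a low-dimensional space. The experimental results can guide the selection of model parameters, and part of the results is also verified mathematically.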
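The steps named in this abstract (dissimilarity matrix, then inner-product matrix, then principal components) correspond to classical metric multidimensional scaling. The sketch below shows that generic procedure only; it assumes the dissimilarities behave like Euclidean distances, and the function name `classical_mds` and the toy points are illustrative rather than taken from the paper.

```python
import numpy as np

def classical_mds(dissimilarity, n_dims=2):
    """Embed n items in n_dims dimensions from an n-by-n dissimilarity matrix."""
    d2 = np.asarray(dissimilarity, dtype=float) ** 2
    n = d2.shape[0]
    # Double centering turns squared distances into an inner-product (Gram) matrix.
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ d2 @ centering
    # Principal components: keep the eigenvectors with the largest eigenvalues.
    eigvals, eigvecs = np.linalg.eigh(gram)
    order = np.argsort(eigvals)[::-1][:n_dims]
    scale = np.sqrt(np.clip(eigvals[order], 0.0, None))
    return eigvecs[:, order] * scale

# Toy check: distances between four planar points are recovered by the embedding.
points = np.array([[0.0, 0.0], [3.0, 0.0], [0.0, 4.0], [3.0, 4.0]])
distances = np.linalg.norm(points[:, None] - points[None, :], axis=-1)
embedding = classical_mds(distances, n_dims=2)
recovered = np.linalg.norm(embedding[:, None] - embedding[None, :], axis=-1)
print(np.allclose(distances, recovered))
```

The embedding is unique only up to rotation and reflection, so the check compares pairwise distances rather than coordinates.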
19.
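In recent years, reinforcement learning methods based on interaction with the environment have achieved great success in robotics applications, providing a practical and feasible solution for optimizing robot behavior control policies. However, collecting interaction samples in the real world is costly and inefficient, so simulation environments are widely used in reinforcement learning training for robots. Training policies on large numbers of samples obtained at low cost in a virtual simulation environment, and then transferring the learned policies to the real environment, can effectively alleviate the safety, reliability, and real-time problems of training on real robots. However, because the simulation and real environments differ, policies trained in simulation often fail to achieve the desired performance when transferred directly to real robots. To address this problem, sim-to-real transfer reinforcement learning methods have been proposed to narrow the environment gap and thereby achieve effective policy transfer. Based on the direction of information flow during transfer and the different objects on which intelligent methods act, this paper proposes a process framework for sim-to-real transfer reinforcement learning systems. Within this framework, existing work is divided into three categories: model optimization methods based on the real environment, knowledge transfer methods based on the simulation environment, and iterative policy improvement methods based on both simulated and real environments. Representative techniques and related work in each category are described. Finally, the opportunities and challenges facing sim-to-real transfer reinforcement learning research are discussed.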
20.
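A human-robot interaction system based on a hybrid gaze and brain-computer interface (gaze-BCI) and shared control is designed, so that users can continuously control the motion of the robot end-effector in 2D space through gaze and mental intention, and receive assistance from machine intelligence in obstacle-avoidance and target-reaching tasks. First, the speed of the robot end-effector is adjusted continuously in proportion to the strength of the user's motor intention, to improve the user's sense of control over the robot and involvement in the task. Then, a shared control strategy for the end-effector's direction of motion is proposed. It dynamically fuses the user's direction command, obtained through gaze tracking, with the robot system's direction command, obtained from the robot's obstacle-avoidance and target-reaching behaviors, and adaptively adjusts the level of assistance the robot gives the user, to reduce the user's mental workload and increase the task success rate. Finally, experiments on the constructed hybrid gaze-BCI and shared-control platform verify the effectiveness of the proposed system.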
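One way to picture the two mechanisms in this abstract (speed proportional to intention strength, and a direction blended from the user's gaze command and the robot's own command) is the sketch below. The abstract does not give the fusion rule, so the distance-based `assistance_weight`, the linear blend, and every name and constant here are hypothetical choices made only for illustration.

```python
import math

def unit(vec):
    """Normalize a 2D vector; a zero vector has no direction and is rejected."""
    norm = math.hypot(vec[0], vec[1])
    if norm == 0.0:
        raise ValueError("direction vector must be non-zero")
    return (vec[0] / norm, vec[1] / norm)

def assistance_weight(obstacle_distance, safe_distance=0.5):
    """Hypothetical rule: 0 when far from obstacles, rising linearly to 1 at contact."""
    return min(1.0, max(0.0, 1.0 - obstacle_distance / safe_distance))

def shared_velocity(gaze_dir, robot_dir, intent_strength, obstacle_distance,
                    max_speed=0.2):
    """Blend user and robot directions; scale speed by intention strength in [0, 1]."""
    alpha = assistance_weight(obstacle_distance)
    u, r = unit(gaze_dir), unit(robot_dir)
    blended = ((1.0 - alpha) * u[0] + alpha * r[0],
               (1.0 - alpha) * u[1] + alpha * r[1])
    if math.hypot(blended[0], blended[1]) < 1e-9:
        blended = r  # opposite commands cancel out: fall back to the robot's command
    direction = unit(blended)
    speed = max_speed * min(1.0, max(0.0, intent_strength))
    return (speed * direction[0], speed * direction[1])

# Far from obstacles the user's gaze direction is followed; at contact the robot's is.
far = shared_velocity((1.0, 0.0), (0.0, 1.0), intent_strength=0.5, obstacle_distance=1.0)
near = shared_velocity((1.0, 0.0), (0.0, 1.0), intent_strength=0.5, obstacle_distance=0.0)
print(far, near)
```

Blending directions rather than full velocities keeps the speed under the user's control at all times, which matches the abstract's split between intention-driven speed and shared direction.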