Found 20 similar documents; search took 15 ms
1.
In cognitive science, artificial intelligence, psychology, and education, a growing body of research supports the view that the learning process is strongly influenced by the learner's goals. The fundamental tenet of goal-driven learning is that learning is largely an active and strategic process in which the learner, human or machine, attempts to identify and satisfy its information needs in the context of its tasks and goals, its prior knowledge, its capabilities, and environmental opportunities for learning. This article examines the motivations for adopting a goal-driven model of learning, the relationship between task goals and learning goals, the influences goals can have on learning, and the pragmatic implications of the goal-driven learning model. It presents a new integrative framework for understanding the goal-driven learning process and applies this framework to characterizing research on goal-driven learning.
2.
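Learning from demonstration is an efficient way for robots to acquire motor skills. When a camera is used to record the demonstrated trajectory, learning from demonstration involves the problem of obtaining an unknown robot-camera model through repeated trials. This paper addresses the learning of repeatable uncertainties in the repetitive operation of nonlinear systems and proposes an iterative learning neural network control scheme; the controller guarantees that the system's maximum tracking error stays within the neural network's effective approximation domain. To this end, a distributed neural network structure suited to repetitive operation is proposed. The network consists of a series of local neural networks distributed along the desired trajectory; each local network approximates the neighborhood of its corresponding desired trajectory point and is trained through repeated operation. Because the local networks are mutually independent, a full trajectory can be trained segment by segment, from the first segment to the last, achieving accurate tracking of the desired trajectory one segment at a time. The method is validated on trajectory imitation from demonstration with an unknown robot-camera model, showing that it is an efficient training method with uniform error-bounding capability.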
3.
Brenna D. Argall Sonia Chernova Manuela Veloso Brett Browning 《Robotics and Autonomous Systems》2009,57(5):469-483
We present a comprehensive survey of robot Learning from Demonstration (LfD), a technique that develops policies from example state-to-action mappings. We introduce the LfD design choices in terms of demonstrator, problem space, policy derivation and performance, and contribute the foundations for a structure in which to categorize LfD research. Specifically, we analyze and categorize the multiple ways in which examples are gathered, ranging from teleoperation to imitation, as well as the various techniques for policy derivation, including matching functions, dynamics models and plans. To conclude, we discuss LfD limitations and related promising areas for future research.
4.
Bidan Huang Miao Li Ravin Luis De Souza Joanna J. Bryson Aude Billard 《Autonomous Robots》2016,40(5):903-927
Object manipulation is a challenging task for robotics, as the physics involved in object interaction is complex and hard to express analytically. Here we introduce a modular approach for learning a manipulation strategy from human demonstration. Firstly we record a human performing a task that requires an adaptive control strategy in different conditions, i.e. different task contexts. We then perform modular decomposition of the control strategy, using phases of the recorded actions to guide segmentation. Each module represents a part of the strategy, encoded as a pair of forward and inverse models. All modules contribute to the final control policy; their recommendations are integrated via a system of weighting based on their own estimated error in the current task context. We validate our approach by demonstrating it, both in a simulation for clarity, and on a real robot platform to demonstrate robustness and capacity to generalise. The robot task is opening bottle caps. We show that our approach can modularize an adaptive control strategy and generate appropriate motor commands for the robot to accomplish the complete task, even for novel bottles.
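The abstract above says that module recommendations are combined by weights derived from each module's estimated error, but it does not give the weighting rule. The following is only a minimal sketch of that idea, assuming a softmax over negative errors; the function name `blend_commands`, the `temperature` parameter, and the softmax form are illustrative assumptions, not the authors' published scheme.

```python
import math


def blend_commands(commands, estimated_errors, temperature=1.0):
    """Blend per-module motor commands into one command.

    Hypothetical helper: each module's weight is proportional to
    exp(-error / temperature), so lower estimated error means more influence.
    The softmax form is an assumption made for illustration only.
    """
    if not commands or len(commands) != len(estimated_errors):
        raise ValueError("need one estimated error per module command")
    if temperature <= 0:
        raise ValueError("temperature must be positive")

    # Subtract the smallest error before exponentiating for numerical stability.
    min_err = min(estimated_errors)
    raw = [math.exp(-(e - min_err) / temperature) for e in estimated_errors]
    total = sum(raw)
    weights = [r / total for r in raw]

    # Weighted sum of the command vectors, one dimension at a time.
    dim = len(commands[0])
    blended = [
        sum(w * cmd[i] for w, cmd in zip(weights, commands))
        for i in range(dim)
    ]
    return blended, weights


if __name__ == "__main__":
    # Two hypothetical modules proposing 2-D commands; the first has lower error.
    blended, weights = blend_commands([[0.0, 0.0], [1.0, 1.0]], [0.1, 2.0])
    print(weights, blended)
```

A smaller `temperature` makes the blend approach a winner-take-all choice of the lowest-error module, while a larger one approaches a plain average.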
5.
Journal of Intelligent Manufacturing - Robot learning from demonstration (LfD) emerges as a promising solution to transfer human motion to the robot. However, because of the open-loop between the...
6.
The main goals of this research are: (i) to explore the influence that the announcement of a certain type of online assessment has on students' learning strategies and (ii) to explore the influence of stimulated learning strategies on the achievement levels that students exhibit during assessments. The research was conducted by testing and surveying 351 students from higher education institutions. Results indicate that students' learning strategies can be influenced in a relatively short period of time by announcing various types of online assessments and that steering towards more desirable deep learning strategies has a positive impact on both formal and perceived levels of success in achieving the desired learning goals. These findings can be used to create a novel adaptive online assessment system that incorporates the elements of adaptivity within a series of assessments and uses post-assessment feedback to steer students' learning strategies.
7.
In this paper, we present a visual servoing method based on a learned mapping between feature space and control space. Using a suitable recognition algorithm, we present and evaluate a complete method that simultaneously learns the appearance and control of a low-cost robotic arm. The recognition part is trained using an action-precedes-perception approach. The novelty of this paper, apart from the visual servoing method per se, is the combination of visual servoing with gripper recognition. We show that we can achieve high-precision positioning without knowing in advance what the robotic arm looks like or how it is controlled.
8.
This article presents a powerful new algorithm for reinforcement learning in problems where the goals and also the environment may change. The algorithm is completely goal independent, allowing the mechanics of the environment to be learned independently of the task that is being undertaken. Conventional reinforcement learning techniques, such as Q-learning, are goal dependent. When the goal or reward conditions change, previous learning interferes with the new task that is being learned, resulting in very poor performance. Previously, the Concurrent Q-Learning algorithm was developed, based on Watkins's Q-learning, which learns the relative proximity of all states simultaneously. This learning is completely independent of the reward experienced at those states and, through a simple action selection strategy, may be applied to any given reward structure. Here it is shown that the extra information obtained may be used to replace the eligibility traces of Watkins's Q-learning, allowing many more value updates to be made at each time step. The new algorithm is compared to the previous version and also to DG-learning in tasks involving changing goals and environments. The new algorithm is shown to perform significantly better than these alternatives, especially in situations involving novel obstructions. The algorithm adapts quickly and intelligently to changes in both the environment and reward structure, and does not suffer interference from training undertaken prior to those changes. © 2005 Wiley Periodicals, Inc. Int J Int Syst 20: 1037–1052, 2005.
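To make the "goal dependent" remark in the abstract above concrete, here is a minimal sketch of the standard one-step tabular Q-learning update. It is the conventional baseline the abstract contrasts against, not the Concurrent Q-Learning algorithm itself, and the names `q_update`, `alpha`, and `gamma` are illustrative choices. Because the reward enters every update directly, values learned under one reward structure are tied to that goal.

```python
from collections import defaultdict


def q_update(q, state, action, reward, next_state, actions, alpha=0.1, gamma=0.9):
    """Apply one standard tabular Q-learning update in place.

    q maps (state, action) pairs to value estimates. The reward term appears
    directly in the target, which is why the learned values depend on the goal.
    """
    # Greedy estimate of the value of the next state.
    best_next = max(q[(next_state, a)] for a in actions)
    td_target = reward + gamma * best_next
    # Move the current estimate a fraction alpha towards the target.
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


if __name__ == "__main__":
    q = defaultdict(float)          # all values start at 0.0
    actions = ["left", "right"]     # hypothetical action set
    print(q_update(q, 0, "right", 1.0, 1, actions, alpha=0.5))
```

If the reward for the same transition later changes, the stored value must be unlearned through further updates, which is the interference the abstract describes.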
9.
This paper proposes an end-to-end learning from demonstration framework for teaching force-based manipulation tasks to robots. The strengths of this work are manifold. First, we deal with the problem of learning through force perceptions exclusively. Second, we propose to exploit haptic feedback both as a means for improving teacher demonstrations and as a human–robot interaction tool, establishing a bidirectional communication channel between the teacher and the robot, in contrast to the works using kinesthetic teaching. Third, we address the well-known "what to imitate?" problem from a different point of view, based on the mutual information between perceptions and actions. Lastly, the teacher’s demonstrations are encoded using a Hidden Markov Model, and the robot execution phase is developed by implementing a modified version of Gaussian Mixture Regression that uses implicit temporal information from the probabilistic model, needed when tackling tasks with ambiguous perceptions. Experimental results show that the robot is able to learn and reproduce two different manipulation tasks, with a performance comparable to the teacher’s.
10.
11.
12.
Autonomous Robots - This paper presents a learning-based method that uses simulation data to learn an object manipulation task using two model-free reinforcement learning (RL) algorithms. The...
13.
Ambient systems are populated by many heterogeneous devices to provide adequate services to their users. The adaptation of an ambient system to the specific needs of its users is a challenging task. Because human–system interaction has to be as natural as possible, we propose an approach based on Learning from Demonstration (LfD). LfD is an interesting approach for generalizing what has been observed during the demonstration to similar situations. However, using LfD in ambient systems requires the learning technique to be adaptive. We present ALEX, a multi-agent system able to dynamically learn and reuse contexts from demonstrations performed by a tutor. The results of the experiments performed on both a real and a virtual robot show interesting properties of our technology for ambient applications.
14.
15.
Tamara Sumner Faisal Ahmad Sonal Bhushan Qianyi Gu Francis Molina Stedman Willard Michael Wright Lynne Davis Greg Janée 《International Journal on Digital Libraries》2005,5(1):18-24
Concept browsing interfaces can help educators and learners to locate and use learning resources that are aligned with recognized learning goals. The Strand Map Service enables users to navigate interactive visualizations of related learning goals and to request digital library resources aligned with learning goals. These interfaces are created using a programmatic Web service interface that dynamically generates interactive visual components. Preliminary findings suggest that these library interfaces appear to help users stay focused on the scientific content of their information discovery task, as opposed to focusing on the mechanics of searching.
16.
Current advances in Task and Motion Planning (TAMP) frameworks often rely on a specific and static task structure. A task structure is a sequence of how work pieces should be manipulated towards achieving a goal. Such systems can be problematic when task structures change as a result of human performance during human-robot collaboration scenarios in manufacturing or when redundant objects are present in the workspace, for example, during a Package-To-Order scenario with the same object type fulfilling different package configurations. In this paper, we propose a novel integrated TAMP framework that supports learning from human demonstrations while tackling variations in object positions and product configurations during massive-Package-To-Order (mPTO) scenarios in manufacturing as well as during human-robot collaboration scenarios. We design and apply a Graph Neural Network (GNN) based high-level reasoning module that is capable of handling variant goal configurations and can generalize to different task structures. Moreover, we also build a two-level motion module which can produce flexible and collision-free trajectories based on important features and task labels produced by the reasoning module. Through simulations and physical experiments, we show that our framework holds several advantages when compared with state-of-the-art previous work. The advantages include sample efficiency and generalizability to unseen goal configurations as well as task structures.
17.
The Task-Parameterised Learning from Demonstration (TP-LfD) approach aims at automatically adapting the movements of collaborative robots (cobots) to new settings using knowledge learnt from demonstrated paths. The approach is suitable for encoding complex relations between a cobot and its surroundings, i.e., task-relevant objects. However, further efforts are still required to enhance the intelligence and adaptability of TP-LfD for dynamic tasks. With this aim, this paper presents an improved TP-LfD (iTP-LfD) approach to program cobots adaptively for a variety of industrial tasks. iTP-LfD comprises three main improvements over other developed TP-LfD approaches: 1) detecting generic visual features for frames of reference (frames) in demonstrations for path reproduction in new settings without using complex computer vision algorithms, 2) minimising redundant frames that belong to the same object in demonstrations using a statistical algorithm, and 3) designing a reinforcement learning algorithm to eliminate irrelevant frames. The distinguishing characteristic of the iTP-LfD approach is that optimal frames are identified from demonstrations by simplifying computational complexity, overcoming occlusions in new settings, and boosting the overall performance. Case studies for a variety of industrial tasks involving different objects and scenarios highlight the adaptability and robustness of the iTP-LfD approach.
18.
《Robotics and Autonomous Systems》2006,54(5):409-413
Trajectory learning is a fundamental component in a robot Programming by Demonstration (PbD) system, where often the very purpose of the demonstration is to teach complex manipulation patterns. However, human demonstrations are inevitably noisy and inconsistent. This paper highlights the trajectory learning component of a PbD system for manipulation tasks encompassing the ability to cluster, select, and approximate human demonstrated trajectories. The proposed technique provides some advantages with respect to alternative approaches and is suitable for learning from both individual and multiple user demonstrations.
19.
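Reinforcement learning studies how an agent can learn an optimal policy from interaction with its environment so as to maximize long-term reward. Because environmental feedback is delayed, reinforcement learning problems face a huge decision space, and effective search is the key to successful learning. Previous work has explored policy search from several angles. Regarding search algorithms, results show that direct policy search based on evolutionary optimization can outperform traditional methods; regarding external information, adding user-provided demonstrations can effectively help reinforcement learning improve its performance. However, the combination of these two effective approaches has rarely been studied. This paper studies the combination of user demonstrations with evolutionary optimization and proposes the iNEAT+Q algorithm, which attempts to integrate demonstration data into evolutionary reinforcement learning by pre-training the neural networks and by guiding the fitness function of the evolutionary optimization. Preliminary experiments show that iNEAT+Q clearly improves on NEAT+Q, an evolutionary reinforcement learning method that does not use demonstration data.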
20.
Combinatorial explosion of inferences has always been a central problem in artificial intelligence. Although the number of inferences that can be drawn from a reasoner's knowledge and from available inputs is very large (potentially infinite), the inferential resources available to any reasoning system are limited. With limited inferential capacity and very many potential inferences, reasoners must somehow control the process of inference. Not all inferences are equally useful to a given reasoning system. Any reasoning system that has goals (or any form of a utility function) and acts based on its beliefs indirectly assigns utility to its beliefs. Given limits on the process of inference, and variation in the utility of inferences, it is clear that a reasoner ought to draw the inferences that will be most valuable to it. This paper presents an approach to this problem that makes the utility of a (potential) belief an explicit part of the inference process. The method is to generate explicit desires for knowledge. The question of focus of attention is thereby transformed into two related problems: how can explicit desires for knowledge be used to control inference and facilitate resource-constrained goal pursuit in general, and where do these desires for knowledge come from? We present a theory of knowledge goals, or desires for knowledge, and their use in the processes of understanding and learning. The theory is illustrated using two case studies: a natural language understanding program that learns by reading novel or unusual newspaper stories, and a differential diagnosis program that improves its accuracy with experience.