Similar Documents
20 similar documents were found for this query.
1.
The study of behavioral and neurophysiological mechanisms involved in rat spatial cognition provides a basis for the development of computational models and robotic experimentation of goal-oriented learning tasks. These models and robotic architectures offer neurobiologists and neuroethologists alternative platforms to study, analyze and predict spatial cognition based behaviors. In this paper we present a comparative analysis of spatial cognition in rats and robots by contrasting similar goal-oriented tasks in a cyclical maze, where studies in rat spatial cognition are used to develop computational system-level models of hippocampus and striatum integrating kinesthetic and visual information to produce a cognitive map of the environment and drive robot experimentation. During training, Hebbian learning and reinforcement learning, in the form of an Actor-Critic architecture, enable robots to learn the optimal route leading to a goal from a designated fixed location in the maze. During testing, robots exploit maximum expectations of reward stored within the previously acquired cognitive map to reach the goal from different starting positions. A detailed discussion of comparative experiments in rats and robots is presented, contrasting learning latency and characterizing behavioral procedures during navigation such as errors associated with the selection of a non-optimal route, body rotations, normalized length of the traveled path, and hesitations. Additionally, we present results from evaluating neural activity in rats through detection of the immediate early gene Arc to verify the engagement of hippocampus and striatum in information processing while solving the cyclical maze task, just as the robots use our corresponding models of those neural structures.
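The abstract does not spell out the update rules, but the Actor-Critic ingredient it mentions can be illustrated with a minimal tabular sketch. The grid size, action set, reward and learning rates below are assumptions for illustration, not the parameters of the paper's model.

```python
# Minimal tabular Actor-Critic sketch for a discretized maze.
# Illustrative only: states, actions, rewards and learning rates are
# assumptions, not the values used in the paper.
import numpy as np

n_states, n_actions = 25, 4          # e.g. a 5x5 grid with {N, E, S, W}
V = np.zeros(n_states)               # critic: state values
H = np.zeros((n_states, n_actions))  # actor: action preferences
alpha_v, alpha_h, gamma = 0.1, 0.1, 0.95

def softmax(h):
    e = np.exp(h - h.max())
    return e / e.sum()

def actor_critic_step(s, s_next, a, r):
    """One TD update after taking action a in state s, landing in s_next with reward r."""
    delta = r + gamma * V[s_next] - V[s]   # TD error drives both updates
    V[s] += alpha_v * delta                # critic update
    H[s, a] += alpha_h * delta             # actor update (reinforce the chosen action)
    return delta

# choosing an action from the current policy:
s = 0
a = np.random.choice(n_actions, p=softmax(H[s]))
```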

2.
Zou Qiang, Cong Ming, Liu Dong, Du Yu, Cui Yingxue. Robot (《机器人》), 2018, 40(6): 894-902
Aiming at mobile robot navigation tasks in unstructured environments, and inspired by the way mammals perform spatial cognition, a path planning method for mobile robots based on biological cognition is proposed. Combining the properties of cognitive maps and imitating the episodic-memory formation mechanism of the hippocampus, an episodic cognitive map is constructed that encapsulates scene perception, state neurons and pose perception information, realizing the robot's cognition of its environment. On top of the episodic cognitive map, an event-sequence planning algorithm that uses minimum event distance as its criterion is proposed for the real-time navigation process. Experimental results show that the control algorithm enables the robot to select the best planned path according to different tasks.
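The exact event-distance definition is not given in the abstract; as a rough illustration of planning an event sequence over an episodic map by minimum accumulated event distance, the hypothetical sketch below runs a plain Dijkstra search over an invented event graph.

```python
# Hedged sketch: plan an event sequence over an episodic map by minimum
# accumulated event distance (plain Dijkstra). The graph below is invented;
# the paper's actual event-distance definition is not reproduced here.
import heapq

episodic_map = {                      # event -> {neighbouring event: event distance}
    "E0": {"E1": 1.0, "E2": 2.5},
    "E1": {"E3": 1.2},
    "E2": {"E3": 0.8},
    "E3": {"goal": 1.0},
    "goal": {},
}

def plan_event_sequence(graph, start, goal):
    dist, prev = {start: 0.0}, {}
    queue = [(0.0, start)]
    while queue:
        d, node = heapq.heappop(queue)
        if node == goal:
            break
        for nxt, w in graph[node].items():
            nd = d + w
            if nd < dist.get(nxt, float("inf")):
                dist[nxt], prev[nxt] = nd, node
                heapq.heappush(queue, (nd, nxt))
    seq, node = [goal], goal
    while node != start:
        node = prev[node]
        seq.append(node)
    return list(reversed(seq))

print(plan_event_sequence(episodic_map, "E0", "goal"))  # ['E0', 'E1', 'E3', 'goal']
```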

3.
To address the problem that, in the early stage of training, traditional reinforcement learning algorithms lack prior knowledge of the surrounding environment, so that a modular self-reconfigurable robot selects actions at random, wasting iterations and slowing down convergence, a two-stage reinforcement learning algorithm is proposed. In the first stage, Q-learning based on a population with knowledge sharing is used to train the robots to move to the center cell of a grid map, in order to obtain an optimal shared Q-table. In this stage, to reduce the number of iterations and improve the convergence speed of the algorithm…
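A minimal sketch of the first stage as described: several module agents write into a single shared Q-table while learning to reach the center of a grid. The grid size, reward values and hyperparameters below are assumptions.

```python
# Hedged sketch of stage one: several module agents updating one shared
# Q-table while learning to reach the centre cell of a grid map.
import numpy as np

SIZE = 5
CENTER = (SIZE // 2, SIZE // 2)
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]      # up, down, left, right
Q_shared = np.zeros((SIZE, SIZE, len(ACTIONS)))   # one Q-table shared by all modules
alpha, gamma, eps = 0.1, 0.9, 0.2

def step(pos, a_idx):
    dr, dc = ACTIONS[a_idx]
    r = max(0, min(SIZE - 1, pos[0] + dr))
    c = max(0, min(SIZE - 1, pos[1] + dc))
    reward = 1.0 if (r, c) == CENTER else -0.01
    return (r, c), reward

def shared_q_update(pos):
    """One epsilon-greedy step by a single module, written into the shared table."""
    a = np.random.randint(len(ACTIONS)) if np.random.rand() < eps \
        else int(np.argmax(Q_shared[pos]))
    nxt, r = step(pos, a)
    td = r + gamma * Q_shared[nxt].max() - Q_shared[pos][a]
    Q_shared[pos][a] += alpha * td
    return nxt

# every module contributes experience to the same table:
for module_start in [(0, 0), (4, 4), (0, 4)]:
    pos = module_start
    for _ in range(50):
        pos = shared_q_update(pos)
```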

4.
To address autonomous learning in cognitive robots, a learning model based on the principle of operant conditioning (OCLM) is proposed. The model is described in terms of a state space, an operant action space, probability distribution functions, a bionic learning mechanism and system entropy; the concept of a state's "negative ideality" is introduced, and the computation of an orientation function is defined. Simulation experiments on robot obstacle-avoidance navigation are carried out with the model, and the parameter settings are discussed. The experimental results show that a robot based on the OCLM model can acquire cognition through interaction with the environment and successfully avoid obstacles to reach its destination, exhibiting a degree of self-learning ability and thereby demonstrating the effectiveness of the model.
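The abstract names probability distribution functions, an orientation function and system entropy without giving their formulas; the sketch below is only a loose illustration of an operant-conditioning-style probabilistic action update and is not the OCLM formulation itself.

```python
# Very loose sketch of an operant-conditioning-style update: each state keeps a
# probability distribution over actions; an action whose (assumed) orientation
# value is positive is reinforced, and the entropy of the distribution is
# tracked as a crude measure of convergence. None of the formulas below are
# taken from the OCLM paper.
import numpy as np

n_actions = 4
p = np.ones(n_actions) / n_actions     # action probabilities for one state

def reinforce(p, a, orientation_value, eta=0.1):
    """Shift probability mass toward/away from action a and renormalize."""
    p = p.copy()
    p[a] = max(1e-6, p[a] * (1.0 + eta * orientation_value))
    return p / p.sum()

def entropy(p):
    return float(-(p * np.log(p)).sum())

a = np.random.choice(n_actions, p=p)
p = reinforce(p, a, orientation_value=+1.0)   # e.g. the action moved away from obstacles
print(entropy(p))                             # entropy drops as behaviour becomes deterministic
```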

5.
To let mobile robots carry props or performers in complex stage environments, a dynamic path planning algorithm based on deep reinforcement learning is proposed. First, a global map is built to obtain information about the obstacles around the mobile robot, classifying performers as dynamic obstacles and stage props as static obstacles. A local map is then built; an LSTM network encodes the dynamic-obstacle information, and a social attention mechanism computes the importance of each dynamic obstacle to achieve better obstacle avoidance. A new reward function is constructed so that dynamic and static obstacles are avoided in different ways. Finally, imitation learning and prioritized experience replay are used to speed up network convergence, realizing dynamic path planning for mobile robots in complex stage environments. Experimental results show that the network converges noticeably faster and exhibits good dynamic obstacle avoidance in different obstacle environments.
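A hedged PyTorch sketch of the LSTM-plus-attention ingredient: each dynamic obstacle's recent track is encoded by an LSTM and a learned attention weight pools the obstacles into one crowd feature. The tensor shapes and scoring network are assumptions, not the paper's architecture.

```python
# Hedged sketch: encode each dynamic obstacle's recent track with an LSTM, then
# score obstacles with a learned attention weight and pool them into one crowd
# feature. Feature sizes and the scoring network are assumptions.
import torch
import torch.nn as nn

class ObstacleAttentionEncoder(nn.Module):
    def __init__(self, obs_dim=4, hidden=32):
        super().__init__()
        self.lstm = nn.LSTM(obs_dim, hidden, batch_first=True)
        self.score = nn.Sequential(nn.Linear(hidden, 32), nn.ReLU(), nn.Linear(32, 1))

    def forward(self, tracks):
        # tracks: (n_obstacles, track_len, obs_dim), e.g. [x, y, vx, vy] per step
        _, (h, _) = self.lstm(tracks)                # h: (1, n_obstacles, hidden)
        h = h.squeeze(0)                             # (n_obstacles, hidden)
        attn = torch.softmax(self.score(h), dim=0)   # importance of each obstacle
        return (attn * h).sum(dim=0), attn           # pooled crowd feature, weights

encoder = ObstacleAttentionEncoder()
tracks = torch.randn(3, 8, 4)                        # 3 obstacles, 8 past steps each
crowd_feature, weights = encoder(tracks)
```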

6.
This paper presents a technical approach to robot learning of motor skills which combines active intrinsically motivated learning with imitation learning. Our algorithmic architecture, called SGIM-D, allows efficient learning of high-dimensional continuous sensorimotor inverse models in robots, and in particular learns distributions of parameterised motor policies that solve a corresponding distribution of parameterised goals/tasks. This is made possible by the technical integration of imitation learning techniques within an algorithm for learning inverse models that relies on active goal babbling. After reviewing social learning and intrinsic motivation approaches to action learning, we describe the general framework of our algorithm, before detailing its architecture. In an experiment where a robot arm has to learn to use a flexible fishing line, we illustrate that SGIM-D efficiently combines the advantages of social learning and intrinsic motivation and benefits from human demonstration properties to learn how to produce varied outcomes in the environment, while developing more precise control policies in large spaces.

7.
Recent developments in sensor technology have made it feasible to use mobile robots in several fields, but robots still lack the ability to accurately sense the environment. A major challenge to the widespread deployment of mobile robots is the ability to function autonomously, learning useful models of environmental features, recognizing environmental changes, and adapting the learned models in response to such changes. This article focuses on such learning and adaptation in the context of color segmentation on mobile robots in the presence of illumination changes. The main contribution of this article is a survey of vision algorithms that are potentially applicable to color-based mobile robot vision. We therefore look at algorithms for color segmentation, color learning and illumination invariance on mobile robot platforms, including approaches that tackle just the underlying vision problems. Furthermore, we investigate how the inter-dependencies between these modules and high-level action planning can be exploited to achieve autonomous learning and adaptation. The goal is to determine the suitability of the state-of-the-art vision algorithms for mobile robot domains, and to identify the challenges that still need to be addressed to enable mobile robots to learn and adapt models for color, so as to operate autonomously in natural conditions.

8.
Selection of a robot for a specific industrial application is one of the most challenging problems in a real-time manufacturing environment. It has become more and more complicated due to the increase in complexity, advanced features and facilities that are continuously being incorporated into robots by different manufacturers. At present, different types of industrial robots with diverse capabilities, features, facilities and specifications are available in the market. The manufacturing environment, product design, production system and cost involved are some of the most influential factors that directly affect the robot selection decision. The decision maker needs to identify and select the best-suited robot in order to achieve the desired output with minimum cost and specific application ability. This paper attempts to solve the robot selection problem using two of the most appropriate multi-criteria decision-making (MCDM) methods and compares their relative performance for a given industrial application. The first MCDM approach is 'VIsekriterijumsko KOmpromisno Rangiranje' (VIKOR), a compromise ranking method, and the other is 'ELimination Et Choix Traduisant la REalité' (ELECTRE), an outranking method. Two real-time examples are cited in order to demonstrate and validate the applicability and potential of both these MCDM methods. It is observed that the relative rankings of the alternative robots obtained using these two MCDM methods match quite well with those derived by past researchers.
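As a worked illustration of the VIKOR ranking step, the sketch below scores three hypothetical robots on three benefit-type criteria; the decision matrix and weights are invented, not the paper's data.

```python
# Minimal VIKOR sketch for robot selection. The decision matrix, weights and
# the assumption that all criteria are benefit-type ("larger is better") are
# invented for illustration.
import numpy as np

X = np.array([          # rows: candidate robots, cols: criteria
    [60.0, 0.40, 2540],  # e.g. load capacity, repeatability score, reach
    [6.35, 0.15, 1016],
    [6.80, 0.10, 1727],
])
w = np.array([0.35, 0.30, 0.35])   # criterion weights (sum to 1)

f_best, f_worst = X.max(axis=0), X.min(axis=0)     # best/worst value per criterion
norm = w * (f_best - X) / (f_best - f_worst)       # weighted regret per criterion
S, R = norm.sum(axis=1), norm.max(axis=1)          # group utility, individual regret
v = 0.5                                            # compromise weight
Q = v * (S - S.min()) / (S.max() - S.min()) + \
    (1 - v) * (R - R.min()) / (R.max() - R.min())

print(np.argsort(Q))   # VIKOR ranking: smaller Q is better
```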

9.
We propose a machine-learning based multi-level cognitive model, inspired by the early-age cognitive development of human locomotion skills, for modeling humanoid robot walking. In contrast to most existing work on biped walking modeling, which places the problem within the context of controlling specific kinds of biped robots, the proposed model addresses a global concept of constructing the biped walking ability independently of the robot to which the concept may be applied. The chief benefit of the concept is that the resulting machine-learning based structure takes advantage of the learning capacity and generalization propensity of such models, offering valuable potential to deal with high dimensionality, nonlinearity and empirical proprioceptive or exteroceptive information. Case studies and validation results evaluating the potential performance of the proposed approach are reported and discussed.

10.
Emergence of stable gaits in locomotion robots is studied in this paper. A classifier system, implementing an instance-based reinforcement-learning scheme, is used for the sensory-motor control of an eight-legged mobile robot and for the synthesis of the robot's gaits. The robot has no a priori knowledge of the environment or of its own internal model. It is only assumed that the robot can acquire stable gaits by learning how to reach a goal area. During the learning process the control system is self-organized by reinforcement signals. Reaching the goal area defines a global reward. Forward motion gets a local reward, while stepping back and falling down get a local punishment. As learning progresses, the number of action rules in the classifier system stabilizes to a certain level, corresponding to the acquired gait patterns. The feasibility of the proposed self-organized system is tested in simulation and experiment. A minimal simulation model that does not require sophisticated computational schemes is constructed and used in the simulations. The simulation data, evolved on the minimal model of the robot, are downloaded to the control system of the real robot. Overall, seven of the ten simulation data sets are successful in running the real robot.
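The reward structure described (global reward for the goal area, local reward for forward motion, local punishment for stepping back or falling) can be sketched directly; the numeric values below are assumptions.

```python
# Sketch of the reward scheme described in the abstract: a global reward for
# reaching the goal area, a small local reward for forward motion, and local
# punishments for stepping back or falling. The numeric values are assumptions.
def gait_reward(reached_goal, forward_step, stepped_back, fell_down):
    r = 0.0
    if reached_goal:
        r += 10.0      # global reward
    if forward_step:
        r += 0.1       # local reward
    if stepped_back:
        r -= 0.1       # local punishment
    if fell_down:
        r -= 1.0       # local punishment
    return r

print(gait_reward(False, True, False, False))   # 0.1 for a plain forward step
```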

11.
This paper presents a robot architecture with spatial cognition and navigation capabilities that captures some properties of the rat brain structures involved in learning and memory. This architecture relies on the integration of kinesthetic and visual information derived from artificial landmarks, as well as on Hebbian learning, to build a holistic topological-metric spatial representation during exploration, and employs reinforcement learning by means of an Actor-Critic architecture to enable learning and unlearning of goal locations. From a robotics perspective, this work can be placed in the gap between mapping and map exploitation that currently exists in the SLAM literature. The exploitation of the cognitive map allows the robot to recognize places already visited and to find a target from any given departure location, thus enabling goal-directed navigation. From a biological perspective, this study aims to initiate a contribution to experimental neuroscience by providing the system as a tool to test, with robots, hypotheses concerning the underlying mechanisms of rats' spatial cognition. Results from different experiments with a mobile AIBO robot inspired by classical spatial tasks with rats are described, and a comparative analysis is provided in reference to the reversal task devised by O'Keefe in 1983.
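The Hebbian ingredient of such an architecture can be sketched as strengthening the connection between co-active map units, e.g. two places visited in succession; the unit count and learning rate below are assumptions, not the paper's model.

```python
# Hedged sketch of the Hebbian ingredient: the connection between two map units
# is strengthened when they are co-active (e.g. two places visited in
# succession). Learning rate and unit count are illustrative assumptions.
import numpy as np

n_units = 50
W = np.zeros((n_units, n_units))   # connection weights of the topological map

def hebbian_update(W, pre_activity, post_activity, eta=0.05):
    """w_ij += eta * a_i * a_j for the currently active units."""
    return W + eta * np.outer(pre_activity, post_activity)

a_t  = np.zeros(n_units); a_t[3]  = 1.0   # place unit active at time t
a_t1 = np.zeros(n_units); a_t1[7] = 1.0   # place unit active at time t+1
W = hebbian_update(W, a_t, a_t1)          # the link between visited places grows
```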

12.
A new construction method using robots is spreading widely among construction sites in order to overcome labour shortages and frequent construction accidents. Along with economic efficiency, safety is a very important factor for evaluating the use of construction robots on construction sites. However, the quantitative evaluation of safety is difficult compared with that of economic efficiency. In this study, we suggest a safety evaluation methodology that defines the 'worker' and the 'work conditions' as two risk factors, with the 'worker' factor defined as posture load and the 'work conditions' factor defined as the work environment and the risk exposure time. The posture-load evaluation reflects the risk of musculoskeletal disorders which can be caused by work posture and the risk of accidents which can be caused by reduced concentration. We evaluated the risk factors that may cause various accidents such as falling, colliding, capsizing, and squeezing in work environments, and evaluated the operational risk by considering worker exposure time to risky work environments. With the results of the evaluations for each factor, we calculated the general operational risk and deduced the improvement ratio in operational safety obtained by introducing a construction robot. To verify these results, we compared the safety of the existing manual construction method and the proposed robotic method for manipulating large glass panels.
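The paper's exact risk formula is not reproduced in the abstract; purely as an illustration of the kind of calculation described, the sketch below assumes the operational risk is the product of posture load, work-environment risk and exposure time, and derives an improvement ratio from made-up manual vs. robotic scores.

```python
# Illustrative arithmetic only: this sketch assumes operational risk is the
# product of posture load, work-environment risk and exposure time, and
# compares manual vs. robotic glass-panel handling with made-up scores.
def operational_risk(posture_load, environment_risk, exposure_hours):
    return posture_load * environment_risk * exposure_hours

manual  = operational_risk(posture_load=0.8, environment_risk=0.6, exposure_hours=4.0)
robotic = operational_risk(posture_load=0.3, environment_risk=0.6, exposure_hours=1.5)

improvement = (manual - robotic) / manual
print(f"safety improvement ratio ≈ {improvement:.0%}")   # ≈ 86% with these numbers
```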

13.
We introduce the Self-Adaptive Goal Generation Robust Intelligent Adaptive Curiosity (SAGG-RIAC) architecture as an intrinsically motivated goal exploration mechanism which allows active learning of inverse models in high-dimensional redundant robots. This allows a robot to efficiently and actively learn distributions of parameterized motor skills/policies that solve a corresponding distribution of parameterized tasks/goals. The architecture makes the robot actively sample novel parameterized tasks in the task space, based on a measure of competence progress; each sampled task triggers low-level goal-directed learning of the motor policy parameters that solve it. For both learning and generalization, the system leverages regression techniques to infer the motor policy parameters corresponding to a given novel parameterized task, based on the previously learnt correspondences between policy and task parameters. We present experiments with high-dimensional continuous sensorimotor spaces in three different robotic setups: (1) learning the inverse kinematics in a highly redundant robotic arm, (2) learning omnidirectional locomotion with motor primitives in a quadruped robot, and (3) an arm learning to control a fishing rod with a flexible wire. We show that (1) exploration in the task space can be much faster than exploration in the actuator space for learning inverse models in redundant robots; (2) selecting goals maximizing competence progress creates developmental trajectories driving the robot to progressively focus on tasks of increasing complexity, and is statistically significantly more efficient than selecting tasks randomly, as well as more efficient than different standard active motor babbling methods; (3) this architecture allows the robot to actively discover which parts of its task space it can learn to reach and which parts it cannot.
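Competence-progress-based goal sampling, the core heuristic named above, can be sketched as tracking competence per task-space region and sampling regions in proportion to their recent progress; the region count, window size and error measure below are assumptions rather than the SAGG-RIAC implementation.

```python
# Hedged sketch of competence-progress-based goal sampling: the task space is
# split into regions, each region keeps a history of competence (negative
# reaching error), and regions are sampled in proportion to recent progress.
import numpy as np

n_regions, window = 10, 5
history = [[] for _ in range(n_regions)]     # competence history per region

def record(region, reaching_error):
    history[region].append(-reaching_error)  # competence = -error

def competence_progress(region):
    h = history[region]
    if len(h) < 2 * window:
        return 1.0                           # barely explored regions stay attractive
    return abs(np.mean(h[-window:]) - np.mean(h[-2 * window:-window]))

def sample_goal_region():
    p = np.array([competence_progress(r) for r in range(n_regions)]) + 1e-6
    return int(np.random.choice(n_regions, p=p / p.sum()))

region = sample_goal_region()                # pick where learning currently pays off
record(region, reaching_error=np.random.rand())
```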

14.
The purpose of this research is to identify the causal attributions of business computing students in an introductory computer programming course, in the computer science department at Notre Dame University, Louaize. Forty-five male and female undergraduates who completed the computer programming course that extended for a 13-week semester participated. Narrative interviews were conducted to obtain their perceptions. While some research confirmed that the four most responsible causes for success and failure in achievement contexts are ability, effort, task difficulty, and luck, this research shows that in its context ‘ability’ and ‘luck’ were absent, and ‘task difficulty’ and ‘effort’ were almost absent. In all, participants made 10 causal attributions that were either cultural or specific to computer programming. The 10 causal attributions are ‘learning strategy’, ‘lack of study’, ‘lack of practice’, ‘subject difficulty’, ‘lack of effort’, ‘appropriate teaching method’, ‘exam anxiety’, ‘cheating’, ‘lack of time’, and ‘unfair treatment’. All high achievers cited appropriate ‘learning strategy’.

15.
Computer programming skills constitute one of the core competencies that graduates from many disciplines, such as engineering and computer science, are expected to possess. Developing good programming skills typically requires students to do a lot of practice, which cannot be sustained unless they are adequately motivated. This paper reports a preliminary study that investigates the key motivating factors affecting learning among university undergraduate students taking computer programming courses. These courses are supported by an e-learning system – the Programming Assignment aSsessment System (PASS) – which aims at providing an infrastructure and facilitation to students learning computer programming. A research model is adopted linking various motivating factors and self-efficacy, as well as the effect due to the e-learning system. Some factors are found to be notably more motivating, namely ‘individual attitude and expectation’, ‘clear direction’, and ‘reward and recognition’. The results also suggest that a well-facilitated e-learning setting can enhance learning motivation and self-efficacy.

16.
Reinforcement learning (RL) for robot control is an important technology for future robots, since it enables us to design a robot’s behavior using the reward function. However, RL for high degree-of-freedom robot control is still an open issue. This paper proposes a discrete action space, DCOB, which is generated from the basis functions (BFs) given to approximate a value function. The remarkable feature is that, by reducing the number of BFs to enable the robot to learn the value function quickly, the size of DCOB is also reduced, which improves the learning speed. In addition, a method called WF-DCOB is proposed to enhance the performance, where wire-fitting is utilized to search for continuous actions around each discrete action of DCOB. We apply the proposed methods to motion learning tasks of a simulated humanoid robot and a real spider robot. The experimental results demonstrate outstanding performance.

17.
In this paper a new cooperative collision-avoidance method for multiple nonholonomic robots based on Bernstein–Bézier curves is presented. The main contribution focuses on optimal, cooperative collision avoidance for a multi-robot system in which the velocities and accelerations of the mobile robots are constrained and the start and goal velocities are defined for each robot. The optimal path of each robot, from the start pose to the goal pose, is obtained by minimizing a penalty function, which takes into account the sum of all the path lengths subject to the distances between the robots, which should be larger than the minimum distance defined as the safety distance, and subject to the velocities and accelerations, which should be lower than the maximum allowed for each robot. Model-predictive trajectory tracking is used to drive the robots along the obtained reference paths. The results of the path planning, real experiments and some ideas for future work are discussed.
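As a hedged sketch of the cost being minimized, the code below evaluates cubic Bernstein–Bézier paths for two robots and combines total path length with a penalty that fires when the inter-robot distance falls below a safety distance; the velocity/acceleration constraints and the optimizer itself are omitted, and the control points are invented.

```python
# Hedged sketch: evaluate cubic Bernstein-Bezier paths for two robots and score
# them with penalty = total path length + a term that fires when the robots get
# closer than a safety distance. Control points and weights are invented.
import numpy as np

def bezier(ctrl, t):
    """Cubic Bezier point at parameter t, ctrl shape (4, 2)."""
    b = np.array([(1 - t) ** 3, 3 * t * (1 - t) ** 2, 3 * t ** 2 * (1 - t), t ** 3])
    return b @ ctrl

def penalty(ctrl_a, ctrl_b, d_safe=0.5, weight=100.0, n=50):
    ts = np.linspace(0.0, 1.0, n)
    pa = np.array([bezier(ctrl_a, t) for t in ts])
    pb = np.array([bezier(ctrl_b, t) for t in ts])
    length = np.linalg.norm(np.diff(pa, axis=0), axis=1).sum() + \
             np.linalg.norm(np.diff(pb, axis=0), axis=1).sum()
    gap = np.linalg.norm(pa - pb, axis=1)
    violation = np.clip(d_safe - gap, 0.0, None).sum()  # how far the safety distance is broken
    return length + weight * violation

ctrl_a = np.array([[0, 0], [1, 0.5], [2, 0.5], [3, 0]], float)
ctrl_b = np.array([[0, 1], [1, 0.6], [2, 0.6], [3, 1]], float)
print(penalty(ctrl_a, ctrl_b))
```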

18.
Advanced Robotics, 2013, 27(10): 1215-1229
Reinforcement learning is a scheme for unsupervised learning in which robots are expected to acquire behavior skills through self-exploration based on reward signals. There are, however, some difficulties in applying conventional reinforcement learning algorithms to motion control tasks of a robot, because most algorithms are concerned with discrete state spaces and are based on the assumption of complete observability of the state. Real-world environments often have partial observability; therefore, robots have to estimate the unobservable hidden states. This paper proposes a method to solve these two problems by combining the reinforcement learning algorithm with a learning algorithm for a continuous-time recurrent neural network (CTRNN). The CTRNN can learn spatio-temporal structures in a continuous time and space domain, and can preserve the contextual flow by self-organizing an appropriate internal memory structure. This enables the robot to deal with the hidden-state problem. We carried out an experiment on the pendulum swing-up task without rotational speed information. As a result, this task is accomplished in several hundred trials using the proposed algorithm. In addition, it is shown that the information about the rotational speed of the pendulum, which is considered a hidden state, is estimated and encoded in the activation of a context neuron.
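The CTRNN part can be illustrated with the standard discretized update, an Euler step of τ du/dt = −u + W a + I with a = tanh(u); the network size, time constants and weights below are placeholders, not the paper's architecture.

```python
# Sketch of a standard discretized CTRNN update (Euler step of
# tau * du/dt = -u + W @ a + I, with a = tanh(u)). Network size, time constants
# and weights are placeholders; the paper's specific architecture is not shown.
import numpy as np

rng = np.random.default_rng(0)
n = 8
tau = np.full(n, 2.0)                 # per-neuron time constants
W = rng.normal(scale=0.5, size=(n, n))
u = np.zeros(n)                       # internal states

def ctrnn_step(u, external_input, dt=0.1):
    a = np.tanh(u)                    # firing rates
    du = (-u + W @ a + external_input) / tau
    return u + dt * du                # context neurons carry memory across steps

x = np.zeros(n); x[0] = 1.0           # e.g. observed pendulum angle on one input
for _ in range(100):
    u = ctrnn_step(u, x)
```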

19.
A Survey of Deep Learning Applications in Intelligent Robots
Long Hui, Zhu Dingju, Tian Juan. Computer Science (《计算机科学》), 2018, 45(Z11): 43-47, 52
The trend in robotics is towards artificial intelligence; deep learning is a frontier technology for intelligent robots and a new topic in machine learning. Deep learning has been widely applied in agriculture, industry, the military, aviation and other fields, and its organic combination with robotics makes it possible to design intelligent robots with high working efficiency, real-time performance and accuracy. To strengthen the capabilities of intelligent robots in all respects and make them more intelligent, this paper introduces research projects that relate deep learning to robotics and surveys various applications of deep learning in robots, including indoor and outdoor scene recognition, industrial and domestic service robots, and multi-robot cooperation. Finally, the future development of deep learning in intelligent robots and the opportunities and challenges it may face are discussed.

20.
To let a mobile robot complete obstacle-avoidance tasks efficiently and politely in dense, complex crowd environments, this paper proposes a deep reinforcement learning based obstacle-avoidance algorithm for mobile robots in crowds. First, to address the limited learning capacity of the value-function network in deep reinforcement learning, the value network is improved on the basis of crowd interaction: an angular pedestrian grid extracts the interaction information between pedestrians, and an attention mechanism extracts the temporal features of each individual pedestrian, learning the relative importance of the current state versus historical trajectory states and their joint influence on the robot's avoidance policy, thereby providing prior knowledge for the subsequent multilayer perceptron. Second, the reinforcement learning reward function is designed according to human spatial behavior, and states with excessive changes in the robot's heading are penalized, meeting the requirement of comfortable obstacle avoidance. Finally, simulation experiments verify the feasibility and effectiveness of the proposed algorithm in dense, complex crowd environments.
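A hedged sketch of the angular pedestrian grid idea: pedestrians are binned into angle sectors around the robot and each sector stores the distance to its nearest pedestrian; the sector count and the stored feature are assumptions.

```python
# Hedged sketch of an angular pedestrian grid: pedestrians are binned into
# angle sectors around the robot and each sector stores the distance to its
# nearest pedestrian. Sector count and the stored feature are assumptions.
import numpy as np

def angular_grid(robot_xy, pedestrians_xy, n_sectors=12, max_range=5.0):
    grid = np.full(n_sectors, max_range)           # "empty" sectors read as far away
    rel = np.asarray(pedestrians_xy) - np.asarray(robot_xy)
    for dx, dy in rel:
        dist = np.hypot(dx, dy)
        sector = int((np.arctan2(dy, dx) % (2 * np.pi)) / (2 * np.pi) * n_sectors)
        grid[sector] = min(grid[sector], dist)
    return grid                                    # one feature per angle sector

print(angular_grid((0.0, 0.0), [(1.0, 0.2), (-0.5, 1.5), (0.1, -2.0)]))
```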
