Similar Documents
20 similar documents found.
1.
We propose an integrated technique of genetic programming (GP) and reinforcement learning (RL) to enable a real robot to adapt its actions to a real environment. Our technique does not require a precise simulator because learning is achieved through the real robot. In addition, our technique makes it possible for real robots to learn effective actions. Based on this proposed technique, we acquire common programs, using GP, which are applicable to various types of robots. Through this acquired program, we execute RL in a real robot. With our method, the robot can adapt to its own operational characteristics and learn effective actions. In this paper, we show experimental results from two different robots: a four-legged robot "AIBO" and a humanoid robot "HOAP-1." We present results showing that both effectively solved the box-moving task; the end result demonstrates that our proposed technique performs better than the traditional Q-learning method.
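The tabular Q-learning baseline named above can be sketched in a few lines; the corridor task, reward, and hyper-parameters below are illustrative assumptions, not the paper's actual box-moving setup.

```python
import random

def q_learning(n_states, n_actions, step, episodes=500,
               alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning; `step(s, a) -> (s2, reward, done)` is the environment."""
    rng = random.Random(seed)
    Q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(episodes):
        s, done = 0, False
        while not done:
            if rng.random() < epsilon:                    # explore
                a = rng.randrange(n_actions)
            else:                                         # exploit
                a = max(range(n_actions), key=lambda x: Q[s][x])
            s2, r, done = step(s, a)
            target = r if done else r + gamma * max(Q[s2])
            Q[s][a] += alpha * (target - Q[s][a])         # TD update
            s = s2
    return Q

# Toy corridor: states 0..4; action 1 moves right, action 0 moves left;
# reaching state 4 yields reward 1 and ends the episode.
def corridor(s, a):
    s2 = min(4, s + 1) if a == 1 else max(0, s - 1)
    return s2, (1.0 if s2 == 4 else 0.0), s2 == 4

Q = q_learning(5, 2, corridor)
```

After training, the greedy policy derived from `Q` moves right in every non-terminal state.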

2.
Scaffolding is a process of transferring learned skills to new and more complex tasks through arranged experience in open-ended development. In this paper, we propose a developmental learning architecture that enables a robot to transfer skills acquired in early learning settings to later, more complex task settings. We show that a basic mechanism enabling this transfer is sequential priming combined with attention, which is also the driving mechanism for classical conditioning, secondary conditioning, and instrumental conditioning in animal learning. A major challenge of this work is that training and testing must be conducted in the same program operational mode through online, real-time interactions between the agent and the trainers. In contrast with earlier modeling studies, the proposed architecture does not require the programmer to know the tasks to be learned, and the environment is uncontrolled. None of the possible perceptions and actions, including the actual number of classes, are available until programming is finished and the robot starts to learn in the real world. Thus, a predesigned task-specific symbolic representation is not suited to such an open-ended developmental process. Experimental results on a robot are reported in which the trainer shaped the behaviors of the agent interactively, continuously, and incrementally through verbal commands and other sensory signals, so that the robot learned new and more complex sensorimotor tasks by transferring sensorimotor skills acquired in earlier periods of open-ended development.

3.
From infancy to adulthood, each individual undergoes changes, both physical and mental, through interaction with environments. These cognitive developments are usually staged, exhibited as behaviour changes, and supported by neural growth and shrinkage in the brain. The ultimate goal for an intelligent artificial system is to build its skills automatically, in a similar way to the mental development of humans and animals, and to adapt to different environments. In this paper, we present an approach to constructing robot coordination skills by developmental learning, inspired by developmental psychology and neuroscience. In particular, we investigated the learning of two types of robot coordination skills: intra-modality mapping, such as sensorimotor coordination of the arm, and inter-modality mapping, such as eye/hand coordination. A self-organising radial basis function (RBF) network is used as the substrate to support the learning process. The RBF network grows or shrinks according to the novelty of the data obtained during the robot's interaction with the environment, and its parameters are adjusted by a simplified extended Kalman filter algorithm. The paper further discusses possible biological evidence supporting the system architecture, the learning algorithm, and the adaptation of the system to bodily changes such as in tool use. The learning error is treated as intrinsic motivation, driving the system to actively explore the workspace and reduce its errors. We tested our algorithms on a laboratory robot system with two industrial-quality manipulator arms and a motorised pan/tilt head carrying a colour CCD camera. The experimental results demonstrate that the system can develop its intra-modal and inter-modal coordination skills by constructing mapping networks, in a similar way to humans and animals during early cognitive development.
To adapt to different tool sizes, the system can quickly reuse and adjust its learned knowledge in terms of the number of neurons, the size of each neuron's receptive field, and the contribution of each neuron in the network. The active learning approach greatly reduces the nonuniform distribution of the learning errors.
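A minimal sketch of such a growing RBF substrate follows; the Gaussian width, novelty threshold, and the plain LMS weight update (standing in for the paper's simplified extended Kalman filter) are simplifying assumptions.

```python
import math

class GrowingRBF:
    """Minimal growing RBF map: inserts a unit when input novelty is high,
    otherwise nudges the nearest unit's output weight (LMS instead of the
    paper's simplified extended Kalman filter)."""
    def __init__(self, width=0.5, novelty=0.4, lr=0.3):
        self.centers, self.weights = [], []
        self.width, self.novelty, self.lr = width, novelty, lr

    def _phi(self, c, x):
        d2 = sum((a - b) ** 2 for a, b in zip(c, x))
        return math.exp(-d2 / (2 * self.width ** 2))

    def predict(self, x):
        if not self.centers:
            return 0.0
        acts = [self._phi(c, x) for c in self.centers]
        return sum(a * w for a, w in zip(acts, self.weights)) / sum(acts)

    def observe(self, x, y):
        dists = [math.dist(c, x) for c in self.centers]
        if not dists or min(dists) > self.novelty:   # novel input: grow
            self.centers.append(tuple(x))
            self.weights.append(y)
        else:                                        # familiar: adapt
            i = dists.index(min(dists))
            self.weights[i] += self.lr * (y - self.predict(x))

net = GrowingRBF()
for x in [0.0, 1.0, 2.0]:       # three distinct inputs -> three units grown
    for _ in range(20):
        net.observe((x,), 2 * x)     # learn the mapping y = 2x
```

A real system would also shrink the network by pruning units whose contribution stays negligible; that step is omitted here for brevity.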

4.
In this paper, we show that through self-interaction and self-observation, an anthropomorphic robot equipped with a range camera can learn object affordances and use this knowledge for planning. In the first step of learning, the robot discovers commonalities in its action-effect experiences by discovering effect categories. Once the effect categories are discovered, in the second step, affordance predictors for each behavior are obtained by learning the mapping from object features to effect categories. After learning, the robot can make plans to achieve desired goals, emulate end states of demonstrated actions, monitor plan execution, and take corrective actions using the perceptual structures employed or discovered during learning. We argue that the proposed learning system shares crucial elements with the development of infants of 7–10 months of age, who explore the environment and learn the dynamics of objects through goal-free exploration. In addition, we discuss goal emulation and planning in relation to older infants with no symbolic inference capability and to non-linguistic animals, which utilize object affordances to make action plans.
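The two learning steps can be sketched abstractly: greedy clustering discovers effect categories, and a nearest-neighbour predictor maps object features to them. The push-behaviour features and effects below are invented for illustration.

```python
import math

def discover_effect_categories(effects, radius=0.5):
    """Greedy clustering of effect vectors: an effect joins the first
    category whose prototype lies within `radius`, else founds a new one."""
    prototypes, labels = [], []
    for e in effects:
        for i, p in enumerate(prototypes):
            if math.dist(p, e) <= radius:
                labels.append(i)
                break
        else:
            prototypes.append(e)
            labels.append(len(prototypes) - 1)
    return prototypes, labels

def train_affordance_predictor(features, labels):
    """1-nearest-neighbour map from object features to effect category."""
    def predict(x):
        i = min(range(len(features)), key=lambda j: math.dist(features[j], x))
        return labels[i]
    return predict

# Hypothetical push-behaviour data: feature = (height, roundness),
# effect = (displacement, toppled?).  Round objects roll; tall boxes topple.
features = [(0.1, 0.9), (0.12, 0.95), (0.6, 0.1), (0.55, 0.05)]
effects  = [(1.0, 0.0), (1.1, 0.0),  (0.1, 1.0), (0.15, 1.0)]
protos, labels = discover_effect_categories(effects)
predict = train_affordance_predictor(features, labels)
```

Given a new object's features, `predict` returns the expected effect category, which is the ingredient a planner chains together to reach a desired goal state.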

5.
Affordances encode relationships between actions, objects, and effects. They play an important role in basic cognitive capabilities such as prediction and planning. We address the problem of learning affordances through the interaction of a robot with the environment, a key step in understanding world properties and developing social skills. We present a general model for learning object affordances using Bayesian networks, integrated within a general developmental architecture for social robots. Since learning is based on a probabilistic model, the approach is able to deal with uncertainty, redundancy, and irrelevant information. We demonstrate successful learning in the real world by having a humanoid robot interact with objects, and we illustrate the benefits of the acquired knowledge in imitation games.
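A single smoothed conditional table conveys the probabilistic flavour of the approach; it is a drastic simplification of a full Bayesian network, and the actions, objects, and effects below are hypothetical.

```python
from collections import defaultdict

class AffordanceCPT:
    """Discrete conditional model P(effect | action, object), estimated from
    interaction counts with Laplace smoothing -- one conditional table rather
    than the paper's full Bayesian network."""
    def __init__(self, effect_values):
        self.effect_values = effect_values
        self.counts = defaultdict(lambda: defaultdict(int))

    def observe(self, action, obj, effect):
        self.counts[(action, obj)][effect] += 1

    def p_effect(self, action, obj, effect):
        c = self.counts[(action, obj)]
        total = sum(c.values()) + len(self.effect_values)
        return (c[effect] + 1) / total   # Laplace-smoothed estimate

# Hypothetical interaction log: tapping a ball usually moves it,
# tapping a box usually does not.
model = AffordanceCPT(["moved", "stayed"])
for _ in range(9):
    model.observe("tap", "ball", "moved")
model.observe("tap", "ball", "stayed")
for _ in range(10):
    model.observe("tap", "box", "stayed")
```

Because the estimates are probabilities rather than hard rules, a noisy or contradictory observation only shifts the distribution slightly instead of breaking the model.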

6.
In this paper, we describe the development of a mobile robot that performs unsupervised learning to recognize an environment from action sequences. We call this novel recognition approach action-based environment modeling (AEM). Most studies on environment recognition have tried to build precise geometric maps using highly sensitive, global sensors. However, such precise, global information may be hard to obtain in a real environment, and may be unnecessary for recognizing it. Furthermore, unsupervised learning is necessary for recognition in an unknown environment without the help of a teacher. We therefore built a mobile robot that learns, without supervision, to recognize environments using low-sensitivity, local sensors. The robot is behavior-based and performs wall-following in enclosures (called rooms). The sequences of actions executed in each room are transformed into environment vectors for self-organizing maps. Learning proceeds without a teacher, and the robot becomes able to identify rooms. Moreover, we developed a method to identify environments independently of the start point, using partial sequences. We fully implemented the system on a real mobile robot and conducted experiments to evaluate its ability. The results show that environments were recognized well and that our method is robust to noisy environments.

7.
Intrinsic Motivation Systems for Autonomous Mental Development
Exploratory activities seem to be intrinsically rewarding for children and crucial for their cognitive development. Can a machine be endowed with such an intrinsic motivation system? This is the question we study in this paper, presenting a number of computational systems that try to capture this drive towards novel or curious situations. After discussing related research coming from developmental psychology, neuroscience, developmental robotics, and active learning, this paper presents the mechanism of Intelligent Adaptive Curiosity, an intrinsic motivation system which pushes a robot towards situations in which it maximizes its learning progress. This drive makes the robot focus on situations which are neither too predictable nor too unpredictable, thus permitting autonomous mental development. The complexity of the robot's activities autonomously increases and complex developmental sequences self-organize without being constructed in a supervised manner. Two experiments are presented illustrating the stage-like organization emerging with this mechanism. In one of them, a physical robot is placed on a baby play mat with objects that it can learn to manipulate. Experimental results show that the robot first spends time in situations which are easy to learn, then shifts its attention progressively to situations of increasing difficulty, avoiding situations in which nothing can be learned. Finally, these various results are discussed in relation to more complex forms of behavioral organization and data coming from developmental psychology.
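The core of the mechanism, preferring activities whose prediction error is falling fastest, can be sketched without any robot; the three toy "activities" below stand in for situations that are trivial, unlearnable, and learnable, and the windowed-difference estimate of learning progress is a simplifying assumption.

```python
class LearningProgressSelector:
    """Pick the activity whose prediction error is decreasing fastest
    (positive learning progress), so activities already mastered (error
    flat at zero) or unlearnable (error flat but high) are both avoided."""
    def __init__(self, n_activities, window=5):
        self.errors = [[] for _ in range(n_activities)]
        self.window = window

    def record(self, activity, error):
        self.errors[activity].append(error)

    def progress(self, activity):
        e = self.errors[activity]
        if len(e) < 2 * self.window:
            return float("inf")          # too little data: stay curious
        recent = sum(e[-self.window:]) / self.window
        older = sum(e[-2 * self.window:-self.window]) / self.window
        return older - recent            # error decrease = learning progress

    def choose(self):
        return max(range(len(self.errors)), key=self.progress)

sel = LearningProgressSelector(3)
for t in range(20):
    sel.record(0, 0.0)            # trivial: nothing left to learn
    sel.record(1, 1.0)            # unlearnable: error stays high
    sel.record(2, 1.0 / (t + 1))  # learnable: error shrinks steadily
```

Only the activity with shrinking error shows positive progress, so the selector concentrates practice there, mirroring the "neither too predictable nor too unpredictable" behaviour described above.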

8.
Advanced Robotics, 2013, 27(10): 1183–1199
Robots have to deal with an enormous amount of sensory stimuli. One way of making sense of them is to enable a robot system to actively search for cues that help structure the information. Studies with infants reveal that parents support the learning process by modifying their interaction style according to their child's developmental age. In our study, in which parents demonstrated everyday actions to their preverbal children (8–11 months old), our aim was to identify objective parameters of multimodal action modification. Our results reveal two action parameters that are modified in adult–child interaction: roundness and pace. Furthermore, we found that language has the power to help children structure action sequences through synchrony and emphasis. These insights are discussed with respect to the built-in attention architecture of a socially interactive robot, which enables it to understand demonstrated actions. Our algorithmic approach to automatically detecting task structure in child-directed input demonstrates the potential impact of insights from developmental learning on robotics. The presented findings pave the way to automatically detecting when to imitate in a demonstration task.

9.
Research and Implementation of a Dynamic-Neural-Network Navigation Algorithm for Robots
For the Pioneer3-DX mobile robot, an autonomous navigation strategy based on reinforcement learning is proposed, and a navigation algorithm based on a dynamic neural network is designed. The dynamic neural network adjusts its structure automatically according to the complexity of the robot's environment state and realizes the mapping from the robot's state to its navigation actions in real time, effectively alleviating the dimension-explosion problem of the state-variable table in reinforcement learning. Simulation and physical experiments on Pioneer3-DX navigation demonstrate the effectiveness of the method, whose navigation performance is clearly superior to that of the artificial potential field approach.

10.
As technology advances, robots have gradually replaced humans in many areas, and enabling a robot to handle multiple situations and perform different actions depending on the situation has become a critical topic. Training a robot to perform a single designated action is currently considered easy. However, when a robot is required to act in different environments, both resetting and retraining are required, which is time-consuming and inefficient. Allowing robots to autonomously identify their environment can therefore significantly reduce the time consumed, and employing machine learning algorithms to achieve such autonomous robot learning has become a research trend. In this study, a proximal policy optimization algorithm was used to allow a robot to train itself and select an optimal gait pattern to reach its destination. Multiple basic gait patterns were selected, and information-maximizing generative adversarial nets were used to generate further gait patterns, allowing the robot to choose among numerous gaits while walking. The experimental results indicate that, after self-learning, the robot successfully made different choices depending on the situation, verifying the feasibility of this approach.

11.
Direct word discovery from audio speech signals is a very difficult and challenging problem for a developmental robot. Human infants are able to discover words directly from speech signals, and, to understand this developmental capability through a constructive approach, it is very important to build a machine learning system that can acquire knowledge about words and phonemes, i.e. a language model and an acoustic model, autonomously in an unsupervised manner. To achieve this, the nonparametric Bayesian double articulation analyzer (NPB-DAA) combined with a deep sparse autoencoder (DSAE) is proposed in this paper. The NPB-DAA was previously proposed for fully unsupervised direct word discovery from speech signals; however, its performance was still unsatisfactory, although it outperformed pre-existing unsupervised learning methods. In this paper, we integrate the NPB-DAA with the DSAE, a neural network model that can be trained in an unsupervised manner, and demonstrate its performance in an experiment on direct word discovery from auditory speech signals. The experiment shows that the combined method outperforms pre-existing unsupervised learning methods and achieves state-of-the-art performance. The proposed method is also shown to outperform several standard speech-recognizer-based methods supplied with true word dictionaries.

12.
In this paper, we address the autonomous control of a 3D snake-like robot through reinforcement learning in a dynamic environment. In general, snake-like robots have high mobility, realized by many degrees of freedom, and can move over dynamically shifting environments such as rubble. However, this freedom and flexibility lead to a state-explosion problem, and the complexity of the dynamic environment leads to incomplete learning by the robot. To solve these problems, we exploit the properties of the actual operating environment and the dynamics of the mechanical body. We design the body of the robot so that these properties yield a small but sufficient abstract state-action space, making reinforcement learning applicable. To demonstrate the effectiveness of the proposed snake-like robot, we conducted experiments; the results show that learning completes within a reasonable time and that effective behaviors, which allow the robot to adapt itself to an unknown 3D dynamic environment, are realized.

13.
This paper addresses a new method for combining supervised learning and reinforcement learning (RL). Applying supervised learning to robot navigation encounters serious challenges such as inconsistent and noisy data, difficulty in gathering training data, and high error in the training data. RL capabilities, such as training from a single scalar evaluation signal and a high degree of exploration, have encouraged researchers to use RL for robot navigation. However, RL algorithms are time-consuming and suffer from a high failure rate in the training phase. Here, we propose Supervised Fuzzy Sarsa Learning (SFSL), a novel approach that exploits the advantages of both supervised and reinforcement learning. A zero-order Takagi–Sugeno fuzzy controller with several candidate actions per rule is the main module of the robot's controller, and the aim of training is to find the best action for each fuzzy rule. In the first step, a human supervisor drives an E-puck robot within the environment and training data are gathered. In the second step, as hard tuning, the training data are used to initialize the value (worth) of each candidate action in the fuzzy rules. Afterwards, the fuzzy Sarsa learning module, a critic-only fuzzy reinforcement learner, fine-tunes the parameters of the conclusion parts of the fuzzy controller online. The proposed algorithm is used to drive the E-puck robot in environments with obstacles. The experimental results show that the proposed approach reduces learning time and the number of failures, and improves the quality of the robot's motion in the testing environments.
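The two-step scheme, supervised "hard tuning" of action values followed by online Sarsa, can be sketched in tabular rather than fuzzy form; the corridor task and the demonstration data are invented for illustration.

```python
import random

def greedy_eps(Q, s, epsilon, rng):
    """Epsilon-greedy action selection over a tabular Q."""
    if rng.random() < epsilon:
        return rng.randrange(len(Q[s]))
    return max(range(len(Q[s])), key=lambda x: Q[s][x])

def sarsa(Q, step, reset, episodes=200, alpha=0.1, gamma=0.9,
          epsilon=0.1, seed=0):
    """On-policy Sarsa fine-tuning of an action-value table Q[s][a];
    `step(s, a) -> (s2, reward, done)` is the environment."""
    rng = random.Random(seed)
    for _ in range(episodes):
        s, done = reset(), False
        a = greedy_eps(Q, s, epsilon, rng)
        while not done:
            s2, r, done = step(s, a)
            a2 = greedy_eps(Q, s2, epsilon, rng)
            target = r if done else r + gamma * Q[s2][a2]
            Q[s][a] += alpha * (target - Q[s][a])
            s, a = s2, a2
    return Q

# "Hard tuning": seed the table from demonstrated state-action pairs,
# giving demonstrated actions a head start before Sarsa refines them.
demos = [(0, 1), (1, 1), (2, 1)]
Q = [[0.0, 0.0] for _ in range(4)]
for s, a in demos:
    Q[s][a] = 0.5

def corridor(s, a):    # toy task: reach state 3 for reward 1
    s2 = min(3, s + 1) if a == 1 else max(0, s - 1)
    return s2, (1.0 if s2 == 3 else 0.0), s2 == 3

sarsa(Q, corridor, reset=lambda: 0)
```

Because demonstrated actions start with non-zero value, the very first episodes already follow a sensible policy, which is what cuts the failure rate during training.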

14.
This paper reports that the superposition of a small set of behaviors, learned via teleoperation, can lead to robust completion of an articulated reach-and-grasp task. The results support the hypothesis that a robot can learn to interact purposefully with its environment through a developmental acquisition of sensory-motor coordination. Teleoperation can bootstrap the process by enabling the robot to observe its own sensory responses to actions that lead to specific outcomes within an environment. It is shown that a reach-and-grasp task, learned by an articulated robot through a small number of teleoperated trials, can be performed autonomously with success in the face of significant variations in the environment and perturbations of the goal. In particular, teleoperation of the robot to reach and grasp an object at nine different locations in its workspace enabled robust autonomous performance of the task anywhere within the workspace. Superpositioning was performed using the Verbs and Adverbs algorithm that was developed originally for the graphical animation of articulated characters. The work was performed on Robonaut, the NASA space-capable humanoid at Johnson Space Center, Houston, TX.
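The superposition idea can be illustrated with a simple inverse-distance-weighted blend of stored examples; this is a loose stand-in for the actual Verbs and Adverbs interpolation scheme, and the example arm data are hypothetical.

```python
import math

def blend(examples, goal, eps=1e-9):
    """Inverse-distance-weighted superposition: each stored
    (goal position -> joint configuration) example contributes in inverse
    proportion to its squared distance from the requested goal."""
    w = [1.0 / (math.dist(g, goal) ** 2 + eps) for g, _ in examples]
    total = sum(w)
    dim = len(examples[0][1])
    return [sum(wi * q[d] for wi, (_, q) in zip(w, examples)) / total
            for d in range(dim)]

# Hypothetical teleoperated examples: workspace position -> two joint
# angles, for a planar arm whose angles vary smoothly with the goal.
examples = [((0.0, 0.0), (0.0, 0.0)),
            ((1.0, 0.0), (1.0, 0.5)),
            ((0.0, 1.0), (0.0, 1.0)),
            ((1.0, 1.0), (1.0, 1.5))]
q = blend(examples, (1.0, 0.0))
```

At a demonstrated goal the blend reproduces that example's configuration, while goals between examples receive a smooth mixture, which is how a handful of teleoperated trials can cover the whole workspace.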

15.
The area of competitive robotic systems usually leads to highly complicated strategies that must be achieved by complex learning architectures, since analytic solutions are impractical or completely unfeasible. In this work we design an experiment to study and validate a model of the complex phenomenon of adaptation. In particular, we study a reinforcement learning problem comprising a complex predator–protector–prey system composed of three different robots: a purely bio-mimetic, reactive (in Brooks's sense, i.e. without reasoning and representation) predator-like robot, a protector-like robot with reinforcement learning capabilities, and a purely bio-mimetic, reactive prey-like robot. From a high-level point of view, we are interested in studying whether the Law of Adaptation suffices to model and explain the whole learning process occurring in this multi-robot system. From a low-level point of view, our interest is in designing a learning system capable of solving such a complex competitive predator–protector–prey system optimally. We show how this learning problem can be addressed and solved effectively by a reinforcement learning setup that uses abstract actions to select a goal or target towards which a purely bio-mimetic, reactive robot must navigate. The experimental results clearly show that the Law of Adaptation fits this complex learning system and that the proposed reinforcement learning setup is able to find an optimal policy for controlling the defender robot in its role of protecting the prey against the predator robot.

16.
The aim of the project described in this paper was to investigate robot learning at the most fundamental level. The project focused on the transition between organisms with innate behaviors and organisms with the most rudimentary capability of learning through personal interaction with their environment. It was assumed that the innate behaviors gave basic survival competence but no learning ability. It was reasoned that, by observing the interaction between its innate behaviors and its environment, the organism should be able to learn how to modify its actions in a way that improves its performance. If a learning system is given more information than it requires then, when it is successful, it is difficult to say which pieces of information contribute to the success. For this reason, the information available to the learning system was kept to an absolute minimum. To provide a practical test of the learning scheme developed in this project, the robot environment EDEN was constructed. Within EDEN, a robot's actions influence its internal energy reserves: the environment incorporates sources of energy, as well as situations that use additional energy or reduce energy consumption. A successful learning scheme was developed based purely on the recorded history of the robot's interactions with its environment and the knowledge that the robot's innate behavior was reactive. This learning scheme allowed the robot to improve its energy management by exhibiting classical conditioning and a restricted form of operant conditioning.

17.
Humans can learn a language through physical interaction with their environment and semiotic communication with other people. It is very important to obtain a computational understanding of how humans form symbol systems and obtain semiotic skills through autonomous mental development. Recently, many studies have been conducted on the construction of robotic systems and machine learning methods that can learn a language through embodied multimodal interaction with their environment and other systems. Understanding human social interactions, and developing a robot that can smoothly communicate with human users over the long term, require an understanding of the dynamics of symbol systems: the embodied cognition and social interaction of participants gradually alter a symbol system in a constructive manner. In this paper, we introduce a field of research called symbol emergence in robotics (SER). SER represents a constructive approach towards a symbol emergence system, which is socially self-organized through both semiotic communication and physical interaction among autonomous cognitive developmental agents, i.e. humans and developmental robots. Specifically, we describe some state-of-the-art research topics in SER, such as multimodal categorization, word discovery, and double articulation analysis. These enable robots to discover words and their embodied meanings from raw sensorimotor information, including visual, haptic, and auditory information and acoustic speech signals, in a totally unsupervised manner. Finally, we suggest future directions for research in SER.

18.
Rapid, safe, and incremental learning of navigation strategies
In this paper we propose a reinforcement connectionist learning architecture that allows an autonomous robot to acquire efficient navigation strategies in a few trials. Besides rapid learning, the architecture has three further appealing features. First, the robot improves its performance incrementally as it interacts with an initially unknown environment, and it ends up learning to avoid collisions even in those situations in which its sensors cannot detect the obstacles. This is a definite advantage over nonlearning reactive robots. Second, since it learns from basic reflexes, the robot is operational from the very beginning and the learning process is safe. Third, the robot exhibits high tolerance to noisy sensory data and good generalization abilities. All these features make this learning robot's architecture very well suited to real-world applications. We report experimental results obtained with a real mobile robot in an indoor environment that demonstrate the appropriateness of our approach to real autonomous robot control.

19.
Among the applications of service robots with the greatest social impact is assistance to elderly or disabled people. In these applications, assistant robots must robustly navigate structured indoor environments such as hospitals, nursing homes, or houses, heading from room to room to carry out different nursing or service tasks. Among the main requirements of these robotic aids, one that will determine their future commercial feasibility is easy installation of the robot in new working domains without long, tedious, or complex configuration steps. This paper describes the navigation system of the assistant robot SIRA, developed in the Electronics Department of the University of Alcalá, focusing on the learning module, which is specially designed to make installation of the robot in new environments easier and faster. To meet robustness and reliability requirements, the navigation system uses probabilistic reasoning (POMDPs) to globally localize the robot and to direct its goal-oriented actions. The proposed learning module quickly learns the Markov model of a new environment through an exploration stage that takes advantage of human–robot interfaces (basically speech) and user–robot cooperation to accelerate model acquisition. The learning method, based on a modification of the EM algorithm, is able to robustly explore new environments with a low number of corridor traversals, as shown in experiments carried out with SIRA.
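Global localization in a POMDP rests on the Bayes belief update: predict through the transition model, then weight by the observation likelihood and renormalize. A minimal sketch follows, with a toy three-room corridor whose transition and observation models are invented.

```python
def belief_update(belief, action, observation, T, O):
    """One Bayes-filter step for POMDP localisation.
    T[a][s][s2] = P(s2 | s, a); O[s2][o] = P(o | s2)."""
    n = len(belief)
    # Prediction: push the belief through the transition model.
    predicted = [sum(belief[s] * T[action][s][s2] for s in range(n))
                 for s2 in range(n)]
    # Correction: weight by the observation likelihood, then renormalise.
    weighted = [predicted[s2] * O[s2][observation] for s2 in range(n)]
    total = sum(weighted)
    return [w / total for w in weighted]

# Toy corridor with 3 rooms; action 0 = "move right" (90% reliable),
# observation 0 = "door seen", emitted mostly in room 2.
T = [[[0.1, 0.9, 0.0],
      [0.0, 0.1, 0.9],
      [0.0, 0.0, 1.0]]]
O = [[0.1, 0.9],   # room 0: door rarely seen
     [0.1, 0.9],   # room 1: door rarely seen
     [0.8, 0.2]]   # room 2: door usually seen
b = [1/3, 1/3, 1/3]                       # start completely uncertain
b = belief_update(b, 0, 0, T, O)          # move right, then see a door
```

Starting from a uniform belief, one move-and-observe step already concentrates most of the probability mass on room 2, which is the essence of global localization.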

20.
In this paper we describe a biologically constrained architecture for developmental learning of eye–head gaze control on an iCub robot. In contrast to other computational implementations, the developmental approach aims to acquire sensorimotor competence through growth processes modelled on data and theory from infant psychology. Constraints help shape learning in infancy by limiting the complexity of interactions between the body and environment, and we use this idea to produce efficient, effective learning in autonomous robots. Our architecture is based on current thinking surrounding the gaze mechanism, and experimentally derived models of stereotypical eye–head gaze contributions. It is built using our proven constraint-based field-mapping approach. We identify stages in the development of infant gaze control, and propose a framework of artificial constraints to shape learning on the robot in a similar manner. We demonstrate the impact these constraints have on learning, and the resulting ability of the robot to make controlled gaze shifts.
