Similar articles
20 similar articles found
1.
This paper proposes two co-adaptation schemes for self-organizing maps that incorporate Kohonen learning into GA evolution in an attempt to find an optimal vector quantization (VQ) codebook for images. The Kohonen learning rule used for vector quantization of images is sensitive to the choice of its initial parameters, and the resulting codebook is not guaranteed to have minimum distortion. To tackle these problems, we co-adapt the codebooks by evolution and learning: evolution performs the global search and makes inter-codebook adjustments by altering the codebook structures, while learning performs the local search and makes intra-codebook adjustments by reducing each codebook's distortion. Two co-adaptation schemes, Lamarckian and Baldwinian, are considered in our work. Simulation results show that evolution guided by local learning converges quickly, that the co-adapted codebook produces better reconstructed image quality than the non-learned equivalent, and that Lamarckian co-adaptation is more appropriate for the VQ problem.
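
The interplay of local learning and the two co-adaptation schemes can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the learning step is a plain winner-take-all Kohonen update without a neighbourhood function, and the Lamarckian/Baldwinian distinction follows the usual reading (the learned codebook is either written back to the individual or used only to compute fitness).

```python
def nearest(codebook, x):
    """Index of the codeword closest to x (squared Euclidean distance)."""
    return min(range(len(codebook)),
               key=lambda i: sum((c - v) ** 2 for c, v in zip(codebook[i], x)))

def distortion(codebook, data):
    """Mean squared distance from each vector to its nearest codeword."""
    total = 0.0
    for x in data:
        w = codebook[nearest(codebook, x)]
        total += sum((c - v) ** 2 for c, v in zip(w, x))
    return total / len(data)

def kohonen_step(codebook, x, lr):
    """Move only the winning codeword toward x (assumed: no neighbourhood)."""
    i = nearest(codebook, x)
    codebook[i] = [c + lr * (v - c) for c, v in zip(codebook[i], x)]
    return i

def learn_and_evaluate(codebook, data, lr=0.1, lamarckian=True):
    """One pass of local learning, then fitness = negative distortion.

    Lamarckian: the learned codebook replaces the individual.
    Baldwinian: the individual is unchanged; learning only affects fitness.
    """
    learned = [list(w) for w in codebook]  # copy so the input is not mutated
    for x in data:
        kohonen_step(learned, x, lr)
    fitness = -distortion(learned, data)
    return fitness, (learned if lamarckian else codebook)
```

A genetic algorithm would call `learn_and_evaluate` on each candidate codebook and apply selection, crossover and mutation to the returned individuals; those operators are omitted here because the abstract does not describe them.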

2.
Ambient systems are populated by many heterogeneous devices to provide adequate services to their users. The adaptation of an ambient system to the specific needs of its users is a challenging task. Because human–system interaction has to be as natural as possible, we propose an approach based on Learning from Demonstration (LfD). LfD is an interesting approach for generalizing what has been observed during a demonstration to similar situations. However, using LfD in ambient systems requires the learning technique to be adaptive. We present ALEX, a multi-agent system able to dynamically learn and reuse contexts from demonstrations performed by a tutor. The results of the experiments performed on both a real and a virtual robot show interesting properties of our technology for ambient applications.

3.
We describe a new preteaching method for reinforcement learning using a self-organizing map (SOM). The purpose is to increase the learning rate using a small amount of teaching data generated by a human expert. In our proposed method, the SOM is used to generate the initial teaching data for the reinforcement learning agent from a small amount of teaching data. The reinforcement learning function of the agent is initialized using the teaching data generated by the SOM in order to increase the probability of selecting the optimal actions it estimates. Because the agent can obtain high rewards from the start of reinforcement learning, the learning rate is expected to increase. The results of a mobile robot simulation showed that the learning rate increased even though the human expert had shown only a small amount of teaching data. This work was presented in part at the 7th International Symposium on Artificial Life and Robotics, Oita, Japan, January 16–18, 2002

4.
Several missions including surveillance, exploration, search-and-track, and lifting of heavy loads are best accomplished by multiple unmanned aerial vehicles (UAVs). Another important advantage to utilizing multiple vehicles is a reduction in the risk to successful completion of a mission due to the loss of a single vehicle. This increased robustness can lead to a commensurate decrease in vehicle specifications and cost, further improving the argument for swarm operations. This paper describes the development of an adaptive configuration controller for multiple vehicles executing a cooperative task in the presence of parametric uncertainty. A novel adaptive outer-loop controller that uses both local and global information is presented.

5.
In this work a learning algorithm is proposed for the formation of topology preserving maps. In the proposed algorithm the weights are updated incrementally using a higher-order difference equation, which implements a low-pass digital filter. It is shown that by suitably choosing the filter the learning process can adaptively follow a specific dynamic. Numerical results, for time-varying and static distributions, show the potential of the proposed method for unsupervised learning.

6.
This paper studies the behavior of self-organizing teams, which depends on the interaction rules and the decision factors of team members. In a self-organizing team, members work unconditionally with one of three work attitudes (diligence, average, and shirking). A small-world network is proposed as the basic structure of relationships among team members. Unlike in traditional models, Reciprocators encourage their friends if they work diligently and punish them if they shirk. A team member's choice of work attitude is assumed to depend on four decision factors: humanity, herd instinct, rationality, and follower tendency. Initially, the weights of all four decision factors are set to 0.25. Multiple experiments were conducted with a multi-agent experiment system to analyze the behavior of a team. It is found that, in order to increase the fraction of diligent team members, different strategies should be used for different fractions of Reciprocators. Increasing the fraction of Reciprocators increases the number of diligent members; however, the rate of increase slows down beyond an inflection point in the Reciprocators' fraction. Following these experiments, extended experiments were developed to examine the influence of different weights for the four factors. A self-adaptive algorithm is proposed to determine the weights of the four decision factors. The results of the self-adaptive algorithm influence the team's behavior differently under different fractions of Reciprocators. Finally, the influence of different relationships among members is studied in further experiments. It is also shown that the fraction of diligent members does not depend on the structure of the team members' relationships. The results demonstrate that, when managing a self-organizing team, the team's behavior can be significantly influenced by its scenario.

7.
Adaptive immunity based reinforcement learning   (Cited by: 2; self-citations: 2; citations by others: 0)
Recently much attention has been paid to intelligent systems which can adapt themselves to dynamic and/or unknown environments by the use of learning methods. However, traditional learning methods have the disadvantage that the learning time grows enormously with the complexity of the systems and environments considered. We thus propose a novel reinforcement learning method based on adaptive immunity. Our proposed method can provide a near-optimal solution with less learning time by self-learning using the concept of adaptive immunity. The validity of our method is demonstrated through simulations with Sutton’s maze problem. This work was presented in part at the 13th International Symposium on Artificial Life and Robotics, Oita, Japan, January 31–February 2, 2008

8.
Team development and group processes of virtual learning teams   (Cited by: 2; self-citations: 0; citations by others: 2)
This study describes the community building process of virtual learning teams as they form, establish roles and group norms, and address conflict. Students enrolled in an HRD master's program taught entirely online were studied to determine (1) how virtual learning teams develop their group process, and (2) what processes and strategies they use as they work through the stages of group development. Both quantitative and qualitative methods of inquiry were used to capture the dynamic interaction within groups and the underlying factors that guided group process and decision-making. The results show that virtual learning groups can collaborate effectively from a distance to accomplish group tasks. The development of virtual learning teams is closely connected to the timeline for their class projects. Virtual teams are also similar in terms of their task process and the use of communication technologies. In contrast to face-to-face teams, the leadership role of virtual teams is shared among team members. Recommendations are discussed in order to facilitate peak integration of virtual learning teams into Internet-based training courses.

9.
This study proposes an indirect adaptive self-organizing RBF neural control (IASRNC) system which is composed of a feedback controller, a neural identifier and a smooth compensator. The neural identifier, which contains a self-organizing RBF (SORBF) network with structure and parameter learning, is designed to estimate the system dynamics online using the gradient descent method. The SORBF network can add new hidden neurons and prune insignificant hidden neurons online. The smooth compensator is designed to dispel the effect of the minimum approximation error introduced by the neural identifier in the Lyapunov stability theorem. In general, determining the learning rates of the parameter adaptation laws requires trial-and-error tuning. This paper proposes a dynamical learning rate approach based on a discrete-type Lyapunov function to speed up the convergence of the tracking error. Finally, the proposed IASRNC system is applied to control two chaotic systems. Simulation results verify that the proposed IASRNC scheme can achieve a favorable tracking performance.

10.
This paper presents a novel classified self-organizing map method for edge preserving quantization of images using an adaptive subcodebook and weighted learning rate. The subcodebook sizes of two classes are automatically adjusted in training iterations based on modified partial distortions that can be estimated incrementally. The proposed weighted learning rate updates the neuron efficiently no matter how large the weighting factor is. Experimental results show that the new method achieves better quality of reconstructed edge blocks and a more spread-out codebook, and incurs significantly lower computational cost than the competing methods.

11.
In this paper, we first discuss the meaning of physical embodiment and the complexity of the environment in the context of multi-agent learning. We then propose a vision-based reinforcement learning method that acquires cooperative behaviors in a dynamic environment. We use the robot soccer game initiated by RoboCup (Kitano et al., 1997) to illustrate the effectiveness of our method. Each agent works with other team members to achieve a common goal against opponents. Our method estimates the relationships between a learner's behaviors and those of other agents in the environment through interactions (observations and actions) using a technique from system identification. In order to identify the model of each agent, Akaike's Information Criterion is applied to the results of Canonical Variate Analysis to clarify the relationship between the observed data in terms of actions and future observations. Next, reinforcement learning based on the estimated state vectors is performed to obtain the optimal behavior policy. The proposed method is applied to a soccer playing situation. The method successfully models a rolling ball and other moving agents and acquires the learner's behaviors. Computer simulations and real experiments are shown and a discussion is given.

12.
A distributed autonomous robotic system is robust and adaptable to dynamic environments; however, achieving system-level optimality requires the robots to behave cooperatively. Acquiring actions through reinforcement learning is one known approach when multiple robots must cooperate on a complex task. This paper deals with a multi-robot transport problem using the Q-learning algorithm. When a robot carries luggage, we regard it as leaving a trace along its own path; the trace is volatile, and other robots can use the trace information to help the robot carrying the luggage. To solve these problems in multi-agent reinforcement learning, a learning control method using stress antibody allotment reward is used. Moreover, we propose using the robots' trace information to encourage cooperative behavior in carrying luggage to a destination. The effectiveness of the proposed method is shown by simulation. This work was presented in part at the 13th International Symposium on Artificial Life and Robotics, Oita, Japan, January 31–February 2, 2008
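
The two ingredients named in this abstract, the Q-learning update and a volatile trace, can be sketched as below. This is only an illustration under stated assumptions: the update is the standard tabular Q-learning rule, while the trace representation (a dict from grid cell to intensity), the evaporation rate and the cut-off threshold are hypothetical. The stress antibody allotment reward is not modelled.

```python
from collections import defaultdict

def q_update(q, state, action, reward, next_state, actions,
             alpha=0.1, gamma=0.9):
    """Standard tabular Q-learning update; returns the new Q(state, action)."""
    best_next = max(q[(next_state, a)] for a in actions)
    q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
    return q[(state, action)]

def evaporate(trace, rate=0.5, threshold=1e-3):
    """Decay every trace value and drop entries that fall below the threshold.

    `trace` maps a cell to an intensity (an assumed representation of the
    volatile trace left by a robot carrying luggage).
    """
    for cell in list(trace):
        trace[cell] *= (1.0 - rate)
        if trace[cell] < threshold:
            del trace[cell]

# Usage: one learning step and one evaporation step.
q = defaultdict(float)
q_update(q, "s", "a", 1.0, "s2", ["a", "b"], alpha=0.5, gamma=0.9)
trace = {(0, 0): 1.0}
evaporate(trace)
```

In such a scheme a helper robot could include the local trace intensity in its state description, so that following a fresh trace becomes something Q-learning can reward.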

13.
Collaborative filtering has been widely applied in many fields in recent years due to the increase in web-based activities such as e-commerce and online content distribution. Current collaborative filtering techniques such as correlation-based, SVD-based and supervised learning-based approaches provide good accuracy, but are computationally very expensive and can only be deployed in static off-line settings, where the known rating information does not change with time. However, a number of practical scenarios require dynamic adaptive collaborative filtering that can allow new users, items and ratings to enter the system at a rapid rate. In this paper, we consider a novel adaptive personalized recommendation approach based on adaptive learning. Fast adaptive learning runs through all aspects of the proposed approach, including training, prediction and updating. Empirical evaluation of our approach on the MovieLens dataset demonstrates that it is possible to obtain accuracy comparable to that of the correlation-based, SVD-based and supervised learning-based approaches at a much lower computational cost.

14.
Adaptive iterative learning control for robot manipulators   (Cited by: 4; self-citations: 0; citations by others: 4)
In this paper, we propose some adaptive iterative learning control (ILC) schemes for trajectory tracking of rigid robot manipulators, with unknown parameters, performing repetitive tasks. The proposed control schemes are based upon the use of a proportional-derivative (PD) feedback structure, for which an iterative term is added to cope with the unknown parameters and disturbances. The control design is very simple in the sense that the only requirement on the PD and learning gains is the positive definiteness condition and the bounds of the robot parameters are not needed. In contrast to classical ILC schemes where the number of iterative variables is generally equal to the number of control inputs, the second controller proposed in this paper uses just two iterative variables, which is an interesting fact from a practical point of view since it contributes considerably to memory space saving in real-time implementations. We also show that it is possible to use a single iterative variable in the control scheme if some bounds of the system parameters are known. Furthermore, the resetting condition is relaxed to a certain extent for a certain class of reference trajectories. Finally, simulation results are provided to illustrate the effectiveness of the proposed controllers.

15.
In this paper, we apply concept learning techniques to solve a number of problems in the customer relationship management (CRM) domain. We present a concept learning technique to tackle common scenarios of interaction between conflicting human agents (such as customers and customer support representatives). Scenarios are represented by directed graphs with labeled vertices (for communicative actions) and arcs (for temporal and causal relationships between these actions and their parameters). The classification of a scenario is performed by comparing a partial matching of its graph with graphs of positive and negative examples. We illustrate machine learning of graph structures using the Nearest Neighbor approach and then proceed to JSM-based concept learning, which minimizes the number of false negatives and takes advantage of a more accurate way of matching sequences of communicative actions. Scenario representation and comparative analysis techniques developed herein are applied to the classification of textual customer complaints as a CRM component. In order to estimate complaint validity, we take advantage of the observation [19] that analyzing the structure of communicative actions without context information is frequently sufficient to judge whether or not humans explain their behavior in a plausible way. This paper demonstrates the superiority of concept learning in tackling human attitudes. Therefore, because human attitudes are domain-independent, the proposed concept learning approach is a good complement to a wide range of CRM technologies where a formal treatment of inter-human interactions is required.

16.
In multi-agent systems, the study of language and communication is an active field of research. In this paper we present the application of Reinforcement Learning (RL) to the self-emergence of a common lexicon in robot teams. By modeling the vocabulary or lexicon of each agent as an association matrix or look-up table that maps the meanings (i.e. the objects encountered by the robots or the states of the environment itself) into symbols or signals, we check whether it is possible for the robot team to converge in an autonomous, decentralized way to a common lexicon by means of RL, so that the communication efficiency of the entire robot team is optimal. We have conducted several experiments aimed at testing whether it is possible to converge with RL to an optimal Saussurean Communication System. We have organized our experiments along two main lines: first, we have investigated the effect of the team size, focusing on teams of moderate size on the order of 5 and 10 individuals, typical of multi-robot systems. Second, and foremost, we have also investigated the effect of the lexicon size on the convergence results. To analyze the convergence of the robot team, we define the team’s consensus as the state in which all the robots (i.e. 100% of the population) share the same association matrix or lexicon. As a general conclusion we have shown that RL allows convergence to lexicon consensus in a population of autonomous agents.
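
The association-matrix view of a lexicon and the consensus criterion can be illustrated with a short sketch. All names here are hypothetical and the reinforcement rule is a generic incremental update, not the one used in the paper. As a further assumption, consensus is checked on the decoded lexicon (the strongest symbol for each meaning) rather than on exact equality of the matrices.

```python
def best_symbol(matrix, meaning):
    """Symbol with the strongest association for the given meaning (row)."""
    row = matrix[meaning]
    return max(range(len(row)), key=lambda s: row[s])

def reinforce(matrix, meaning, symbol, reward, lr=0.1):
    """Move the association strength toward the received reward."""
    matrix[meaning][symbol] += lr * (reward - matrix[meaning][symbol])

def has_consensus(matrices):
    """True when every robot decodes every meaning to the same symbol."""
    lexicons = [[best_symbol(m, i) for i in range(len(m))] for m in matrices]
    return all(lex == lexicons[0] for lex in lexicons)

# Usage: two robots that agree on both meanings, and one that disagrees.
robot_a = [[0.9, 0.1], [0.2, 0.8]]
robot_b = [[0.6, 0.4], [0.1, 0.7]]
robot_c = [[0.1, 0.9], [0.2, 0.8]]
agree = has_consensus([robot_a, robot_b])
disagree = has_consensus([robot_a, robot_c])
```

Each row of a matrix corresponds to a meaning and each column to a symbol, so a successful communication episode would call `reinforce` with a positive reward on the entry that was used.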

17.
Adaptive learning of specific patterns or events of interest has been an area of significant research for various applications in the last two decades. In developing diagnostic evaluation and safety monitoring applications of a propulsion system, it is critical to detect, characterize and model events of interest. It is a challenging task since the detection system should allow adaptive characterization of potential events of interest and correlate them to learn new models for future detection for online health monitoring and diagnostic evaluation. In this paper, a novel framework is established using a hierarchical adaptive clustering approach with fuzzy membership functions to characterize specific events of interest from the measured and processed features. Raw engine measurement data is first analyzed using the wavelet transform to provide features for localization of frequency information for use in the classification system. A method combining hierarchical and fuzzy k-means clustering is then applied to a set of selected measurements and computed features to determine the events of interest during engine operations. Experimental results have shown that the proposed approach is effective and computationally efficient to detect, characterize and model new events of interest from data collected through continuous operations.
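
As a rough illustration of the fuzzy membership functions mentioned above, the sketch below computes memberships with the standard fuzzy c-means formula, where the fuzzifier `m` controls how soft the assignment is. It is an assumption that the paper's fuzzy k-means step uses this formula; the function name and the default `m = 2` are chosen here for illustration only, and the hierarchical stage and wavelet features are not modelled.

```python
def fuzzy_memberships(point, centers, m=2.0):
    """Membership of `point` in each cluster (standard fuzzy c-means rule).

    u_i = 1 / sum_k (d_i / d_k) ** (2 / (m - 1)), with d the Euclidean
    distance from the point to each centre; memberships sum to 1.
    """
    d = [sum((p - c) ** 2 for p, c in zip(point, ctr)) ** 0.5 for ctr in centers]
    # A point lying exactly on one or more centres belongs only to those.
    if 0.0 in d:
        hits = d.count(0.0)
        return [1.0 / hits if di == 0.0 else 0.0 for di in d]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((di / dk) ** power for dk in d) for di in d]

# Usage: a point three times closer to the first centre than to the second.
u = fuzzy_memberships([0.0], [[1.0], [3.0]])
```

With `m = 2` the memberships are inversely proportional to the squared distances, so the example point gets weights close to 0.9 and 0.1.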

18.
Many real scenarios in machine learning are of dynamic nature. Learning in these types of environments represents an important challenge for learning systems. In this context, the model used for learning should work in real time and have the ability to act and react by itself, adjusting its controlling parameters, even its structures, depending on the requirements of the process. In a previous work, the authors presented an online learning algorithm for two-layer feedforward neural networks that includes a factor that weights the errors committed in each of the samples. This method is effective in dynamic environments as well as in stationary contexts. Given the method’s incremental nature, we raise the possibility of adapting the network topology according to the learning needs. In this paper, we demonstrate and justify the suitability of the online learning algorithm to work with adaptive structures without significantly degrading its performance. The theoretical basis for the method is given and its performance is illustrated by means of its application to different system identification problems. The results confirm that the proposed method is able to add units to its hidden layer during the learning process without high performance degradation.

19.
The design and management of human–automation teams for future air traffic systems require an understanding of principles of cognitive systems engineering, allocation of function and team adaptation. The current article proposes a framework of human–automation team adaptable control that incorporates adaptable automation [Oppermann, R., Simm, H., 1994. Adaptability: user-initiated individualization. In: Oppermann, R. (Ed.), Adaptive User Support: Ergonomic Design of Manually and Automatically Adaptable Software. Lawrence Erlbaum Associates, Hillsdale, NJ, pp. 14–64] with an Extended Control Model of Joint Cognitive System functioning [Hollnagel, E., Nåbo, A., Lau, I., 21–24 July 2003. A systemic model for driver-in-control. In: Paper Presented at the Second International Driving Symposium on Human Factors in Driver Assessment, Training, and Vehicle Design, Public Policy Center, University of Iowa, Park City, UT] nested within a dynamic view of team adaptation [Burke, C.S., Stagl, K.C., Salas, E., Pierce, L., Kendall, D., 2006. Understanding team adaptation: a conceptual analysis and model. Journal of Applied Psychology 91, 1189–1207]. Modeling the temporal dynamics of the coordination of human–automation teams under conditions of Free Flight requires an appreciation of the episodic, cyclical nature of team processes from transition to action phases, along with the distinction of team processes from emergent states [Marks, M.A., Mathieu, J.E., Zaccaro, S.J., 2001. A temporally based framework and taxonomy of team processes. Academy of Management Review 26, 356–376]. The conceptual framework of human–automation team adaptable control provides a basis for future research and design.

Relevance to industry

The current article provides a conceptual framework to direct future investigations to determine the optimal design and management of human–automation teams for Free Flight-based air traffic management systems.

20.
This paper first proposes a bilateral optimized negotiation model based on reinforcement learning. The model negotiates over price and quantity, introduces a mediator agent as the mediation mechanism, and uses an improved reinforcement learning negotiation strategy to produce the optimal proposal. To further improve negotiation performance, the paper then proposes a negotiation method based on adaptive learning of the mediator agent. The simulation results show that the proposed negotiation methods improve the efficiency and performance of the negotiation.
