Similar literature
 Found 20 similar documents; search took 814 ms
1.
Solving a finite Markov decision process using techniques from dynamic programming such as value or policy iteration requires a complete model of the environmental dynamics. The distribution of rewards, transition probabilities, states and actions all need to be fully observable, discrete and complete. For many problem domains, a complete model containing a full representation of the environmental dynamics may not be readily available. Bayesian reinforcement learning (RL) is a technique devised to make better use of the information observed through learning than simply computing Q-functions. However, this approach can often require extensive experience in order to build up an accurate representation of the true values. To address this issue, this paper proposes a method for parallelising a Bayesian RL technique aimed at reducing the time it takes to approximate the missing model. We demonstrate the technique on learning next state transition probabilities without prior knowledge. The approach is general enough for approximating any probabilistically driven component of the model. The solution involves multiple learning agents learning in parallel on the same task. Agents share probability density estimates with each other in an effort to speed up convergence to the true values.
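The idea of several agents pooling their transition estimates can be illustrated with a minimal sketch. The abstract does not say how the density estimates are represented or exchanged, so everything here is an assumption: each agent keeps Dirichlet pseudo-counts over next states, and "sharing" is modelled as adding another agent's observed counts. The class and method names (`TransitionLearner`, `merge`, `estimate`) are hypothetical.

```python
from collections import defaultdict


class TransitionLearner:
    """Keeps observed next-state counts for each (state, action) pair."""

    def __init__(self, n_states, prior=1.0):
        self.n_states = n_states
        self.prior = prior  # symmetric Dirichlet prior (assumed, not from the paper)
        self.counts = defaultdict(lambda: [0.0] * n_states)

    def observe(self, state, action, next_state):
        """Record one experienced transition."""
        self.counts[(state, action)][next_state] += 1.0

    def merge(self, other):
        """Add another agent's observed counts to our own (assumed sharing rule)."""
        for key, other_counts in other.counts.items():
            mine = self.counts[key]
            for s, c in enumerate(other_counts):
                mine[s] += c

    def estimate(self, state, action):
        """Posterior mean of P(next_state | state, action)."""
        observed = self.counts[(state, action)]
        total = sum(observed) + self.prior * self.n_states
        return [(c + self.prior) / total for c in observed]


# Two agents explore the same task, then agent a absorbs agent b's experience.
a = TransitionLearner(n_states=2)
b = TransitionLearner(n_states=2)
a.observe(0, 0, 1)
b.observe(0, 0, 1)
b.observe(0, 0, 0)
a.merge(b)
print(a.estimate(0, 0))
```

Merging the same counts more than once would double-count experience, so a fuller version would need to track what has already been exchanged.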

2.
Interactive reinforcement learning (IRL) has become an important apprenticeship approach to speed up convergence in classic reinforcement learning (RL) problems. In this regard, a variant of IRL is policy shaping, which uses a parent-like trainer to propose the next action to be performed and by doing so reduces the search space through advice. On some occasions, the trainer may be another artificial agent which was itself trained using RL methods and afterwards becomes an advisor for other learner-agents. In this work, we analyse internal representations and characteristics of artificial agents to determine which agent may outperform others to become a better trainer-agent. Using a polymath agent, rather than a specialist agent, as an advisor leads to a larger reward and faster convergence of the reward signal and also to a more stable behaviour in terms of the state visit frequency of the learner-agents. Moreover, we analyse system interaction parameters in order to determine how influential they are in the apprenticeship process, where the consistency of feedback is much more relevant when dealing with different learner obedience parameters.

3.
Advanced autonomous artificial systems will need incremental learning and adaptive abilities similar to those seen in humans. Knowledge from biology, psychology and neuroscience is now inspiring new approaches for systems that have sensory-motor capabilities and operate in complex environments. Eye/hand coordination is an important cross-modal cognitive function, and is also typical of many of the other coordinations that must be involved in the control and operation of embodied intelligent systems. This paper examines a biologically inspired approach for incrementally constructing compact mapping networks for eye/hand coordination. We present a simplified node-decoupled extended Kalman filter for radial basis function networks, and compare this with other learning algorithms. An experimental system consisting of a robot arm and a pan-and-tilt head with a colour camera is used to produce results and test the algorithms in this paper. We also present three approaches for adapting to structural changes during eye/hand coordination tasks, and the robustness of the algorithms under noise is investigated. The learning and adaptation approaches in this paper have similarities with current ideas about neural growth in the brains of humans and animals during tool-use, and infants during early cognitive development.

4.
Machine learning based mobile traffic classification has become a popular topic in recent years. As mobile traffic data is dynamic in nature, the static model has become ineffective for the task of classifying future traffic. This is known as the concept drift problem in data streams. To this end, this paper presents an adaptive mobile traffic classification method. Specifically, a method based on the fuzzy competence model is devised to detect concept drift, and a dynamic learning method is presented to update the classification model, so as to adapt to an ever-changing environment at an appropriate time. The concept drift detection method relies on the data distribution instead of the classification error rate. Furthermore, the weights of flow samples are dynamically updated and flow samples are resampled for training a new model when a concept drift is detected. Moreover, recently trained models are saved and used for classification in weighted voting. The weight of each model is updated according to the performance it obtains on the most recent flow samples. On mobile traffic data, experimental results show that our proposed method obtains a lower classification error rate with less time spent on updating models than related methods designed for handling concept drift problems.
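The weighted-voting step described above can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not give the weighting formula, so each model's weight is taken here to be its plain accuracy on the most recent labelled flow samples, and models are represented as ordinary callables. The function names `weighted_vote` and `update_weights` are hypothetical.

```python
from collections import defaultdict


def weighted_vote(models, weights, sample):
    """Return the label with the largest total weight across all saved models."""
    scores = defaultdict(float)
    for model, weight in zip(models, weights):
        scores[model(sample)] += weight
    return max(scores, key=scores.get)


def update_weights(models, recent_samples):
    """Weight each model by its accuracy on recent (features, label) pairs.

    Assumes recent_samples is non-empty; accuracy as the weight is an assumption.
    """
    weights = []
    for model in models:
        correct = sum(1 for x, y in recent_samples if model(x) == y)
        weights.append(correct / len(recent_samples))
    return weights


# Three toy "models" standing in for classifiers trained at different times.
models = [
    lambda x: "video",
    lambda x: "web",
    lambda x: "web" if x > 0 else "video",
]
recent = [(1, "web"), (2, "web"), (-1, "video"), (3, "web")]
weights = update_weights(models, recent)
print(weights, weighted_vote(models, weights, -5))
```

Tying the weights to recent samples lets older models fade out after a drift without being discarded outright.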

5.
In this paper, we propose an extension to the recursive auto-associative memory (RAAM) by Pollack. This extension, the labelling RAAM (LRAAM), can encode labelled graphs with cycles by representing pointers explicitly. Some technical problems encountered in the RAAM, such as the termination problem in the learning and decoding processes, are solved more naturally in the LRAAM framework. The representations developed for the pointers seem to be robust to recurrent decoding along a cycle. Theoretical and experimental results show that the performance of the proposed learning scheme depends on the way the graphs are represented in the training set. Critical features for the representation are cycles and confluent pointers. Data encoded in an LRAAM can be accessed by a pointer as well as by content. Direct access by content can be achieved by transforming the encoder network of the LRAAM into a particular bidirectional associative memory (BAM). Statistics performed on different instances of LRAAM show a strict connection between the associated BAM and a standard BAM. Different access procedures can be defined depending on the access key. The access procedures are not wholly reliable; however, they seem to have a good success rate. The generalization test for the RAAM is no longer complete for the LRAAM. Some suggestions on how to solve this problem are given. Some results on modular LRAAM, stability and application to neural dynamics control are summarized.

6.
The detection of internal defects in composite materials with non-destructive techniques is an important requirement both for quality checks during the production phase and for in-service inspection during maintenance operations. Visual inspection allows only the analysis of surface characteristics of materials, so if internal faults occur inside composite structures, a deeper analysis is required. A comparison between the reactions of different materials to ultrasonic signals can be used to highlight the difference in the internal structures and also to detect the depth position of these anomalies. However, ultrasonic data are difficult to interpret since they require the analysis of a continuous signal for each point of the material under consideration. An automatic procedure is necessary to manage large data sets and to extract significant differences between them. In this paper, we address the problem of automatic inspection of composite materials using an ultrasonic technique. We consider two main steps for interpreting ultrasonic data: the pre-processing technique necessary to normalize the signals of composite structures with different thicknesses and the classification techniques used to compare the ultrasonic signals and detect classes of similar points.

7.
Existing complexity measures from contemporary learning theory cannot be conveniently applied to specific learning problems (e.g. training sets). Moreover, they are typically non-generic, i.e. they necessitate making assumptions about the way in which the learner will operate. The lack of a satisfactory, generic complexity measure for learning problems poses difficulties for researchers in various areas; the present paper puts forward an idea which may help to alleviate these. It shows that supervised learning problems fall into two generic complexity classes, only one of which is associated with computational tractability. By determining which class a particular problem belongs to, we can thus effectively evaluate its degree of generic difficulty.

8.
Coordinating a supply chain necessitates a synchronization strategy for reordering products and a cost-effective production and replenishment cycle time. The aim of this paper is to present an optimization framework for production and distribution in supply chains with a cooperating strategy. The main contribution of this paper is to integrate the closed loop supply chain with open-shop manufacturing and the economic lot and delivery scheduling problem (ELDSP). This integration is applied with the aim of better coordination between the members of the supply chain. This study examines the ELDSP for a multi-stage closed loop supply chain, where each product is returned to a manufacturing center at a constant rate of demand. The supply chain is also characterized by a sub-open-shop system for remanufacturing returned items. Common cycle time and multiplier policies are adopted to accomplish the desired synchronization. For this purpose, we developed a mathematical model in which a manufacturer with an open-shop system purchases raw materials from suppliers, converts them into final products, and sends them to package companies. Given that the ELDSPR is an NP-hard problem, a simulated annealing (SA) algorithm and a biogeography-based optimization (BBO) algorithm are developed. Two operational scenarios are formulated for the simulated annealing algorithm, after which both algorithms are used to solve problems of different scales. The numerical results show that the biogeography-based optimization algorithm performs excellently in finding the best solution to the ELDSPR.

9.
Rationalism has been referred to as the tradition of explaining cognition in terms of logical structures. Much of the work in traditional AI can be seen within a rationalistic framework. Because of the problems with traditional AI, connectionist models have been proposed as an alternative. Connectionist models do solve a number of problems of AI in interesting ways, e.g. learning, generalization, and fault and noise tolerance. However, they do not automatically provide solutions to the basic conceptual problems which can be traced back to a neglect of the relation of AI systems with the real world. We will argue that if we are to make progress in the understanding of (intelligent) behavior the real issue is not whether connectionism is a better paradigm for cognitive science than traditional AI but whether a rationalistic perspective is appropriate and if not what the alternatives are. It is suggested that studying physically instantiated autonomous agents is an important step. However, we will show that building autonomous agents alone does not solve the problem either. What is needed is an appropriate embedding in a non-rationalistic framework. We will discuss a potential solution using an approach we have been developing in our group, called ‘distributed adaptive control’.

10.
In this paper we explore the topic of the consolidation of information in neural network learning. One problem in particular has limited the ability of a broad range of neural networks to perform ongoing learning and consolidation. This is 'catastrophic forgetting', the tendency for new information, when it is learned, to disrupt old information. We will review and slightly extend the rehearsal and pseudorehearsal solutions to the catastrophic forgetting problem presented in Robins (1995). The main focus of this paper is to then relate these mechanisms to the consolidation processes which have been proposed in the psychological literature regarding sleep. We suggest that the catastrophic forgetting problem in artificial neural networks (ANNs) is a problem that has actually occurred in the evolution of the mammalian brain, and that the pseudorehearsal solution to the problem in ANNs is functionally equivalent to the sleep consolidation solution adopted by the brain. Finally, we review related work by McClelland et al. (1995) and propose a tentative model of learning and sleep that emphasizes consolidation mechanisms and the role of the hippocampus.

11.
《Acta Materialia》2005,53(8):2295-2304
The sizes of large inclusions within a cast of hard steel have a major influence on fatigue characteristics, but are not directly measurable by routine means. Recently, two methods have been proposed for estimation of the size distribution of large inclusions on the basis of measurements made on the sections of inclusions revealed in samples from a polished plane surface of the material. This paper reviews the two methods, showing that they are closely related and that properties found by each may be deduced from those found by the other. The paper also discusses the problem of inferring the distribution of the projected (three-dimensional) size of large inclusions from measurements made by either of the methods on sectional (two-dimensional) sizes. A simple new approximate solution to this stereological problem is proposed and is compared to existing approaches.

12.
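Selective laser melting (SLM), a mainstream technology for manufacturing modern industrial components, is widely used in the automotive, aerospace and medical fields, so a systematic review of monitoring and closed-loop control methods for the SLM process has become extremely important. Starting from the principle of SLM and the behaviour of the melt pool, this paper reviews the development and shortcomings of SLM monitoring technology in terms of melt-pool temperature and morphological features during forming, and analyses the current state of research on closed-loop feedback technology. The review shows that the changing state of the melt pool during SLM processing is an important factor affecting the quality of the formed part. The melt-pool state can be monitored effectively with optical, acoustic or multi-signal sensors, whereas closed-loop control requires algorithmic analysis, machine learning and sensors working together to achieve real-time feedback and control. In view of the poor real-time performance of current monitoring technology and the incompleteness of system feedback control, future directions such as intelligent monitoring and real-time closed-loop control are proposed, which can serve as a reference for forming high-quality parts by SLM.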

13.
Tan Yahong, Shi Yao 《机床与液压》 (Machine Tool & Hydraulics) 2022,50(14):182-188
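To address the difficulty that traditional rolling-bearing fault diagnosis methods have in extracting and identifying fault features, a bearing fault recognition model is proposed that combines complete variational mode decomposition (CVMD) with an industrial multi-sensor convolutional neural network (MSCNN). Two pairs of white noise with opposite signs but equal amplitudes are added to the collected rolling-bearing fault vibration data, and variational mode decomposition is used to decompose the fault vibration data into several intrinsic mode functions (IMFs), which are then ensemble-averaged. A comprehensive index is used to select suitable IMFs, which are then reconstructed. For the multi-sensor structure, the MSCNN is proposed on the basis of the convolutional neural network, and the reconstructed vibration signals are fed into the MSCNN for automatic feature learning and fault diagnosis. The results show that the proposed CVMD-MSCNN model achieves a fault diagnosis accuracy of 99.76% with a standard deviation of 0.16, and that its diagnostic accuracy and stability are better than those of other deep learning methods.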

14.
To help overcome the problem of horizontal-axis wind-turbine (HAWT) gear-box roller-bearing premature-failure, the root causes of this failure are currently being investigated using mainly laboratory and field-test experimental approaches. In the present work, an attempt is made to develop complementary computational methods and tools which can provide additional insight into the problem at hand (and do so with a substantially shorter turn-around time). Toward that end, a multi-physics computational framework has been developed which combines: (a) quantum-mechanical calculations of the grain-boundary hydrogen-embrittlement phenomenon and hydrogen bulk/grain-boundary diffusion (the two phenomena currently believed to be the main contributors to the roller-bearing premature-failure); (b) atomic-scale kinetic Monte Carlo-based calculations of the hydrogen-induced embrittling effect ahead of the advancing crack-tip; and (c) a finite-element analysis of the damage progression in, and the final failure of, a prototypical HAWT gear-box roller-bearing inner raceway. Within this approach, the key quantities which must be calculated using each computational methodology are identified, as well as the quantities which must be exchanged between different computational analyses. The work demonstrates that the application of the present multi-physics computational framework enables prediction of the expected life of the most failure-prone HAWT gear-box bearing elements.

15.
The real-time recurrent learning algorithm is a gradient-following learning algorithm for completely recurrent networks running in continually sampled time. Here we use a series of simulation experiments to investigate the power and properties of this algorithm. In the recurrent networks studied here, any unit can be connected to any other, and any unit can receive external input. These networks run continually in the sense that they sample their inputs on every update cycle, and any unit can have a training target on any cycle. The storage required and computation time on each step are independent of time and are completely determined by the size of the network, so no prior knowledge of the temporal structure of the task being learned is required. The algorithm is nonlocal in the sense that each unit must have knowledge of the complete recurrent weight matrix and error vector. The algorithm is computationally intensive on sequential computers, requiring a storage capacity of the order of the third power of the number of units and a computation time on each cycle of the order of the fourth power of the number of units. The simulations include examples in which networks are taught tasks not possible with tapped delay lines, that is, tasks that require the preservation of state over potentially unbounded periods of time. The most complex example of this kind is learning to emulate a Turing machine that does a parenthesis balancing problem. Examples are also given of networks that do feedforward computations with unknown delays, requiring them to organize into networks with the correct number of layers. Finally, examples are given in which networks are trained to oscillate in various ways, including sinusoidal oscillation.
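The cubic storage and quartic time figures quoted above follow from simple counting, sketched below. The sketch assumes the usual formulation of real-time recurrent learning, in which one sensitivity value is kept for every (unit, weight) pair and each sensitivity update sums over all units; the abstract itself gives only the orders of growth. The function name `rtrl_cost` and its parameters are hypothetical.

```python
def rtrl_cost(n_units, n_inputs=0):
    """Count stored sensitivities and multiply-adds per update cycle.

    Assumes a fully recurrent network: every unit has a weight from every
    unit and from every external input line.
    """
    n_weights = n_units * (n_units + n_inputs)
    # One sensitivity d(output_k)/d(weight_ij) per unit k and weight (i, j).
    n_sensitivities = n_units * n_weights
    # Updating each sensitivity sums over the outputs of all units.
    mult_adds_per_cycle = n_units * n_sensitivities
    return n_sensitivities, mult_adds_per_cycle


# Doubling the network size multiplies storage by 8 and time by 16.
print(rtrl_cost(10))
print(rtrl_cost(20))
```

With no external inputs the two counts are exactly the third and fourth powers of the number of units, which is why the cost depends only on network size and not on how long the network has been running.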

16.
This paper analyses a three-layer connectionist network that solves a translation-invariance problem, offering a novel explanation for transposed letter effects in word reading. Analysis of the hidden unit encodings provides insight into two central issues in cognitive science: (1) What is the novelty of claims of “modality-specific” encodings? and (2) How can a learning system establish a complex internal structure needed to solve a problem? Although these topics (embodied cognition and learnability) are often treated separately, we find a close relationship between them: modality-specific features help the network discover an abstract encoding by causing it to break the initial symmetries of the hidden units in an effective way. While this neural model is extremely simple compared to the human brain, our results suggest that neural networks need not be black boxes and that carefully examining their encoding behaviours may reveal how they differ from classical ideas about the mind-world relationship.

17.
18.
Humans and other animals have been shown to perform near-optimally in multi-sensory integration tasks. Probabilistic population codes (PPCs) have been proposed as a mechanism by which optimal integration can be accomplished. Previous approaches have focussed on how neural networks might produce PPCs from sensory input or perform calculations using them, like combining multiple PPCs. Less attention has been given to the question of how the necessary organisation of neurons can arise and how the required knowledge about the input statistics can be learned. In this paper, we propose a model of learning multi-sensory integration based on an unsupervised learning algorithm in which an artificial neural network learns the noise characteristics of each of its sources of input. Our algorithm borrows from the self-organising map the ability to learn latent-variable models of the input and extends it to learning to produce a PPC approximating a probability density function over the latent variable behind its (noisy) input. The neurons in our network are only required to perform simple calculations and we make few assumptions about input noise properties and tuning functions. We report on a neurorobotic experiment in which we apply our algorithm to multi-sensory integration in a humanoid robot to demonstrate its effectiveness and compare it to human multi-sensory integration on the behavioural level. We also show in simulations that our algorithm performs near-optimally under certain plausible conditions, and that it reproduces important aspects of natural multi-sensory integration on the neural level.
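For readers unfamiliar with what "optimal" integration usually means in this setting, a minimal sketch of the textbook benchmark follows. It assumes independent Gaussian noise on each cue, in which case the best combined estimate weights each cue by its inverse variance. This is not the network model proposed in the paper, only the reference point such models are typically compared against; the function name `integrate_cues` is hypothetical.

```python
def integrate_cues(estimates, variances):
    """Combine independent Gaussian cues by inverse-variance weighting.

    Returns the combined estimate and its variance.
    """
    precisions = [1.0 / v for v in variances]
    total_precision = sum(precisions)
    mean = sum(p * x for p, x in zip(precisions, estimates)) / total_precision
    return mean, 1.0 / total_precision


# A reliable cue (variance 1) and a noisy cue (variance 4) disagree
# about a location; the result sits closer to the reliable cue.
mean, variance = integrate_cues([10.0, 20.0], [1.0, 4.0])
print(mean, variance)
```

The combined variance is never larger than that of the best single cue, which is the sense in which integration helps.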

19.
The task of providing robust vision for autonomous mobile robots is a complex signal processing problem which cannot be solved using traditional deterministic computing techniques. In this article we investigate four unsupervised neural learning algorithms, known collectively as competitive learning, in order to assess both their theoretical operation and their ability to learn to represent a basic robotic vision task. This task involves the ability of a modest robotic system to identify the components of basic motion and to generalize upon that learned knowledge to classify correctly novel visual experiences. This investigation shows that standard competitive learning and the DeSieno version of frequency-sensitive competitive learning (FSCL) are unsuitable for solving this problem. Soft competitive learning, while capable of producing an appropriate solution, is too computationally expensive in its present form to be used under the constraints of this application. However, the Krishnamurthy version of FSCL is found to be both computationally efficient and capable of reliably learning a suitable solution to the motion identification problem both in simulated tests and in actual hardware-based experiments.
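The general idea behind frequency-sensitive competitive learning can be shown with a small sketch. The abstract does not spell out either FSCL variant, so the rule below is an assumption: one common formulation in which each unit's squared distance to the input is scaled by how often that unit has already won, so that frequent winners are handicapped. The function name `fscl_step` and its parameters are hypothetical.

```python
def fscl_step(prototypes, win_counts, sample, learning_rate=0.1):
    """Run one frequency-sensitive competitive learning update in place.

    The winner minimises win_count * squared distance (assumed rule),
    then moves towards the sample and has its win count incremented.
    """
    def scaled_distance(i):
        squared = sum((p - x) ** 2 for p, x in zip(prototypes[i], sample))
        return win_counts[i] * squared

    winner = min(range(len(prototypes)), key=scaled_distance)
    prototypes[winner] = [
        p + learning_rate * (x - p) for p, x in zip(prototypes[winner], sample)
    ]
    win_counts[winner] += 1
    return winner


# With equal win counts the nearest prototype wins, as in plain competitive learning.
protos = [[0.0, 0.0], [10.0, 10.0]]
counts = [1, 1]
first = fscl_step(protos, counts, [1.0, 1.0], learning_rate=0.5)

# A unit that has won very often loses even though it is nearer.
protos2 = [[0.0, 0.0], [10.0, 10.0]]
counts2 = [100, 1]
second = fscl_step(protos2, counts2, [1.0, 1.0])
print(first, second)
```

Handicapping frequent winners is what keeps all units in use, which plain competitive learning does not guarantee.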

20.
An overview of approaches to end milling tool monitoring
The increase in awareness regarding the need to optimise manufacturing process efficiency has led to a great deal of research aimed at machine tool condition monitoring. This paper considers the application of condition monitoring techniques to the detection of cutting tool wear and breakage during the milling process. Established approaches to the problem are considered and their application to the next generation of monitoring systems is discussed. Two approaches are identified as being key to the industrial application of operational tool monitoring systems. Multiple sensor systems, which use a wide range of sensors with an increasing level of intelligence, are seen as providing long-term benefits, particularly in the field of tool wear monitoring. Such systems are being developed by a number of researchers in this area. The second approach integrates the control signals used by the machine controller into a process monitoring system which is capable of detecting tool breakage. Initial findings, mainly under laboratory conditions, indicate that both these approaches can be of major benefit. It is finally argued that a combination of these approaches will ultimately lead to robust systems which can operate in an industrial environment.
