Similar Documents
20 similar documents found (search time: 15 ms)
1.
Rumelhart et al. (1986b) proposed a model of how symbolic processing may be achieved by parallel distributed processing (PDP) networks. Their idea is tested by training two types of recurrent networks to learn to add two numbers of arbitrary lengths. This turned out to be a fruitful exercise. We demonstrate: (1) that networks can learn simple programming constructs such as sequences, conditional branches and while loops; (2) that by 'going sequential' in this manner, we are able to process arbitrarily long problems; (3) a manipulation of the training environment, called combined subset training (CST), that was found to be necessary to acquire a large training set; (4) a power difference between simple recurrent networks and Jordan networks, shown by providing a simple procedure that one can learn and the other cannot.
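The target computation itself is easy to write down. The following pure-Python sketch (ours, not the authors' network) makes the three learned constructs explicit: a sequence of per-digit steps, a conditional carry branch, and a while loop that permits arbitrarily long operands.

```python
def add_sequential(a, b):
    """Digit-serial addition, least-significant digit first.

    Mirrors the computation the recurrent networks must acquire:
    a sequence of per-digit steps, a conditional carry, and a
    while loop that runs until both operands are exhausted.
    """
    i, carry, out = 0, 0, []
    while i < len(a) or i < len(b) or carry:   # while loop
        da = a[i] if i < len(a) else 0         # sequential step
        db = b[i] if i < len(b) else 0
        s = da + db + carry
        carry = 1 if s >= 10 else 0            # conditional branch
        out.append(s % 10)
        i += 1
    return out

# digits least-significant first: 907 + 95 = 1002
print(add_sequential([7, 0, 9], [5, 9]))  # [2, 0, 0, 1]
```

Because the loop is driven by the input rather than a fixed unrolling, the same procedure handles operands of any length, which is the point of 'going sequential'.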

2.
Representation poses important challenges to connectionism. The ability to compose representations structurally is critical in achieving the capability considered necessary for cognition. We are investigating distributed patterns that represent structure as part of a larger effort to develop a natural language processor. Recursive auto-associative memory (RAAM) representations show unusual promise as a general vehicle for representing classical symbolic structures in a way that supports compositionality. However, RAAMs are limited to representations for fixed-valence structures and can often be difficult to train. We provide a technique for mapping any ordered collection (forest) of hierarchical structures (trees) into a set of training patterns which can be used effectively in training a simple recurrent network (SRN) to develop RAAM-style distributed representations. The advantages of our technique are three-fold: first, the fixed-valence restriction on structures represented by patterns trained with RAAMs is removed; second, the representations resulting from training correspond to ordered forests of labeled trees, thereby extending what can be represented in this fashion; third, training can be accomplished with an auto-associative SRN, making training a much more straightforward process and one which optimally utilizes the n-dimensional space of patterns.

3.
Continuous-valued recurrent neural networks can learn mechanisms for processing context-free languages. The dynamics of such networks is usually based on damped oscillation around fixed points in state space and requires that the dynamical components are arranged in certain ways. It is shown that qualitatively similar dynamics with similar constraints hold for a^n b^n c^n, a context-sensitive language. The additional difficulty with a^n b^n c^n, compared with the context-free language a^n b^n, consists of 'counting up' and 'counting down' letters simultaneously. The network solution is to oscillate in two principal dimensions, one for counting up and one for counting down. This study focuses on the dynamics employed by the sequential cascaded network, in contrast to the simple recurrent network, and the use of backpropagation through time. Found solutions generalize well beyond the training data; however, learning is not reliable. The contribution of this study lies in demonstrating how the dynamics in recurrent neural networks that process context-free languages can also be employed in processing some context-sensitive languages (traditionally thought of as requiring additional computation resources). This continuity of mechanism between language classes contributes to our understanding of neural networks in modelling language learning and processing.
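The computation itself can be stated as a counter automaton; the networks realize the counting as oscillations in state space rather than as discrete counters. A minimal sketch of the target recognizer:

```python
def accepts(s):
    """Recognize a^n b^n c^n (n >= 1) with two explicit counters.

    A counter-automaton sketch of the computation only; the paper's
    networks implement the counting as oscillatory dynamics instead.
    """
    phase, na, nb, nc = 0, 0, 0, 0   # phase: 0 = a's, 1 = b's, 2 = c's
    for ch in s:
        if ch == "a":
            if phase != 0:
                return False         # an 'a' after 'b'/'c' is illegal
            na += 1
        elif ch == "b":
            if phase > 1 or na == 0:
                return False
            phase, nb = 1, nb + 1    # counting up b's against the a count
        elif ch == "c":
            if phase == 0 or nb == 0:
                return False
            phase, nc = 2, nc + 1    # counting down against the b count
        else:
            return False
    return na == nb == nc and na > 0
```

The simultaneous "count up" and "count down" shows up here as the two counter comparisons; in the network they become two principal oscillation dimensions.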

4.
It has been shown that the ability of echo state networks (ESNs) to generalise in a sentence-processing task can be increased by adjusting their input connection weights to the training data. We present a qualitative analysis of the effect of such weight adjustment on an ESN that is trained to perform the next-word prediction task. Our analysis makes use of CrySSMEx, an algorithm for extracting finite state machines (FSMs) from the data about the inputs, internal states, and outputs of recurrent neural networks that process symbol sequences. We find that the ESN with adjusted input weights yields a concise and comprehensible FSM. In contrast, the standard ESN, which shows poor generalisation, results in a massive and complex FSM. The extracted FSMs show how the two networks differ behaviourally. Moreover, poor generalisation is shown to correspond to a highly fragmented quantisation of the network's state space. Such findings indicate that CrySSMEx can be a useful tool for analysing ESN sentence processing.

5.
Fodor and Pylyshyn [(1988). Connectionism and cognitive architecture: A critical analysis. Cognition, 28(1–2), 3–71] famously argued that neural networks cannot behave systematically short of implementing a combinatorial symbol system. A recent response from Frank et al. [(2009). Connectionist semantic systematicity. Cognition, 110(3), 358–379] claimed to have trained a neural network to behave systematically without implementing a symbol system and without any in-built predisposition towards combinatorial representations. We believe systems like theirs may in fact implement a symbol system on a deeper and more interesting level: one where the symbols are latent – not visible at the level of network structure. In order to illustrate this possibility, we demonstrate our own recurrent neural network that learns to understand sentence-level language in terms of a scene. We demonstrate our model's learned understanding by testing it on novel sentences and scenes. By paring down our model into an architecturally minimal version, we demonstrate how it supports combinatorial computation over distributed representations by using the associative memory operations of Vector Symbolic Architectures. Knowledge of the model's memory scheme gives us tools to explain its errors and construct superior future models. We show how the model designs and manipulates a latent symbol system in which the combinatorial symbols are patterns of activation distributed across the layers of a neural network, instantiating a hybrid of classical symbolic and connectionist representations that combines advantages of both.
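The core Vector Symbolic Architecture operations (binding role to filler, superposing bindings, and querying a role) can be sketched compactly. This is a generic holographic-reduced-representation example of ours, not the paper's model; the role and filler names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 2048                        # dimensionality of the distributed patterns

def vec():
    return rng.standard_normal(D) / np.sqrt(D)   # random unit-ish vector

def bind(a, b):                 # circular convolution, computed via FFT
    return np.real(np.fft.ifft(np.fft.fft(a) * np.fft.fft(b)))

def unbind(c, a):               # bind with the approximate inverse of a
    a_inv = np.concatenate(([a[0]], a[1:][::-1]))
    return bind(c, a_inv)

# role-filler bindings superposed into a single "scene" vector
agent, patient, john, ball = vec(), vec(), vec(), vec()
scene = bind(agent, john) + bind(patient, ball)

# querying the agent role yields a noisy copy of 'john'; a clean-up
# memory (here, a dot-product comparison) recovers the symbol
probe = unbind(scene, agent)
sims = {name: float(v @ probe) for name, v in
        [("john", john), ("ball", ball)]}
```

Because the bound patterns are distributed across every component of the vector, the symbols are "latent" in exactly the sense the abstract describes: no single unit stands for john or agent.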

6.
Zhao Zhihong, Wu Dongdong 《机床与液压》2023,51(22):202-208
To address the scarcity of fault samples in bearing fault diagnosis and the low accuracy of deep neural network models under small-sample conditions, a framework is proposed that extends a deep neural network into a Siamese (twin) network structure to improve few-shot diagnostic performance. The Siamese network extracts features from sample pairs through a weight-sharing backbone network and judges the feature similarity of a pair using the L1 distance, thereby performing bearing fault diagnosis. Unlike a conventional deep neural network, the Siamese network takes sample pairs as input, which improves diagnostic performance when fault data are scarce. Convolutional neural networks (CNNs) and long short-term memory (LSTM) networks with different numbers of layers were extended into Siamese structures, and small-sample fault-diagnosis experiments were conducted on a measured bearing dataset. The results show that extending a network into a Siamese structure raises diagnostic accuracy: the Siamese CNN outperforms the corresponding CNN by 1.08% on average, and the Siamese LSTM outperforms the corresponding LSTM by 4.78% on average.
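The weight-sharing-plus-L1-distance idea can be sketched in a few lines. The encoder below is a hypothetical stand-in (a single random projection with a ReLU) for the CNN/LSTM backbone, and the signals are synthetic, not real bearing data.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical shared ("twin") encoder: one random projection + ReLU
# stands in for the weight-sharing CNN/LSTM backbone in the abstract.
W = rng.standard_normal((16, 64)) * 0.1

def encode(x):                       # identical weights for both inputs
    return np.maximum(W @ x, 0.0)

def l1_similarity(x1, x2):
    # small L1 distance between embeddings -> high similarity score
    return -np.abs(encode(x1) - encode(x2)).sum()

# synthetic vibration-like signals (illustrative only)
healthy  = rng.standard_normal(64)
healthy2 = healthy + 0.05 * rng.standard_normal(64)   # same condition
faulty   = healthy + 3.0 * rng.standard_normal(64)    # other condition
```

Training on pairs rather than single samples is what stretches a small fault dataset: n labelled samples yield on the order of n^2 same/different pairs.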

7.
This new work is an extension of existing research into artificial neural networks (Neville and Stonham, Connection Sci.: J. Neural Comput. Artif. Intell. Cognitive Res., 7, pp. 29–60, 1995; Neville, Neural Net., 45, pp. 375–393, 2002b). These previous studies of the reuse of information (Neville, IEEE World Congress on Computational Intelligence, 1998b, pp. 1377–1382; Neville and Eldridge, Neural Net., pp. 375–393, 2002; Neville, IEEE World Congress on Computational Intelligence, 1998c, pp. 1095–1100; Neville, IEEE 2003 International Joint Conference on Neural Networks, 2003; Neville, IEEE IJCNN'04, 2004 International Joint Conference on Neural Networks, 2004) are associated with a methodology that prescribes the weights, as opposed to training them. In addition, they work with smaller networks. Here, this work is extended to include larger nets. This methodology is considered in the context of artificial neural networks: geometric reuse of information is described mathematically and then validated experimentally. The theory shows that the trained weights of a neural network can be used to prescribe the weights of other nets of the same architecture. Hence, the other nets have prescribed weights that enable them to map related geometric functions. This means the nets are a method of 'reuse of information'. This work is significant in that it validates the statement that 'knowledge encapsulated in a trained multi-layer sigma-pi neural network (MLSNN) can be reused to prescribe the weights of other MLSNNs which perform similar tasks or functions'. The important point to note here is that the other MLSNNs' weights are prescribed in order to represent related functions. This implies that the knowledge encapsulated in the initially trained MLSNN is of more use than may initially appear.

8.
In this paper, we propose a new type of efficient learning method called teacher-directed learning. The method can accept training patterns and correlated teachers, and we need not back-propagate errors between targets and outputs into networks. Information flows always from an input layer to an output layer. In addition, the connections to be updated are those from an input layer to the first competitive layer. All other connections can take fixed values. Learning is realized as a competitive process by maximizing information on training patterns and correlated teachers. Because information is maximized, information is compressed into networks in simple ways, which enables us to discover salient features in input patterns. We applied this method to the vertical and horizontal line detection problem, the analysis of US–Japan trade relations and a fairly complex syntactic analysis system. Experimental results confirmed that teacher information in an input layer forces networks to produce correct answers. In addition, because of maximized information in competitive units, easily interpretable internal representations can be obtained.

9.
Zhang Yanhua, Liu Xianghua, Wang Guodong 《轧钢》2005,22(3):8-11
The plate-crown calculation model for medium and heavy plate is analysed and a corresponding online mathematical model is given. To improve the prediction accuracy of the online crown model, an optimization method for the influence coefficients of the crown model based on a fuzzy-clustering BP neural network is proposed. Fuzzy clustering analysis is used to select training samples scientifically, solving the problem of slow learning caused by large numbers of samples. Analysis of a large amount of online data shows that this method considerably improves the prediction accuracy of plate crown and can adapt to continually changing process and equipment conditions.

10.
Unsupervised topological ordering, similar to Kohonen's (1982, Biological Cybernetics, 43: 59-69) self-organizing feature map, was achieved in a connectionist module for competitive learning (a CALM Map) by internally regulating the learning rate and the size of the active neighbourhood on the basis of input novelty. In this module, winner-take-all competition and the 'activity bubble' are due to graded lateral inhibition between units. It tends to separate representations as far apart as possible, which leads to interpolation abilities and an absence of catastrophic interference when the interfering set of patterns forms an interpolated set of the initial data set. More than the Kohonen maps, these maps provide an opportunity for building psychologically and neurophysiologically motivated multimodular connectionist models. As an example, the dual pathway connectionist model for fear conditioning by Armony et al. (1997, Trends in Cognitive Science, 1: 28-34) was rebuilt and extended with CALM Maps. If the detection of novelty enhances memory encoding in a canonical circuit, such as the CALM Map, this could explain the finding of large distributed networks for novelty detection (e.g. Knight and Scabini, 1998, Journal of Clinical Neurophysiology, 15: 3-13) in the brain.
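The novelty-regulated update rule can be sketched on a tiny one-dimensional map. This is a simplified illustration of the idea (novelty scales both learning rate and neighbourhood), not the published CALM Map equations.

```python
# 1-D map of unit weights; input novelty (distance to the best-matching
# unit) regulates both the learning rate and the active neighbourhood:
# a simplified sketch of the CALM Map idea, with made-up constants.
weights = [0.0, 0.25, 0.5, 0.75, 1.0]

def update(x, base_lr=0.5):
    dists = [abs(w - x) for w in weights]
    win = dists.index(min(dists))
    novelty = min(dists[win], 1.0)        # 0 = familiar, 1 = novel
    lr = base_lr * novelty                # novel inputs learn faster
    radius = 1 if novelty < 0.5 else 2    # ...and recruit more units
    for i, w in enumerate(weights):
        if abs(i - win) <= radius:
            weights[i] = w + lr * (x - w)
    return win, novelty

update(0.9)   # a somewhat novel input shifts the winner and a neighbour
```

A perfectly familiar input (one the map already represents exactly) produces zero novelty and therefore no weight change, which is how such a circuit makes encoding strength track novelty.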

11.
Recurrent neural networks readily process, learn and generate temporal sequences. In addition, they have been shown to have impressive computational power. Recurrent neural networks can be trained with symbolic string examples encoded as temporal sequences to behave like sequential finite state recognizers. We discuss methods for extracting, inserting and refining symbolic grammatical rules for recurrent networks. This paper discusses various issues: how rules are inserted into recurrent networks, how they affect training and generalization, and how those rules can be checked and corrected. The capability of exchanging information between a symbolic representation (grammatical rules) and a connectionist representation (trained weights) has interesting implications. After partially known rules are inserted, recurrent networks can be trained to preserve inserted rules that were correct and to correct through training inserted rules that were 'incorrect', that is, inconsistent with the training data.

12.
The effects of carbon, compound RE, quenching temperature, pre-strain and recovery temperature on the shape memory effect (SME) of an Fe-Mn-Si-Ni-C-RE shape memory alloy were studied by bending measurement, thermal-cycle training, SEM, etc. It was shown that the grains of alloys with compound RE additions became finer and the SME increased evidently. The SME of the alloy weakened gradually as carbon content increased under small strain (3%). But under large strain (more than 6%), the SME of alloys with carbon contents from 0.1% to 0.12% decreased only slightly, especially with the addition of compound RE. Results also indicated that the SME was improved by raising the quenching temperature (>1000 ℃). The amount of thermally induced martensite increased, and the relative shape recovery ratio could be raised to more than 40% after 3-4 thermal training cycles. The relative shape recovery ratio decreased evidently with rising pre-strain. Furthermore, because the speed of martensite transi

13.
An astronomical set of sentences can be produced in natural language by combining relatively simple sentence structures with a human-size lexicon. These sentences are within the range of human language performance. Here, we investigate the ability of simple recurrent networks (SRNs) to handle such combinatorial productivity. We successfully trained SRNs to process sentences formed by combining sentence structures with different groups of words. Then, we tested the networks with test sentences in which words from different training sentences were combined. The networks failed to process these sentences, even though the sentence structures remained the same and all words appeared in the same syntactic positions as in the training sentences. In these combination cases, the networks produced word–word associations, similar to the condition in which words are presented in the context of a random word sequence. The results show that SRNs have serious difficulties in handling the combinatorial productivity that underlies human language performance. We discuss implications of this result for a potential neural architecture of human language processing.

14.
BP neural network prediction of rolling force in rough rolling
Using a BP neural network, with data from the roughing-mill database of a hot strip mill as training samples, two training schemes were adopted to predict the rolling force during rough rolling. The prediction accuracy of the BP network is closely related to the selection of training samples, the number of hidden-layer nodes, and the size of the normalization coefficient. When these factors are chosen properly, the network's prediction accuracy improves; when chosen improperly, it deteriorates.
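A minimal BP-regression sketch shows where the factors named above enter: the training set, the hidden-node count, and the input scaling. The data below are a hypothetical stand-in for the mill database, and the relation between inputs and "force" is invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy stand-in for the mill data (hypothetical relation, illustration
# only): "rolling force" grows with two normalized process inputs.
X = rng.uniform(0, 1, (200, 2))
y = (1.5 * X[:, 0] + 0.5 * X[:, 1] + 0.2 * X[:, 0] * X[:, 1])[:, None]

H = 8  # hidden-layer nodes: one of the accuracy factors named above
W1 = rng.standard_normal((2, H)) * 0.5
W2 = rng.standard_normal((H, 1)) * 0.5

for _ in range(4000):                  # plain error backpropagation
    h = np.tanh(X @ W1)
    err = h @ W2 - y                   # prediction error
    W1 -= 0.1 * X.T @ ((err @ W2.T) * (1 - h ** 2)) / len(X)
    W2 -= 0.1 * h.T @ err / len(X)

mse = float(np.mean((np.tanh(X @ W1) @ W2 - y) ** 2))
```

Changing `H` or feeding unscaled inputs visibly degrades the final `mse`, which is the abstract's point about hidden-node count and normalization.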

15.
16.
Image recognition is widely applied in the coatings field, and the choice of image features is an important factor in improving recognition rates; shape features, however, have not previously been reported for image recognition of rust spots on coatings. Based on the colour and shape features of coating rust spots, machine learning was combined with image recognition. Ninety images of coating rust spots were collected under three common natural illumination levels. Homomorphic filtering was used to preprocess the images, and the HSV (hue-saturation-value) colour space was used to distinguish rust-spot regions from rust-free regions. Eight shape features of the rust spots were then extracted to further refine the rust regions, and the Pearson correlation coefficient was used to screen the shape features. The colour feature, a single shape feature, the eight combined shape features, the screened combined shape features, and the fusion of the colour feature with the screened shape features were each used as inputs to support vector machines (SVMs) with four kernel functions (linear, RBF, polynomial and sigmoid) to recognize the rust spots. The results show that an image-recognition algorithm combining SVMs with colour and shape feature parameters can identify coating rust spots fairly accurately; the fusion of the colour feature with the screened shape features achieved the highest recognition accuracy, up to 93.33%. Shape features can thus serve as an additional source of feature information to improve the accuracy of rust-spot image recognition and provide a reference for research on image recognition of coating rust spots.
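The HSV colour-space step can be illustrated per pixel with the standard library. The hue/saturation/value cut-offs below are our illustrative assumptions, not the thresholds from the study.

```python
import colorsys

def is_rust_pixel(r, g, b):
    """Classify an RGB pixel (0-255 channels) as rust-coloured via HSV.

    The thresholds are illustrative assumptions: rust reads as a
    reddish-orange hue, moderately saturated, and not too dark.
    """
    h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
    return (h < 0.11 or h > 0.95) and s > 0.35 and 0.15 < v < 0.9

print(is_rust_pixel(150, 60, 30))    # orange-brown rust-like pixel
print(is_rust_pixel(200, 200, 205))  # pale grey coating pixel
```

Separating hue from value is what makes such a rule usable across the three illumination levels: brightness changes move V while leaving H and S comparatively stable.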

17.
The effects of the hybrid mode, heat-source separation (Dla) and laser pulse parameters on the forming of magnesium-alloy T-joints in laser-TIG hybrid welding were studied. The TIG-leading hybrid mode was found to be more favourable for T-joint forming. In the TIG-leading mode, increasing Dla along with the welding current gives greater weld penetration; with a welding current of 100 A and a Dla of 3 mm, good weld formation of the magnesium T-joint was obtained. A laser pulse width that is either too small or too large harms the formation of the weld root: with a small pulse, the molten pool where the vertical plate joins the top plate is prone to overheating, causing severe surface collapse or even burn-through; with too large a pulse, insufficient melting at the joint produces lack of fusion.

18.
Reduction in the size and complexity of neural networks is essential to improve generalization, reduce training error and improve network speed. Most of the known optimization methods rely heavily on weight-sharing concepts for pattern separation and recognition. In weight-sharing methods the redundant weights from specific areas of the input layer are pruned, and the value of the weights and their information content play a very minimal role in the pruning process. The method presented here focuses on network topology and information content for optimization. We have studied the change in the network topology and its effects on information content dynamically during the optimization of the network. The primary optimization uses scaled conjugate gradient and the secondary method of optimization is a Boltzmann method. The conjugate gradient optimization serves as a connection creation operator and the Boltzmann method serves as a competitive connection annihilation operator. By combining these two methods, it is possible to generate small networks which have similar testing and training accuracy, i.e. good generalization, from small training sets. In this paper, we have also focused on network topology. Topological separation is achieved by changing the number of connections in the network. This method should be used when the size of the network is large enough to tackle real-life problems such as fingerprint classification. Our findings indicate that for large networks, topological separation yields a smaller network size, which is more suitable for VLSI implementation. Topological separation is based on the error surface and information content of the network. As such it is an economical way of reducing size, leading to overall optimization. The differential pruning of the connections is based on the weight content rather than the number of connections.
The training error may vary with the topological dynamics, but the correlation between the error surface and the recognition rate decreases to a minimum. Topological separation reduces the size of the network by changing its architecture without degrading its performance.
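A Boltzmann-style annihilation step driven by weight content can be sketched as a temperature-controlled coin flip per connection. This is a simplified stand-in of ours for the competitive annihilation operator, with invented weights and temperature.

```python
import math
import random

random.seed(0)

def boltzmann_prune(weights, T=0.5):
    """Probabilistically annihilate connections.

    Small-magnitude weights are deleted with high probability at
    temperature T; large weights almost always survive. A simplified
    sketch of Boltzmann-style competitive connection annihilation.
    """
    kept = {}
    for name, w in weights.items():
        p_delete = math.exp(-abs(w) / T)   # weight content drives pruning
        if random.random() >= p_delete:
            kept[name] = w
    return kept

# hypothetical network: two near-zero connections among three strong ones
net = {"w1": 2.4, "w2": -0.01, "w3": 1.7, "w4": 0.02, "w5": -3.0}
pruned = boltzmann_prune(net)
```

Raising `T` makes annihilation more aggressive (more mid-sized weights die); lowering it makes pruning nearly deterministic on magnitude, which is the knob such a secondary optimizer turns.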

19.
Gao Qiang, Liu Li-rong, Tang Xiao-hua, Peng Zhi-jiang, Zhang Ming-jun, Tian Su-gui 《中国铸造》2019,16(1):14-22
Interfacial dislocations found in single crystal superalloys after long-term thermal aging have an important effect on mechanical properties. Long-term thermal aging tests for the DD5 single crystal superalloy were carried out at 1,100 ℃ for 20, 100, 200, 500 and 1,000 h, followed by air cooling. The effect of long-term thermal aging on the dislocation networks at the γ/γ' interfaces was investigated by FE-SEM. Results showed that during long-term thermal aging at 1,100 ℃, misfit dislocations formed first and then reorientation in the (001) interfacial planes occurred. Different types of square or rectangular dislocation networks form by dislocation reaction. Square dislocation networks consisting of four groups of dislocations can transform into octagonal dislocation networks, and then form another square dislocation network by dislocation reaction. Rectangular dislocation networks can also transform into hexagonal dislocation networks. The interfacial dislocation networks promote the γ' phase rafting process. The dislocation network spacings become smaller and smaller, leading to the effective lattice misfit increasing from -0.10% to -0.32%.

20.
This paper focuses on adaptive motor control in the kinematic domain. Several motor-learning strategies from the literature are adopted to kinematic problems: ‘feedback-error learning’, ‘distal supervised learning’, and ‘direct inverse modelling’ (DIM). One of these learning strategies, DIM, is significantly enhanced by combining it with abstract recurrent neural networks. Moreover, a newly developed learning strategy (‘learning by averaging’) is presented in detail. The performance of these learning strategies is compared with different learning tasks on two simulated robot setups (a robot-camera-head and a planar arm). The results indicate a general superiority of DIM if combined with abstract recurrent neural networks. Learning by averaging shows consistent success if the motor task is constrained by special requirements.
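The core of direct inverse modelling is learning the map from outcomes back to actions from observed (action, outcome) pairs. A minimal sketch, assuming a hypothetical one-dimensional linear plant in place of the paper's robot setups:

```python
import random

random.seed(3)

def forward(action):
    # hypothetical plant (kinematics stand-in): outcome x = 2*a + 1
    return 2.0 * action + 1.0

# Direct inverse modelling: babble random motor commands, record the
# (outcome, action) pairs, and fit the inverse map outcome -> action.
samples = [(forward(a), a)
           for a in (random.uniform(-1, 1) for _ in range(50))]

# least-squares line a = m*x + c serves as the learned inverse model
n = len(samples)
sx = sum(x for x, _ in samples)
sa = sum(a for _, a in samples)
sxx = sum(x * x for x, _ in samples)
sxa = sum(x * a for x, a in samples)
m = (n * sxa - sx * sa) / (n * sxx - sx * sx)
c = (sa - m * sx) / n

def inverse(target):
    return m * target + c   # motor command predicted to reach the target
```

For this noiseless linear plant the fitted inverse is exact; the known weakness of DIM, which motivates the paper's enhancements, appears only once the plant is redundant or nonlinear, where averaging over non-convex solution sets yields invalid commands.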
