Similar Literature (20 results)
1.
Rule revision with recurrent neural networks
Recurrent neural networks readily process, recognize and generate temporal sequences. By encoding grammatical strings as temporal sequences, recurrent neural networks can be trained to behave like deterministic sequential finite-state automata. Algorithms have been developed for extracting grammatical rules from trained networks. Using a simple method for inserting prior knowledge (or rules) into recurrent neural networks, we show that recurrent neural networks are able to perform rule revision. Rule revision is performed by comparing the inserted rules with the rules in the finite-state automata extracted from trained networks. The results from training a recurrent neural network to recognize a known non-trivial, randomly generated regular grammar show that the networks not only preserve correct rules but are also able to correct, through training, inserted rules that were initially incorrect (i.e. rules that were not part of the randomly generated grammar).
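To make the rule-insertion step concrete, here is a minimal NumPy sketch of programming known DFA transitions into a second-order recurrent network as large-magnitude weights that later training can revise. The encoding scheme, sizes, and the strength constant H are illustrative assumptions, not the paper's exact method.

```python
import numpy as np

# A second-order recurrent network whose state update is
#   s_{t+1}[i] = sigmoid(sum_{j,k} W[i,j,k] * s_t[j] * x_t[k] + b[i])
# Known DFA transitions are "inserted" as large-magnitude weights that
# training can later revise.  (Illustrative scheme, not the paper's.)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

n_states, n_symbols = 4, 2
rng = np.random.default_rng(0)
W = rng.normal(0.0, 0.1, (n_states, n_states, n_symbols))
b = np.zeros(n_states)

def insert_rule(W, src, symbol, dst, H=4.0):
    """Encode DFA rule delta(src, symbol) = dst as biased weights."""
    W[:, src, symbol] = -H        # inhibit all successor states...
    W[dst, src, symbol] = +H      # ...except the one the rule names

insert_rule(W, src=0, symbol=1, dst=2)        # a hypothetical rule

def run(W, b, string, start=0):
    s = np.zeros(n_states); s[start] = 1.0    # one-hot start state
    for sym in string:
        x = np.zeros(n_symbols); x[sym] = 1.0
        s = sigmoid(np.einsum('ijk,j,k->i', W, s, x) + b)
    return s                                  # soft state occupancy

print(run(W, b, [1, 0, 1]))
```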

2.
Recurrent neural networks processing symbolic strings can be regarded as adaptive neural parsers. Given a set of positive and negative examples drawn from a given language, adaptive neural parsers can effectively be trained to infer the language's grammar. In this paper we use adaptive neural parsers to address the problem of inferring grammars from examples that are corrupted by a kind of noise that simply changes their membership. We propose a training algorithm, referred to as the hybrid finite state filter, which is based on a parsimony principle that penalizes the development of complex rules. We report very promising experimental results showing that the proposed inductive inference scheme is indeed capable of capturing rules while removing noise.

3.
This paper examines the inductive inference of a complex grammar with neural networks; specifically, the task considered is training a network to classify natural language sentences as grammatical or ungrammatical, thereby exhibiting the same kind of discriminatory power provided by the Principles and Parameters linguistic framework, or Government-and-Binding theory. Neural networks are trained, without the division into learned vs. innate components assumed by Chomsky (1956), in an attempt to produce the same judgments as native speakers on sharply grammatical/ungrammatical data. How a recurrent neural network could possess linguistic capability, and the properties of various common recurrent neural network architectures, are discussed. The problem exhibits training behavior that is often absent with smaller grammars, and training was initially difficult. However, after implementing several techniques aimed at improving the convergence of the gradient-descent backpropagation-through-time training algorithm, significant learning was possible. It was found that certain architectures are better able to learn an appropriate grammar. The operation of the networks and their training is analyzed. Finally, the extraction of rules in the form of deterministic finite-state automata is investigated.
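As a hedged sketch of the kind of network involved, the following NumPy code runs a simple Elman-style recurrent network forward over a word sequence and emits a grammaticality score; the sizes and random weights are placeholders, and training (backpropagation through time) is omitted.

```python
import numpy as np

# Minimal Elman-style simple recurrent network: the hidden state is
# fed back as context, and the final output judges a word sequence as
# grammatical (1) or ungrammatical (0).  Only the forward pass is
# sketched; training would use backpropagation through time.

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

vocab, hidden = 10, 8
rng = np.random.default_rng(1)
W_in  = rng.normal(0, 0.3, (hidden, vocab))   # input -> hidden
W_rec = rng.normal(0, 0.3, (hidden, hidden))  # context -> hidden
W_out = rng.normal(0, 0.3, (1, hidden))       # hidden -> judgment

def grammaticality(sentence):
    h = np.zeros(hidden)                      # empty context
    for word_id in sentence:
        x = np.zeros(vocab); x[word_id] = 1.0
        h = np.tanh(W_in @ x + W_rec @ h)     # Elman update
    return sigmoid(W_out @ h)[0]              # P(grammatical)

print(grammaticality([1, 4, 2]))
```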

4.
Neural networks are often employed as tools in classification tasks. The use of large networks increases the likelihood of the task's being learned, although it may also lead to increased complexity. Pruning is an effective way of reducing the complexity of large networks. We present discriminant components pruning (DCP), a method of pruning matrices of summed contributions between layers of a neural network. Pruning the network can also aid attempts to interpret the underlying functions it has learned, while generalization performance should be maintained at its optimal level. We demonstrate DCP's effectiveness at maintaining generalization performance, its applicability to a wider range of problems, and the usefulness of such pruning for network interpretation. Possible enhancements are discussed for the identification of the optimal reduced rank and for the inclusion of nonlinear neural activation functions in the pruning algorithm.
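The core mechanism, pruning a matrix of summed contributions down to a reduced rank, can be sketched with an SVD-based low-rank approximation; the paper's specific discriminant criterion for choosing components is not reproduced here, so treat this as an assumption-laden illustration.

```python
import numpy as np

# Sketch of the idea behind discriminant components pruning: replace
# the matrix of summed contributions between two layers with its best
# reduced-rank approximation, keeping only the leading components.

def reduced_rank(C, rank):
    """Best rank-r approximation of contribution matrix C (via SVD)."""
    U, s, Vt = np.linalg.svd(C, full_matrices=False)
    return (U[:, :rank] * s[:rank]) @ Vt[:rank, :]

rng = np.random.default_rng(2)
C = rng.normal(size=(20, 50))                 # summed contributions
for r in (1, 3, 10):
    err = np.linalg.norm(C - reduced_rank(C, r)) / np.linalg.norm(C)
    print(f"rank {r:2d}: relative residual {err:.3f}")
```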

5.
Artificial Intelligence, 2001, 125(1-2): 155-207
Although neural networks have shown very good performance in many application domains, one of their main drawbacks lies in the incapacity to provide an explanation for the underlying reasoning mechanisms. The "explanation capability" of neural networks can be achieved by the extraction of symbolic knowledge. In this paper, we present a new method of extraction that captures nonmonotonic rules encoded in the network, and prove that such a method is sound. We start by discussing some of the main problems of knowledge extraction methods. We then discuss how these problems may be ameliorated. To this end, a partial ordering on the set of input vectors of a network is defined, as well as a number of pruning and simplification rules. The pruning rules are then used to reduce the search space of the extraction algorithm during a pedagogical extraction, whereas the simplification rules are used to reduce the size of the extracted set of rules. We show that, in the case of regular networks, the extraction algorithm is sound and complete. We proceed to extend the extraction algorithm to the class of non-regular networks, the general case. We show that non-regular networks always contain regularities in their subnetworks. As a result, the underlying extraction method for regular networks can be applied, but now in a decompositional fashion. In order to combine the sets of rules extracted from each subnetwork into the final set of rules, we use a method whereby we are able to keep the soundness of the extraction algorithm. Finally, we present the results of an empirical analysis of the extraction system, using traditional examples and real-world application problems. The results have shown that a very high fidelity between the extracted set of rules and the network can be achieved.
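As a toy illustration of how a partial ordering on input vectors can prune a pedagogical extraction, the sketch below queries a black-box "network" over Boolean inputs in order of increasing ones and keeps only minimal positive vectors as rules. The stand-in network, the monotonicity assumption, and the rule format are simplifications, not the paper's algorithm.

```python
from itertools import product

# Pedagogical extraction over Boolean inputs: the bitwise partial
# order prunes the search, since for a monotone ("regular") network
# any vector above a known-positive one is positive too, so only
# minimal positive vectors need to be kept as rules.

def net(x):                       # stand-in for a trained network
    return (x[0] and x[2]) or x[1]

def leq(a, b):                    # partial order: a <= b bitwise
    return all(ai <= bi for ai, bi in zip(a, b))

minimal_rules = []
for x in sorted(product([0, 1], repeat=3), key=sum):
    if any(leq(r, x) for r in minimal_rules):
        continue                  # dominated: pruned without querying
    if net(x):
        minimal_rules.append(x)   # a minimal support -> one rule

for r in minimal_rules:           # e.g. (0,1,0) reads "x2 -> output"
    body = " AND ".join(f"x{i+1}" for i, v in enumerate(r) if v)
    print(f"{body} -> output")
```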

6.
A knowledge-refinement method based on neural network structure learning
Knowledge refinement is an indispensable step in knowledge acquisition. The main limitation of the existing KBANN (knowledge-based artificial neural network) approach to knowledge refinement is that the network topology cannot change during training. This paper proposes a knowledge-refinement method based on neural network structure learning: a rule set is first converted into an initial neural network, the initial network is then trained with sample data using a structure-learning algorithm, and the refined rules are finally extracted. Changes to the network topology are achieved during training by a structure-learning algorithm that dynamically adds hidden nodes and prunes the network. Extensive experiments show that the method is effective.
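The first step, converting rules into an initial network, can be sketched with the classic KBANN-style mapping: each rule becomes a unit whose antecedents get large positive weights (negated antecedents negative), with the bias set so the unit fires only when the whole rule body holds. The weight constant and example rule below are illustrative assumptions.

```python
import numpy as np

# KBANN-style mapping from propositional rules to initial weights:
# structure learning would then add hidden nodes and prune connections
# on top of this hand-coded starting point.

W = 4.0                                  # insertion strength (assumed)

def rule_to_unit(n_inputs, positive, negative):
    """Return (weights, bias) implementing AND over the literals."""
    w = np.zeros(n_inputs)
    w[list(positive)] = +W
    w[list(negative)] = -W
    bias = -W * (len(positive) - 0.5)    # fires iff all positives on
    return w, bias

def unit(x, w, b):
    return 1.0 / (1.0 + np.exp(-(w @ x + b)))

w, b = rule_to_unit(3, positive=[0, 1], negative=[2])  # a & b & ~c
print(unit(np.array([1, 1, 0]), w, b))   # close to 1: rule satisfied
print(unit(np.array([1, 1, 1]), w, b))   # close to 0: blocked by ~c
```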

7.
Concerns the effect of noise on the performance of feedforward neural nets. We introduce and analyze various methods of injecting synaptic noise into dynamically driven recurrent nets during training. Theoretical results show that applying a controlled amount of noise during training may improve convergence and generalization performance. We analyze the effects of various noise parameters and predict that best overall performance can be achieved by injecting additive noise at each time step. Noise contributes a second-order gradient term to the error function which can be viewed as an anticipatory agent to aid convergence. This term appears to find promising regions of weight space in the beginning stages of training when the training error is large and should improve convergence on error surfaces with local minima. The first-order term is a regularization term that can improve generalization. Specifically, it can encourage internal representations where the state nodes operate in the saturated regions of the sigmoid discriminant function. While this effect can improve performance on automata inference problems with binary inputs and target outputs, it is unclear what effect it will have on other types of problems. To substantiate these predictions, we present simulations on learning the dual parity grammar from temporal strings for all noise models, and present simulations on learning a randomly generated six-state grammar using the predicted best noise model.
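The predicted best model, additive synaptic noise injected at each time step, can be sketched as follows; the network, noise level, and annealing policy are assumptions for illustration, and only the noisy forward pass is shown.

```python
import numpy as np

# Per-time-step additive synaptic noise in a recurrent net: during
# training, every use of the weights at each step sees an
# independently perturbed copy; at test time the clean weights are
# used.  `sigma` would in practice be tuned or annealed.

rng = np.random.default_rng(3)
hidden, vocab, sigma = 8, 4, 0.05
W_in  = rng.normal(0, 0.3, (hidden, vocab))
W_rec = rng.normal(0, 0.3, (hidden, hidden))

def forward(seq, train=True):
    h = np.zeros(hidden)
    for sym in seq:
        x = np.zeros(vocab); x[sym] = 1.0
        Wr = W_rec + sigma * rng.normal(size=W_rec.shape) if train else W_rec
        Wi = W_in  + sigma * rng.normal(size=W_in.shape)  if train else W_in
        h = np.tanh(Wi @ x + Wr @ h)      # fresh noise every step
    return h

print(forward([0, 2, 1]).round(3))
```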

8.
Research on two structure-optimization algorithms for neural networks
A new method is proposed that combines a pruning algorithm based on weight quasi-entropy with weight sensitivity. In the pruning algorithm, the weight quasi-entropy is added to the objective function as a penalty term, so that a multilayer feedforward neural network automatically constrains its weight distribution during learning; weight sensitivity serves as the simplification criterion, avoiding the randomness of pruning based solely on weight magnitude. In addition, to address the heavy computation and low efficiency of pruning when optimizing networks with many inputs and outputs, a fast constructive algorithm is proposed that builds the network from an appropriate initial structure on the basis of the cascade-correlation (CC) algorithm. Simulation results show that the fast constructive algorithm is superior in convergence speed, runtime efficiency, and even generalization performance.
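A hedged sketch of the two pruning ingredients follows: an entropy-like penalty over normalized weight magnitudes added to the loss, and a per-weight sensitivity score used as the pruning criterion instead of magnitude alone. The exact definitions of quasi-entropy and sensitivity in the paper are not given here, so these formulas are assumptions.

```python
import numpy as np

# (1) quasi-entropy of the normalized weight magnitudes, added to the
# loss so training concentrates the weight distribution; (2) a
# finite-difference sensitivity per weight as the pruning criterion.

def weight_quasi_entropy(W, eps=1e-12):
    p = np.abs(W).ravel()
    p = p / (p.sum() + eps)                  # normalize magnitudes
    return -np.sum(p * np.log(p + eps))      # entropy-like penalty

def sensitivity(loss_fn, W, i, j, delta=1e-4):
    """Finite-difference estimate of d(loss)/d(W[i,j]), squared."""
    W[i, j] += delta; up = loss_fn(W)
    W[i, j] -= 2 * delta; dn = loss_fn(W)
    W[i, j] += delta
    return ((up - dn) / (2 * delta)) ** 2

rng = np.random.default_rng(4)
W = rng.normal(size=(3, 3))
loss = lambda W: np.sum(W @ np.ones(3)) ** 2          # toy loss
total = lambda W: loss(W) + 0.1 * weight_quasi_entropy(W)

S = np.array([[sensitivity(total, W, i, j) for j in range(3)]
              for i in range(3)])
mask = (np.abs(W) > 0.2) | (S > np.median(S))  # keep large or sensitive
print("pruned:", int((~mask).sum()), "of", W.size, "weights")
```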

9.
Pruning is one of the effective ways to avoid overfitting by simplifying the structure of a neural network. The weight quasi-entropy is added to the objective function as a penalty term so that a multilayer feedforward neural network automatically constrains its weight distribution during learning, and weight sensitivity serves as the simplification criterion, avoiding the randomness of pruning based solely on weight magnitude. Because only connections with small weights and low sensitivity are removed during pruning, the simplified network needs no retraining, which markedly improves the algorithm's efficiency. Simulation results show that the method is simple and practical and clearly improves the generalization ability of feedforward neural networks.

10.
A new technique for facial expression recognition is proposed, which uses the two-dimensional (2D) discrete cosine transform (DCT) over the entire face image as a feature detector and a constructive one-hidden-layer feedforward neural network as a facial expression classifier. An input-side pruning technique, proposed previously by the authors, is also incorporated into the constructive learning process to reduce the network size without sacrificing the performance of the resulting network. The proposed technique is applied to a database consisting of images of 60 men, each having five facial expression images (neutral, smile, anger, sadness, and surprise). Images of 40 men are used for network training, and the remaining images of 20 men are used for generalization and testing. Confusion matrices calculated in both network training and generalization for four facial expressions (smile, anger, sadness, and surprise) are used to evaluate the performance of the trained network. It is demonstrated that the best recognition rates are 100% and 93.75% (without rejection) for the training and generalizing images, respectively. Furthermore, the input-side weights of the constructed network are reduced by approximately 30% using our pruning method. In comparison with the fixed-structure backpropagation-based recognition methods in the literature, the proposed technique constructs a one-hidden-layer feedforward neural network with fewer hidden units and weights, while simultaneously providing improved generalization and recognition performance.
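The feature-detector step, a 2D DCT over the whole face image with only the low-frequency coefficients kept, can be sketched with SciPy; the image size and the number of retained coefficients are arbitrary placeholders.

```python
import numpy as np
from scipy.fftpack import dct

# 2D DCT over the whole image, keeping the low-frequency k x k block
# as the input vector for the classifier network.

def dct2(img):
    return dct(dct(img, axis=0, norm='ortho'), axis=1, norm='ortho')

def dct_features(img, k=8):
    coeffs = dct2(img.astype(float))
    return coeffs[:k, :k].ravel()            # low frequencies only

rng = np.random.default_rng(5)
face = rng.random((64, 64))                  # stand-in face image
features = dct_features(face)                # 64-dim feature vector
print(features.shape)
```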

11.
Recurrent neural networks are prime candidates for learning evolutions in multi-dimensional time series data. The performance of such a network is judged by the loss function, which is aggregated into a scalar value that decreases during training. Observing only this number hides the variation that occurs within the typically large training and testing data sets. Understanding these variations is of the highest importance for adjusting network hyper-parameters, such as the number of neurons or layers, or for adjusting the training set to include more representative examples. In this paper, we design a comprehensive and interactive system that allows users to study the output of recurrent neural networks on both the complete training data and testing data. We follow a coarse-to-fine strategy, providing overviews of annual, monthly and daily patterns in the time series, and directly support a comparison of different hyper-parameter settings. We applied our method to a recurrent convolutional neural network that was trained and tested on 25 years of climate data to forecast meteorological attributes, such as temperature, pressure and wind velocity. We further visualize the quality of the forecasting models when applied to various locations on the Earth, and we examine the combination of several forecasting models.

12.
Bo, Neurocomputing, 2008, 71(7-9): 1527-1537
The performance of a simple recurrent neural network on the implicit acquisition of a context-free grammar is re-examined and found to be significantly higher than previously reported by Elman. This result is obtained although the previous work employed a multilayer extension of the basic form of simple recurrent network and restricted the complexity of training and test corpora. The high performance is traced to a well-organized internal representation of the grammatical elements, as probed by a principal-component analysis of the hidden-layer activities. From the next-symbol-prediction performance on sentences not present in the training corpus, a capacity for generalization is demonstrated.
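The probing step, principal-component analysis of hidden-layer activities, can be sketched as below; random vectors stand in for the hidden states of a trained network, and the two-component projection is an illustrative choice.

```python
import numpy as np

# PCA probe of an RNN's internal representation: collect hidden states
# over many processed strings and project onto the leading principal
# components, where grammatical roles often separate.

rng = np.random.default_rng(6)
H = rng.normal(size=(500, 20))               # 500 hidden states, 20 units

Hc = H - H.mean(axis=0)                      # center the states
U, s, Vt = np.linalg.svd(Hc, full_matrices=False)
explained = s**2 / np.sum(s**2)
pcs = Hc @ Vt[:2].T                          # project onto first 2 PCs
print("variance explained by PC1, PC2:", explained[:2].round(3))
print("projected shape:", pcs.shape)
```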

13.
Improving generalization and reducing the risk of overfitting are central research topics for deep convolutional neural networks. Occlusion is one of the key factors affecting a CNN's generalization ability, and a model obtained through elaborate training is expected to generalize well to occluded images. To reduce the risk of overfitting and improve robustness in recognizing randomly occluded images, an activation-region processing algorithm is proposed: during training, the maximally activated feature map of a chosen convolutional layer is processed and used to occlude the input image, and the occluded image is then fed to the network as a new input to continue training. Experimental results show that the proposed algorithm improves the classification performance of several CNN models on different datasets, and that the trained models are highly robust in recognizing randomly occluded images.
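A hedged PyTorch sketch of the occlusion step follows: locate the peak of the most strongly activated feature map, map it back to the input, and zero out the corresponding patch. The layer, sizes, and patch size are assumptions, not the paper's configuration.

```python
import torch
import torch.nn.functional as F

# Find the most strongly activated feature map of a conv layer, locate
# its peak, and zero the corresponding patch of the input image before
# feeding it back for further training.

def occlude_by_max_activation(img, conv, patch=8):
    fmap = F.relu(conv(img))                     # (1, C, H', W')
    c = fmap.sum(dim=(2, 3)).argmax()            # strongest channel
    flat = fmap[0, c].argmax()
    h, w = divmod(flat.item(), fmap.shape[-1])   # peak location
    scale = img.shape[-1] // fmap.shape[-1]      # map back to input
    y, x = h * scale, w * scale
    out = img.clone()
    out[..., y:y + patch, x:x + patch] = 0.0     # occlude the patch
    return out

conv = torch.nn.Conv2d(3, 16, kernel_size=3, stride=2, padding=1)
img = torch.rand(1, 3, 32, 32)
print(occlude_by_max_activation(img, conv).shape)
```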

14.
A formal selection and pruning technique based on the concept of a local relative sensitivity index is proposed for feedforward neural networks. The mechanism of the backpropagation training algorithm is revisited and the theoretical foundation of the improved selection and pruning technique is presented. This technique is based on parallel pruning of weights that are relatively redundant within a subgroup of a feedforward neural network. Comparative studies with a similar technique proposed in the literature show that the improved technique provides better pruning results in terms of reduction of model residues, improvement of generalization capability and reduction of network complexity. The effectiveness of the improved technique is demonstrated by developing neural network models of a number of nonlinear systems, including the three-bit parity problem, the Van der Pol equation, a chemical process and two nonlinear discrete-time systems, using the backpropagation training algorithm with adaptive learning rate.
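As a hedged illustration of pruning by a local relative sensitivity index, the sketch below scores each weight relative to the total score of its subgroup (here, one unit's fan-in) and prunes all low-index weights in parallel. Using |w * grad| as the sensitivity and the threshold value are assumptions, not the paper's definitions.

```python
import numpy as np

# Per-subgroup relative sensitivity: score every weight against the
# total score of its fan-in subgroup and prune below a threshold,
# in parallel across all subgroups.

def local_relative_index(W, G):
    s = np.abs(W * G)                         # per-weight sensitivity
    return s / (s.sum(axis=1, keepdims=True) + 1e-12)

rng = np.random.default_rng(7)
W = rng.normal(size=(4, 10))                  # 4 units, fan-in of 10
G = rng.normal(size=W.shape)                  # gradients dL/dW
idx = local_relative_index(W, G)
mask = idx >= 0.05                            # threshold per subgroup
W_pruned = W * mask                           # parallel pruning
print("kept per unit:", mask.sum(axis=1))
```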

15.
Using additive noise in back-propagation training
The possibility of improving the generalization capability of a neural network by introducing additive noise to the training samples is discussed. The network considered is a feedforward layered neural network trained with the back-propagation algorithm. Back-propagation training is viewed as nonlinear least-squares regression and the additive noise is interpreted as generating a kernel estimate of the probability density that describes the training vector distribution. Two specific application types are considered: pattern classifier networks and estimation of a nonstochastic mapping from data corrupted by measurement errors. It is not proved that the introduction of additive noise to the training vectors always improves network generalization. However, the analysis suggests mathematically justified rules for choosing the characteristics of noise if additive noise is used in training. Results of mathematical statistics are used to establish various asymptotic consistency results for the proposed method. Numerical simulations support the applicability of the training method.
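The training procedure itself is simple to sketch: each epoch the network sees the training vectors plus fresh additive Gaussian noise, which acts like Parzen-window smoothing of the input density. The data, targets, and noise level below are placeholders, and the back-propagation pass is elided.

```python
import numpy as np

# Training-sample jitter: fresh additive Gaussian noise each epoch,
# with `sigma` playing the role of the kernel bandwidth in the
# density-estimation interpretation.

rng = np.random.default_rng(8)
X = rng.random((100, 5))                      # training vectors
y = (X.sum(axis=1) > 2.5).astype(float)       # toy targets
sigma = 0.05

for epoch in range(3):
    X_noisy = X + sigma * rng.normal(size=X.shape)  # fresh each epoch
    # ... one back-propagation pass on (X_noisy, y) would go here ...
    print(f"epoch {epoch}: mean perturbation "
          f"{np.abs(X_noisy - X).mean():.4f}")
```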

16.

Bayesian neural networks (BNNs) are a promising method of obtaining statistical uncertainties for neural network predictions, but with higher computational overhead, which can limit their practical usage. This work explores the use of high-performance computing with distributed training to address the challenges of training BNNs at scale. We present a performance and scalability comparison of training the VGG-16 and Resnet-18 models on a Cray-XC40 cluster. We demonstrate that network pruning can speed up inference without accuracy loss and provide an open-source software package, BPrune, to automate this pruning. For certain models we find that pruning up to 80% of the network results in only a 7.0% loss in accuracy. With the development of new hardware accelerators for deep learning, BNNs are of considerable interest for benchmarking performance. This analysis of training a BNN at scale outlines the limitations and benefits compared to a conventional neural network.


17.
In this paper, we report results obtained with a Madaline neural network trained to classify inductive signatures of two vehicle classes: trucks with one rear axle and trucks with a double rear axle. In order to train the Madaline, the inductive signatures were pre-processed and both classes, named C2 and C3, were subdivided into four subclasses; the initial classification task was thus split into four smaller tasks that are (theoretically) easier to perform. The heuristic adopted in training attempts to minimize the effects of the input space's non-linearity on classifier performance by decoupling the learning of the classes; to this end, we induce output Adalines to specialize in learning one of the classes. The reported percentages of correct classifications concern patterns that were not submitted to the neural network during training, and therefore indicate the network's generalization ability. The results are good and motivate continued research on the use of Madaline networks in vehicle classification tasks with non-linearly separable inductive signatures.

18.
We propose a new type of recurrent neural-network architecture, in which each output unit is connected to itself and is also fully connected to the other output units and all hidden units. The proposed recurrent neural network differs from Jordan's and Elman's recurrent neural networks in both function and architecture, because it was originally extended from a plain multilayer feedforward neural network in order to improve discrimination and generalization power. We also prove the convergence properties of the learning algorithm for the proposed recurrent neural network, and analyze its performance through recognition experiments with the totally unconstrained handwritten numeric database of Concordia University, Montreal, Canada. Experimental results confirm that the proposed recurrent neural network improves discrimination and generalization power in the recognition of visual patterns.
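The described recurrence can be sketched as a forward pass in which the output vector feeds back into itself at the next step while the hidden layer stays feedforward; the sizes, nonlinearities, and random weights below are illustrative assumptions.

```python
import numpy as np

# Each output unit feeds back to itself and to every other output
# unit, in addition to receiving the hidden layer:
#   o_t = f(W_ho h_t + W_oo o_{t-1}),  h_t = g(W_ih x_t).

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

n_in, n_hid, n_out = 6, 8, 3
rng = np.random.default_rng(9)
W_ih = rng.normal(0, 0.3, (n_hid, n_in))
W_ho = rng.normal(0, 0.3, (n_out, n_hid))
W_oo = rng.normal(0, 0.3, (n_out, n_out))   # output-to-output feedback

def forward(seq):
    o = np.zeros(n_out)
    for x in seq:
        h = np.tanh(W_ih @ x)               # feedforward hidden layer
        o = sigmoid(W_ho @ h + W_oo @ o)    # outputs see past outputs
    return o

print(forward([rng.random(n_in) for _ in range(4)]).round(3))
```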

19.
This paper investigates sentence-level Chinese text sentiment classification based on convolutional neural networks, with word vectors obtained by preprocessing the text as the model input. A traditional convolutional neural network is built by stacking linear convolutional layers, pooling layers, and fully connected layers; here, cross-channel convolutional layers are proposed as a replacement for the traditional linear convolution filters, improving the basic CNN and increasing its expressive power. Experiments show that the improved CNN, while maintaining training speed, achieves a recognition rate of 91.89%, outperforming the traditional CNN.
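A cross-channel convolutional layer can be sketched as a spatial convolution followed by 1x1 convolutions that mix information across channels (network-in-network style). The PyTorch module below is a hedged reading of that idea, with sentence length, embedding size, and filter shape chosen arbitrarily.

```python
import torch
import torch.nn as nn

# A cross-channel block as a drop-in replacement for a plain linear
# convolution filter in a text CNN: the 1x1 convolutions mix
# information across feature channels after the spatial convolution.

class CrossChannelConv(nn.Module):
    def __init__(self, in_ch, out_ch, kernel_size):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, kernel_size),  # spatial convolution
            nn.ReLU(),
            nn.Conv2d(out_ch, out_ch, 1),           # 1x1 cross-channel mix
            nn.ReLU(),
        )

    def forward(self, x):
        return self.block(x)

# a sentence as a word-vector "image": (batch, 1, seq_len, embed_dim)
x = torch.rand(2, 1, 30, 128)
layer = CrossChannelConv(1, 64, (3, 128))     # filter spans the embedding
print(layer(x).shape)                         # torch.Size([2, 64, 28, 1])
```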

20.
The application of neural networks in software engineering has greatly relieved the burden of traditional manual extraction of code features. Existing studies often either simplify code into natural language or rely on expert domain knowledge to extract code features; the natural-language simplification is too crude and prone to losing information, while models that introduce expert-crafted heuristic rules tend to be overly complex, with poor extensibility and generality. In view of these problems, this paper proposes a convolutional- and recurrent-neural-network-based automatic code ...
