20 similar documents found.
1.
Computational capabilities of recurrent NARX neural networks (total citations: 11; self-citations: 0; citations by others: 11)
Siegelmann H.T., Horne B.G., Giles C.L. 《IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics》1997,27(2):208-215
Recently, fully connected recurrent neural networks have been proven to be computationally rich, at least as powerful as Turing machines. This work focuses on another network which is popular in control applications and has been found to be very effective at learning a variety of problems. These networks are based upon Nonlinear AutoRegressive models with eXogenous Inputs (NARX models), and are therefore called NARX networks. As opposed to other recurrent networks, NARX networks have a limited feedback which comes only from the output neuron rather than from hidden states. They are formalized by y(t) = Psi(u(t-n_u), ..., u(t-1), u(t), y(t-n_y), ..., y(t-1)), where u(t) and y(t) represent the input and output of the network at time t, n_u and n_y are the input and output orders, and the function Psi is the mapping performed by a Multilayer Perceptron. We constructively prove that NARX networks with a finite number of parameters are computationally as strong as fully connected recurrent networks, and thus Turing machines. We conclude that, in theory, one can use NARX models rather than conventional recurrent networks without any computational loss, even though their feedback is limited. Furthermore, these results raise the issue of what amount of feedback or recurrence is necessary for any network to be Turing equivalent and what restrictions on feedback limit computational power.
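For illustration, a minimal sketch of the NARX forward recursion described above, assuming a single tanh hidden layer and one linear output neuron (the layer sizes, the random weights, and the helper name narx_step are placeholders, not taken from the paper):

```python
import numpy as np

def narx_step(u_hist, y_hist, W1, b1, W2, b2):
    """One step of y(t) = Psi(u(t-n_u), ..., u(t), y(t-n_y), ..., y(t-1)),
    with Psi a one-hidden-layer tanh MLP and a single linear output neuron."""
    x = np.concatenate([u_hist, y_hist])      # tapped delay lines feed the MLP
    h = np.tanh(W1 @ x + b1)                  # hidden layer
    return (W2 @ h + b2)[0]                   # output neuron (scalar)

rng = np.random.default_rng(0)
n_u, n_y, n_hidden = 3, 2, 8                  # input order, output order, hidden size
W1 = rng.normal(0, 0.5, (n_hidden, (n_u + 1) + n_y))
b1 = np.zeros(n_hidden)
W2 = rng.normal(0, 0.5, (1, n_hidden))
b2 = np.zeros(1)

u = rng.normal(size=100)                      # exogenous input sequence
y = np.zeros(100)
for t in range(max(n_u, n_y), 100):
    u_hist = u[t - n_u : t + 1]               # u(t-n_u), ..., u(t)
    y_hist = y[t - n_y : t]                   # y(t-n_y), ..., y(t-1)
    y[t] = narx_step(u_hist, y_hist, W1, b1, W2, b2)
print(y[-5:].round(3))
```

In practice the weights of Psi would be trained; the loop above only shows how the limited feedback from the output neuron, rather than from hidden states, enters the model.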
2.
Learning continuous trajectories in recurrent neural networks with time-dependent weights (total citations: 1; self-citations: 0; citations by others: 1)
The paper is concerned with general learning (with an arbitrary criterion and state-dependent constraints) of continuous trajectories by means of recurrent neural networks with time-varying weights. The learning process is transformed into an optimal control framework, where the weights to be found are treated as controls. A learning algorithm based on a variational formulation of Pontryagin's maximum principle is proposed. This algorithm is shown to converge, under reasonable conditions, to an optimal solution. Neural networks with time-dependent weights make it possible to efficiently find an admissible solution (i.e., initial weights satisfying the state constraints), which then serves as an initial guess for a proper minimization of the given criterion. The proposed methodology may be directly applicable both to classification of temporal sequences and to optimal tracking of nonlinear dynamic systems. Numerical examples are also given which demonstrate the efficiency of the presented approach.
3.
In this article, we propose a novel multivariate method for link prediction in evolving heterogeneous networks using a Nonlinear Autoregressive Neural Network with External Inputs (NARX). The proposed method combines (1) correlations between different link types; (2) the effects of different topological local and global similarity measures in different time periods; (3) nonlinear temporal evolution information; (4) the effects of the creation, preservation or removal of the links between the node pairs in consecutive time periods. We evaluate the performance of link prediction in terms of different AUC measures. Experiments on real networks demonstrate that the proposed multivariate method using NARX outperforms the previous temporal methods using univariate time series in different test cases.
4.
This article studies the computational power of various discontinuous real computational models that are based on the classical analog recurrent neural network (ARNN). This ARNN consists of a finite number of neurons; each neuron computes a polynomial net function and a sigmoid-like continuous activation function. We introduce arithmetic networks as ARNNs augmented with a few simple discontinuous (e.g., threshold or zero-test) neurons. We argue that even with weights restricted to polynomial-time computable reals, arithmetic networks are able to compute arbitrarily complex recursive functions. We identify many types of neural networks that are at least as powerful as arithmetic nets, some of which are not in fact discontinuous but instead incorporate other arithmetic operations in the net function (e.g., neurons that can use divisions and polynomial net functions inside sigmoid-like continuous activation functions). These arithmetic networks are equivalent to the Blum-Shub-Smale model when the latter is restricted to a bounded number of registers. With respect to implementation on digital computers, we show that arithmetic networks with rational weights can be simulated with exponential precision, but even with polynomial-time computable real weights, arithmetic networks are not subject to any fixed precision bounds. This is in contrast with the ARNN, which is known to demand precision that is linear in the computation time. When nontrivial periodic functions (e.g., fractional part, sine, tangent) are added to arithmetic networks, the resulting networks are computationally equivalent to a massively parallel machine. Thus, these highly discontinuous networks can solve the presumably intractable class of PSPACE-complete problems in polynomial time.
5.
Learning long-term dependencies with gradient descent is difficult (total citations: 12; self-citations: 0; citations by others: 12)
Recurrent neural networks can be used to map input sequences to output sequences, such as for recognition, production or prediction problems. However, practical difficulties have been reported in training recurrent neural networks to perform tasks in which the temporal contingencies present in the input/output sequences span long intervals. We show why gradient-based learning algorithms face an increasingly difficult problem as the duration of the dependencies to be captured increases. These results expose a trade-off between efficient learning by gradient descent and latching onto information for long periods. Based on an understanding of this problem, alternatives to standard gradient descent are considered.
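A small numerical illustration of the effect analyzed in the paper, assuming a plain tanh recurrent network with randomly chosen weights (all sizes and scales are arbitrary demo choices): the norm of the product of step-to-step Jacobians, which scales the gradient contribution of early inputs, typically shrinks rapidly as the lag grows.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20
W = rng.normal(0, 1.0 / np.sqrt(n), (n, n))    # recurrent weights with modest spectral radius
xs = rng.normal(size=(60, n))                  # a random input sequence
h = np.zeros(n)
jac_prod = np.eye(n)                           # accumulates d h(T) / d h(0)

for t, x in enumerate(xs):
    pre = W @ h + x
    h = np.tanh(pre)
    J = np.diag(1.0 - h ** 2) @ W              # Jacobian of h(t+1) w.r.t. h(t)
    jac_prod = J @ jac_prod
    if (t + 1) % 10 == 0:
        print(f"lag {t + 1:3d}: ||d h(T)/d h(0)||_2 = {np.linalg.norm(jac_prod, 2):.2e}")
```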
6.
Recurrent neural networks are the principal realization of sequence models in neural networks and have developed rapidly in recent years. They are the standard approach to machine translation, question answering, and sequential video analysis, and the mainstream modeling tool for problems such as automatic handwriting synthesis, speech processing, and image generation. In view of this, the branches of recurrent neural networks are classified in detail according to network structure into three broad categories. The first is derived recurrent neural networks, which are structural variants of the basic RNN model, i.e., modifications of the internal structure of RNNs. The second is combined recurrent neural networks, which combine other classical network models or structures with the derived recurrent networks of the first category to obtain better model performance; this is a very effective approach. The third is hybrid recurrent neural networks, which both combine different network models and modify the internal RNN structure, and thus belong to both of the preceding categories. To provide a deeper understanding of recurrent neural networks, recursive neural networks, which are often confused with recurrent neural networks, are also introduced, together with the differences and connections between recursive and recurrent neural networks. After describing the application background, network structure, and variants of the above models in varying detail, the characteristics of each model are summarized and compared, and an outlook on recurrent neural network models is given.
7.
Research on the structures of recurrent neural networks (total citations: 8; self-citations: 0; citations by others: 8)
From the perspective of nonlinear dynamical systems, this paper gives a detailed survey of recurrent dynamic network structures and their functions. Recurrent dynamic networks are divided into three broad categories: global-feedback recurrent networks, feedforward recurrent networks, and hybrid networks. Each category can be further divided into several kinds of networks. For each kind, a structural diagram describing the network's characteristics is given; in addition, the functions of the various networks are compared and their similarities and differences are analyzed.
8.
An iterative pruning method for second-order recurrent neural networks is presented. Each step consists in eliminating a unit and adjusting the remaining weights so that the network performance does not worsen over the training set. The pruning process involves solving a linear system of equations in the least-squares sense. The algorithm also provides a criterion for choosing the units to be removed, which works well in practice. Initial experimental results demonstrate the effectiveness of the proposed approach over high-order architectures.
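A rough sketch of the least-squares adjustment idea, assuming for simplicity a first-order fully connected recurrent layer rather than the second-order architecture treated in the paper (the function prune_unit and the random stand-in data are hypothetical):

```python
import numpy as np

def prune_unit(W, H, k):
    """Remove unit k from a recurrent weight matrix W (units x units) and
    adjust the remaining weights so that the pre-activations of the surviving
    units over the recorded hidden states H (time x units) change as little
    as possible in the least-squares sense."""
    keep = [i for i in range(W.shape[0]) if i != k]
    H_keep = H[:, keep]                        # activations of surviving units
    target = (H @ W.T)[:, keep]                # original pre-activations of survivors
    # Solve H_keep @ W_new.T ~= target for the reduced weight matrix W_new.
    W_new_T, *_ = np.linalg.lstsq(H_keep, target, rcond=None)
    return W_new_T.T, keep

# Tiny demo with random data standing in for recorded training-set activations.
rng = np.random.default_rng(1)
W = rng.normal(size=(6, 6))
H = np.tanh(rng.normal(size=(200, 6)))
W_pruned, kept = prune_unit(W, H, k=2)
print(W_pruned.shape)                          # (5, 5)
```

Re-fitting the surviving incoming weights against recorded activations keeps the remaining units' pre-activations approximately unchanged over the training set, which is the spirit of the method; the paper's criterion for selecting which unit to remove is not shown here.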
9.
A massively recurrent neural network responds, on one side, to input stimuli and is, on the other side, autonomously active in the absence of sensory inputs. Stimuli and information processing depend crucially on the quality of the autonomous-state dynamics of the ongoing neural activity. This default neural activity may be dynamically structured in time and space, showing regular, synchronized, bursting, or chaotic activity patterns. We study the influence of nonsynaptic plasticity on the default dynamical state of recurrent neural networks. The nonsynaptic adaptation considered acts on intrinsic neural parameters, such as the threshold and the gain, and is driven by the optimization of the information entropy. In the presence of the intrinsic adaptation processes, we observe three distinct and globally attracting dynamical regimes: a regular synchronized regime, an overall chaotic regime, and an intermittent bursting regime. The intermittent bursting regime is characterized by intervals of regular flow, which are quite insensitive to external stimuli, interspersed with chaotic bursts that respond sensitively to input signals. We discuss these findings in the context of self-organized information processing and critical brain dynamics.
10.
11.
This paper describes the application of artificial neural networks to acoustic-to-phonetic mapping. The experiments described are typical of problems in speech recognition in which the temporal nature of the input sequence is critical. The specific task considered is that of mapping formant contours to the corresponding CVC' syllable. We performed experiments on formant data extracted from the acoustic speech signal spoken at two different tempos (slow and normal), using networks based on the Elman simple recurrent network model. Our results show that the Elman networks used in these experiments were successful in performing the acoustic-to-phonetic mapping from formant contours. Consequently, we demonstrate that relatively simple networks, readily trained using standard backpropagation techniques, are capable of initial and final consonant discrimination and vowel identification for variable speech rates.
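For context, a minimal sketch of an Elman-style simple recurrent network applied to a frame-by-frame feature sequence, assuming three formant-like input features and a softmax over candidate syllables (the sizes, random weights, and class count are placeholders, and training by backpropagation is omitted):

```python
import numpy as np

def elman_forward(X, Wxh, Whh, Why, bh, by):
    """Minimal Elman simple recurrent network: the hidden ("context") layer feeds
    back into itself at the next time step; the final hidden state is mapped to
    class scores (e.g., one score per candidate syllable)."""
    h = np.zeros(Whh.shape[0])
    for x in X:                                   # X: (time, features), e.g. formants per frame
        h = np.tanh(Wxh @ x + Whh @ h + bh)       # context units = previous hidden state
    logits = Why @ h + by
    return np.exp(logits) / np.exp(logits).sum()  # softmax over syllable classes

rng = np.random.default_rng(0)
n_in, n_hid, n_out = 3, 16, 10                    # e.g. 3 formants, 10 candidate syllables
params = (rng.normal(0, 0.3, (n_hid, n_in)), rng.normal(0, 0.3, (n_hid, n_hid)),
          rng.normal(0, 0.3, (n_out, n_hid)), np.zeros(n_hid), np.zeros(n_out))
probs = elman_forward(rng.normal(size=(50, n_in)), *params)   # 50 frames of formant data
print(probs.round(3))
```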
12.
Rule revision with recurrent neural networks (total citations: 2; self-citations: 0; citations by others: 2)
Recurrent neural networks readily process, recognize and generate temporal sequences. By encoding grammatical strings as temporal sequences, recurrent neural networks can be trained to behave like deterministic sequential finite-state automata. Algorithms have been developed for extracting grammatical rules from trained networks. Using a simple method for inserting prior knowledge (or rules) into recurrent neural networks, we show that recurrent neural networks are able to perform rule revision. Rule revision is performed by comparing the inserted rules with the rules in the finite-state automata extracted from trained networks. The results from training a recurrent neural network to recognize a known, non-trivial, randomly generated regular grammar show that the networks not only preserve correct rules but are also able to correct, through training, inserted rules that were initially incorrect (i.e., rules that were not part of the randomly generated grammar).
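As background, a small sketch of one way prior rules can be inserted into a second-order recurrent network before training: program each DFA transition into the weights with a large positive value and penalize all competing transitions (the weight magnitude H, the toy automaton, and the helper names are illustrative; the paper's insertion and extraction procedures carry more detail):

```python
import numpy as np

def insert_dfa(delta, n_states, n_symbols, H=6.0):
    """Encode DFA transitions into second-order RNN weights W[i, j, k]:
    a large positive weight where delta(state j, symbol k) = state i,
    large negative weights elsewhere, plus a negative bias."""
    W = -H * np.ones((n_states, n_states, n_symbols))
    for (j, k), i in delta.items():
        W[i, j, k] = H
    b = -H / 2 * np.ones(n_states)
    return W, b

def run(W, b, state0, symbols):
    sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
    S = np.eye(W.shape[0])[state0]                  # one-hot start state
    for k in symbols:
        I = np.eye(W.shape[2])[k]                   # one-hot input symbol
        S = sigmoid(np.einsum("ijk,j,k->i", W, S, I) + b)
    return S

# Toy 2-state DFA over {0, 1}: symbol 1 toggles the state, symbol 0 keeps it.
delta = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}
W, b = insert_dfa(delta, n_states=2, n_symbols=2)
print(run(W, b, state0=0, symbols=[1, 1, 0, 1]).round(3))   # approximately one-hot for state 1
```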
13.
Rohitash Chandra, Marcus Frean, Mengjie Zhang, Christian W. Omlin 《Neurocomputing》2011,74(17):3223-3234
Cooperative coevolution employs evolutionary algorithms to solve a high-dimensional search problem by decomposing it into low-dimensional subcomponents. Efficient problem decomposition methods or encoding schemes group interacting variables into separate subcomponents in order to solve them separately where possible. It is important to determine which encoding schemes group subcomponents efficiently, and to characterize the neural network training problem in terms of its degree of non-separability. This paper introduces a novel encoding scheme in cooperative coevolution for training recurrent neural networks. The method is tested on grammatical inference problems. The results show that the proposed encoding scheme achieves better performance when compared to a previous encoding scheme.
14.
Quanjun Wu, Jin Zhou, Lan Xiang 《Neurocomputing》2011,74(17):3204-3211
The present paper formulates and studies a model of recurrent neural networks with time-varying delays in the presence of impulsive connectivity among the neurons. This model can well describe the practical architectures of more realistic neural networks. Some novel yet generic criteria for global exponential stability of such neural networks are derived by establishing an extended Halanay differential inequality for impulsive delayed dynamical systems. The distinctive feature of this work is that it addresses exponential stability without any a priori stability assumption on the corresponding delayed neural network without impulses. It is shown that impulses in neuronal connectivity play an important role in inducing global exponential stability of recurrent delayed neural networks, even if the network without impulses may be unstable or chaotic itself. Furthermore, an example and simulations are given to illustrate the practical nature of the novel results.
15.
This paper studies the continuous attractors of discrete-time recurrent neural networks. Networks in discrete time can directly provide algorithms for efficient implementation in digital hardware. Continuous attractors of neural networks have been used to store and manipulate continuous stimuli in animals. A continuous attractor is defined as a connected set of stable equilibrium points; it forms a lower-dimensional manifold in the original state space. Under some conditions, complete analytical expressions for the continuous attractors of discrete-time linear recurrent neural networks, as well as discrete-time linear-threshold recurrent neural networks, are derived. Examples are employed to illustrate the theory.
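A concrete toy instance of the linear case: in a discrete-time linear network x(t+1) = W x(t) whose weight matrix has a unit eigenvalue, the corresponding eigenspace is a connected set of equilibria that attracts nearby states, i.e., a simple line attractor (the particular W below is an illustrative choice, not an example from the paper):

```python
import numpy as np

# 2-unit discrete-time linear network x(t+1) = W @ x(t) with eigenvalues 1 and 0.5:
# every point on the eigenspace for eigenvalue 1 is an equilibrium, and that whole
# line is attracting -- a simple continuous (line) attractor.
W = np.array([[0.75, 0.25],
              [0.25, 0.75]])          # eigenvalue 1 along [1, 1], eigenvalue 0.5 along [1, -1]

rng = np.random.default_rng(0)
for _ in range(3):
    x = rng.normal(size=2) * 5        # random initial state
    for _ in range(60):               # iterate the network to convergence
        x = W @ x
    print(x.round(4))                 # always lands on the line x1 == x2
```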
16.
Stability analysis of discrete-time recurrent neural networks (total citations: 10; self-citations: 0; citations by others: 10)
We address the problem of global Lyapunov stability of discrete-time recurrent neural networks (RNNs) in the unforced (unperturbed) setting. It is assumed that the network weights are fixed to some values, for example, those attained after training. Based on classical results from the theory of absolute stability, we propose a new approach for the stability analysis of RNNs with sector-type monotone nonlinearities and nonzero biases. We devise a simple state-space transformation to convert the original RNN equations into a form suitable for our stability analysis. We then present appropriate linear matrix inequalities (LMIs) to be solved in order to determine whether the system under study is globally exponentially stable. Unlike previous treatments, our approach readily permits one to account for the nonzero biases usually present in RNNs for improved approximation capabilities. We show how recent results of others on the stability analysis of RNNs can be interpreted as special cases within our approach. We illustrate how to use our approach with examples. Though illustrated on the stability analysis of recurrent multilayer perceptrons, the proposed approach can also be applied to other forms of time-lagged RNNs.
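As a much cruder, classical counterpart to the LMI conditions developed in the paper, the sketch below certifies exponential stability of the linearized system x(t+1) = W x(t) by solving a discrete Lyapunov equation; it ignores the sector nonlinearities and biases that the paper's LMIs handle (the weight matrix is an arbitrary stable example):

```python
import numpy as np
from scipy.linalg import solve_discrete_lyapunov

# For x(t+1) = W x(t), exponential stability holds iff the spectral radius of W is
# below 1, which can be certified by a Lyapunov matrix P solving W^T P W - P = -Q.
W = np.array([[ 0.4, -0.3,  0.1],
              [ 0.2,  0.5, -0.2],
              [-0.1,  0.3,  0.4]])
Q = np.eye(3)
print("spectral radius:", max(abs(np.linalg.eigvals(W))))
P = solve_discrete_lyapunov(W.T, Q)                    # solves W^T P W - P = -Q
print("P positive definite:", bool(np.all(np.linalg.eigvalsh(P) > 0)))
print("residual:", np.linalg.norm(W.T @ P @ W - P + Q))
```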
17.
LiMin Fu 《IEEE Transactions on Neural Networks》1998,9(1):151-158
The computational framework of rule-based neural networks inherits from both the neural network and the inference engine of an expert system. In one approach, the network activation function is based on the certainty factor (CF) model of MYCIN-like systems. In this paper, it is shown theoretically that a neural network using the CF-based activation function requires relatively small sample sizes for correct generalization. This result is also confirmed by empirical studies in several independent domains.
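For intuition, a sketch of what a CF-based activation can look like, assuming the weighted inputs are clipped to [-1, 1] and folded together with the MYCIN certainty-factor combination rule instead of being summed (the exact activation and clipping used in the paper may differ):

```python
import numpy as np

def cf_combine(a, b):
    """MYCIN-style combination of two certainty factors in [-1, 1]."""
    if a >= 0 and b >= 0:
        return a + b - a * b
    if a < 0 and b < 0:
        return a + b + a * b
    return (a + b) / (1.0 - min(abs(a), abs(b)))

def cf_activation(inputs, weights):
    """CF-based neuron: treat each weighted input as a certainty factor and fold
    the values together with the MYCIN rule instead of summing them."""
    cf = 0.0
    for x, w in zip(inputs, weights):
        cf = cf_combine(cf, float(np.clip(x * w, -1.0, 1.0)))
    return cf

print(cf_activation([0.8, 0.6, -0.4], [0.9, 0.5, 0.7]))   # a single neuron's output
```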
18.
This paper studies controllability for the class of control systems commonly called (continuous-time) recurrent neural networks. It is shown that, under a generic condition on the input matrix, the system is controllable, for every possible state matrix. The result holds when the activation function is the hyperbolic tangent.
19.
Using a worked example, the complete design procedure for a recurrent neural network is presented, including the selection of the network structure, the choice of the learning algorithm, and the training of the network parameters. Particular attention is paid to choosing the initial value of the learning rate and the order in which it is adjusted. The design method given is applicable to many kinds of recurrent neural networks.
20.
Markovian architectural bias of recurrent neural networks (total citations: 5; self-citations: 0; citations by others: 5)
In this paper, we elaborate upon the claim that clustering in the recurrent layer of recurrent neural networks (RNNs) reflects meaningful information-processing states even prior to training. By concentrating on activation clusters in RNNs, while not throwing away the continuous state-space network dynamics, we extract predictive models that we call neural prediction machines (NPMs). When RNNs with sigmoid activation functions are initialized with small weights (a common technique in the RNN community), the clusters of recurrent activations emerging prior to training are indeed meaningful and correspond to Markov prediction contexts. In this case, the extracted NPMs correspond to a class of Markov models called variable memory length Markov models (VLMMs). In order to appreciate how much information has really been induced during training, the RNN performance should always be compared with that of VLMMs and NPMs extracted before training as the "null" base models. Our arguments are supported by experiments on a chaotic symbolic sequence and a context-free language with a deep recursive structure.
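A rough sketch of extracting a neural prediction machine from an untrained, small-weight RNN, assuming a toy first-order Markov source over two symbols and k-means clustering of the recurrent activations (all sizes and the source are illustrative; the paper's NPM construction and the VLMM comparison are more involved):

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
symbols = [0]
for _ in range(2000):                          # simple first-order Markov source over {0, 1}
    symbols.append(symbols[-1] if rng.random() < 0.8 else 1 - symbols[-1])
symbols = np.array(symbols)

# Untrained RNN with small weights: its recurrent activations mostly reflect the
# recent input history (the Markovian architectural bias).
n_hid = 10
Win = rng.normal(0, 0.2, (n_hid, 2))
Wrec = rng.normal(0, 0.2, (n_hid, n_hid))
h = np.zeros(n_hid)
states = []
for s in symbols[:-1]:
    h = np.tanh(Win @ np.eye(2)[s] + Wrec @ h)
    states.append(h.copy())
states = np.array(states)

# Neural prediction machine: cluster the activations, then estimate next-symbol
# probabilities separately inside each cluster (prediction context).
k = 8
labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(states)
counts = np.zeros((k, 2))
for ctx, nxt in zip(labels, symbols[1:]):
    counts[ctx, nxt] += 1
print((counts / counts.sum(axis=1, keepdims=True)).round(3))   # per-context next-symbol distribution
```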