Similar Documents
20 similar documents found.
1.
Fast training of multilayer perceptrons (cited 5 times: 0 self-citations, 5 by others)
Training a multilayer perceptron with the error backpropagation algorithm is slow and unreliable. This paper describes a new approach that is much faster and more dependable than error backpropagation. The proposed approach combines iterative and direct solution methods: an inverse transformation linearizes the nonlinear output activation functions, direct matrix solution methods train the weights of the output layer, and gradient descent, the delta rule, and other proposed techniques train the weights of the hidden layers. The approach has been implemented and tested on many problems. Experimental results, including training times and recognition accuracy, are given. In general, the approach achieves accuracy as good as or better than that of perceptrons trained with error backpropagation, and the training process is much faster while also avoiding local minima and network paralysis.
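One way to read the direct-solution step is sketched below: invert the output sigmoid on the targets so the output layer becomes a linear least-squares problem that can be solved in one shot instead of by gradient descent. This is a minimal illustration under assumed toy data and sizes, not the paper's exact procedure (which also trains the hidden layer with the delta rule and related techniques).

```python
import numpy as np

# Minimal sketch: linearize the output sigmoid via its inverse (the logit) and
# solve the output-layer weights directly by least squares. Toy data, the
# hidden-layer size, and the clipping threshold are illustrative assumptions.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))                             # toy inputs
T = (X.sum(axis=1, keepdims=True) > 0).astype(float)      # toy binary targets

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

hidden_size = 8
W1 = rng.normal(scale=0.5, size=(4, hidden_size))         # hidden weights (trained separately in the paper)
H = sigmoid(X @ W1)                                       # hidden activations
H = np.hstack([H, np.ones((len(H), 1))])                  # bias column

# Inverse transformation: map targets through the inverse sigmoid, clipping so
# the logit stays finite.
T_clipped = np.clip(T, 0.05, 0.95)
Z = np.log(T_clipped / (1.0 - T_clipped))                 # linearized targets

# Direct solution for the output-layer weights (ordinary least squares).
W2, *_ = np.linalg.lstsq(H, Z, rcond=None)

Y = sigmoid(H @ W2)
print("training accuracy:", np.mean((Y > 0.5) == (T > 0.5)))
```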

2.
The attractive possibility of applying layerwise block training algorithms to multilayer perceptrons (MLPs), which offers initial advantages in computational effort, is refined in this article by introducing a sensitivity correction factor into the formulation. This results in a clear performance advantage, which we verify in several applications. The reasons for this advantage are discussed and related to implicit connections with second-order techniques, natural gradient formulations through Fisher's information matrix, and sample selection. Extensions to recurrent networks and other research lines are suggested at the end of the article.

3.
The use of multilayer perceptrons (MLPs) with threshold functions (binary step activations) greatly reduces the complexity of the hardware implementation of neural networks, provides tolerance to noise, and improves the interpretability of the internal representations. In certain cases, such as learning stationary tasks, it may be sufficient to find appropriate weights for an MLP with threshold activation functions by software simulation and then transfer the weight values to the hardware implementation. Efficient training of these networks is the subject of considerable ongoing research. Methods available in the literature mainly focus on two-state (threshold) nodes and try to train the networks either by approximating the gradient of the error function and modifying gradient descent accordingly, or by progressively altering the shape of the activation functions. In this paper, we propose an evolution-motivated approach that is eminently suitable for networks with threshold functions, and we compare its performance with that of four other methods. The proposed evolutionary strategy needs no gradient-related information, is applicable when threshold activations are used from the beginning of training, as in "on-chip" training, and can train networks with integer weights.
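As a rough illustration of a gradient-free strategy of this kind, the sketch below trains a tiny threshold-activation network with integer weights on XOR using a (1+1) evolution strategy; the network size, mutation scheme, and task are assumptions for illustration, not the authors' algorithm.

```python
import numpy as np

# Hedged sketch: (1+1) evolution strategy for an MLP with binary step activations
# and integer weights. No gradient information is used at any point.
rng = np.random.default_rng(1)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
T = np.array([0, 1, 1, 0], dtype=float)                   # XOR targets

def step(a):
    return (a > 0).astype(float)                          # threshold activation

def forward(weights, X):
    W1, b1, W2, b2 = weights
    h = step(X @ W1 + b1)
    return step(h @ W2 + b2).ravel()

def errors(weights):
    return int(np.sum(forward(weights, X) != T))

def random_weights():
    # Integer weights in a small range, stored as floats for convenience.
    return [rng.integers(-3, 4, size=(2, 3)).astype(float),
            rng.integers(-3, 4, size=3).astype(float),
            rng.integers(-3, 4, size=(3, 1)).astype(float),
            rng.integers(-3, 4, size=1).astype(float)]

parent = random_weights()
for generation in range(2000):
    if errors(parent) == 0:
        break
    # Mutation: integer perturbations of every weight; keep the child if it is no worse.
    child = [w + rng.integers(-1, 2, size=w.shape) for w in parent]
    if errors(child) <= errors(parent):
        parent = child

print("misclassifications:", errors(parent))
```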

4.
Fast parallel off-line training of multilayer perceptrons (cited 2 times: 0 self-citations, 2 by others)
Various approaches to the parallel implementation of second-order, gradient-based multilayer perceptron training algorithms are described. Two main classes of algorithm are defined, involving Hessian-based and conjugate-gradient-based methods. The limited- and full-memory Broyden-Fletcher-Goldfarb-Shanno (BFGS) algorithms are selected as representative examples and used to show that the step-size and gradient calculations are critical components. For larger problems, the matrix calculations in the full-memory algorithm are also significant. Various parallelization strategies are considered, the best of which is implemented on parallel virtual machine (PVM) and transputer-based architectures. Results from a range of problems demonstrate the performance achievable with each architecture. The transputer implementation gives excellent speed-ups, but the problem size is limited by memory constraints. The speed-ups achievable with the PVM implementation are much poorer because of inefficient communication, although memory is not a difficulty.

5.
In this paper a new multilayer perceptron (MLP) structure is introduced to simulate nonlinear transformations on infinite-dimensional function spaces. This extension is achieved by replacing discrete neurons with a continuum of neurons, summations with integrations, and weight matrices with kernels of integral transforms. Variational techniques are employed for the analysis and training of the infinite-dimensional MLP (IDMLP). The training problem of the IDMLP is solved by the Lagrange multiplier technique, yielding coupled state and adjoint-state integro-difference equations. A steepest-descent-like algorithm is used to construct the required kernel and threshold functions. Finally, some results are presented to show the performance of the new IDMLP.
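The continuum layer can be pictured by discretizing the integral on a grid, as in the small sketch below: the weight matrix becomes a kernel k(s, t), the sum becomes a quadrature, and the threshold becomes a function b(s). The Gaussian kernel, grid, and input function are illustrative assumptions, not the paper's construction.

```python
import numpy as np

# Discretized view of one infinite-dimensional layer:
#   y(s) = sigma( integral k(s, t) x(t) dt + b(s) )
# approximated on a uniform grid with a rectangle-rule quadrature.
t = np.linspace(0, 1, 101)            # grid standing in for the "input neurons"
s = np.linspace(0, 1, 101)            # grid standing in for the "output neurons"
dt = t[1] - t[0]

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

k = np.exp(-((s[:, None] - t[None, :]) ** 2) / 0.02)   # kernel k(s, t) (assumed Gaussian)
b = np.zeros_like(s)                                    # threshold function b(s)
x = np.sin(2 * np.pi * t)                               # input function x(t)

y = sigmoid(k @ x * dt + b)                             # one integral-transform layer
print(y[:5])
```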

6.
The relationship between backpropagation and extended Kalman filtering for training multilayer perceptrons is examined. These two techniques are compared theoretically and empirically using sensor imagery. Backpropagation is a technique from neural networks for assigning weights in a multilayer perceptron; an extended Kalman filter can also be used for this purpose. A brief review of the multilayer perceptron and these two training methods is provided. It is then shown that backpropagation is a degenerate form of the extended Kalman filter. The training rules are compared in two examples: an image classification problem using laser radar Doppler imagery and a target detection problem using absolute range images. In both examples, the backpropagation training algorithm is shown to be three orders of magnitude less costly than the extended Kalman filter algorithm in terms of the number of floating-point operations.
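For concreteness, the sketch below applies extended-Kalman-filter updates to the weights of a tiny one-hidden-layer network with a scalar output; the weight vector is the filter state, and each training pair supplies one measurement. The noise covariances, the numerical Jacobian, and the toy regression task are all assumptions for illustration.

```python
import numpy as np

# Hedged sketch of EKF weight training for a scalar-output MLP (biases omitted).
rng = np.random.default_rng(2)

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

n_in, n_hid = 2, 3
n_w = n_in * n_hid + n_hid                       # hidden weights + output weights

def net(w, x):
    W1 = w[:n_in * n_hid].reshape(n_in, n_hid)
    w2 = w[n_in * n_hid:]
    return sigmoid(x @ W1) @ w2

def jacobian(w, x, eps=1e-6):
    # Numerical Jacobian of the scalar output with respect to the weight vector.
    y0 = net(w, x)
    g = np.zeros_like(w)
    for i in range(len(w)):
        wp = w.copy(); wp[i] += eps
        g[i] = (net(wp, x) - y0) / eps
    return g

w = rng.normal(scale=0.3, size=n_w)
P = np.eye(n_w) * 10.0                           # initial weight covariance (assumed)
R = 0.1                                          # measurement noise variance (assumed)

for _ in range(200):                             # online EKF training
    x = rng.normal(size=n_in)
    t = np.sin(x[0]) + 0.5 * x[1]                # toy target function
    y = net(w, x)
    H = jacobian(w, x)                           # local linearization of the network
    S = H @ P @ H + R                            # innovation variance (scalar)
    K = (P @ H) / S                              # Kalman gain
    w = w + K * (t - y)                          # state (weight) update
    P = P - np.outer(K, H @ P)                   # covariance update

x_probe = np.array([0.5, -0.2])
print("probe error:", net(w, x_probe) - (np.sin(0.5) - 0.1))
```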

7.
This work concerns the selection of input-output pairs for improved training of multilayer perceptrons, in the context of approximating univariate real functions. A criterion for choosing the number of neurons in the hidden layer is also provided. The main idea is based on the fact that Chebyshev polynomials can approximate bounded functions to within a prescribed tolerance, and that, in turn, a polynomial of a given order can be fitted by a three-layer perceptron with a prescribed number of hidden neurons. The results are applied to a sensor identification example.
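The Chebyshev side of the argument is easy to sketch: raise the polynomial degree until the approximation error falls below the prescribed tolerance, and use the resulting nodes and degree to suggest the training inputs and the hidden-layer size. The tolerance, interval, and example function below are illustrative assumptions.

```python
import numpy as np

# Hedged sketch: find the Chebyshev degree (and nodes) needed to hit a tolerance.
f = np.cos               # univariate function to approximate on [-1, 1] (assumed)
tol = 1e-3               # prescribed tolerance (assumed)

degree = 1
while True:
    k = np.arange(degree + 1)
    nodes = np.cos((2 * k + 1) * np.pi / (2 * (degree + 1)))   # Chebyshev nodes
    coeffs = np.polynomial.chebyshev.chebfit(nodes, f(nodes), degree)
    probe = np.linspace(-1, 1, 400)
    err = np.max(np.abs(np.polynomial.chebyshev.chebval(probe, coeffs) - f(probe)))
    if err < tol:
        break
    degree += 1

print("degree needed:", degree)
print("suggested training inputs (Chebyshev nodes):", np.round(nodes, 3))
# In the paper's spirit, the hidden-layer size would then be chosen so that the
# three-layer perceptron can represent a polynomial of this order.
```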

8.
Changing the resolution of digital images and video is needed in many image processing systems. In this paper, we present nonlinear interpolation schemes for still-image resolution enhancement. The proposed neural network interpolation method is based on wavelet reconstruction. With the wavelet decomposition, the image signal can be divided into several time–frequency components, and this decomposed signal is used to train the neural networks. The pixels of the low-resolution image are used as the input of the neural network to estimate all the wavelet sub-images of the corresponding high-resolution image, and the image of increased resolution is finally produced by the wavelet synthesis procedure. In simulations, the proposed method performs much better than traditional methods. Moreover, the easy implementation and high flexibility of the proposed algorithm also make it applicable to various related problems.

9.
Approximation by fully complex multilayer perceptrons (cited 6 times: 0 self-citations, 6 by others)
Kim T, Adali T. Neural Computation, 2003, 15(7): 1641-1666
We investigate the approximation ability of a multilayer perceptron (MLP) network extended to the complex domain. The main challenge in processing complex data with neural networks has been the lack of complex nonlinear activation functions that are both bounded and analytic, as dictated by Liouville's theorem. To avoid this conflict between boundedness and analyticity, a number of ad hoc MLPs, including the use of two real-valued MLPs, one processing the real part and the other the imaginary part, have traditionally been employed. However, since nonanalytic functions do not satisfy the Cauchy-Riemann conditions, they lead to degenerate backpropagation algorithms that compromise the efficiency of nonlinear approximation and learning in the complex vector field. A number of analytic elementary transcendental functions (ETFs), derivable from the entire exponential function e^z, are defined as fully complex activation functions and are shown to provide a parsimonious structure for processing data in the complex domain, addressing most of the shortcomings of the traditional approach. The introduction of ETFs, however, raises a new question about the approximation capability of this fully complex MLP. In this letter, three proofs of the approximation capability of the fully complex MLP are provided, based on the singularity characteristics of ETFs. First, fully complex MLPs with continuous ETFs over a compact set in the complex vector field are shown to be universal approximators of any continuous complex mapping. The complex universal approximation theorem then extends to bounded measurable ETFs possessing a removable singularity. Finally, it is shown that the output of complex MLPs using ETFs with isolated and essential singularities uniformly converges to any nonlinear mapping in the deleted annulus of singularity nearest to the origin.
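The structural difference from the split real/imaginary approach is small in code: a fully complex MLP keeps complex-valued weights and applies one analytic ETF, such as tanh(z), directly to the complex pre-activation. The sizes and data in the sketch below are illustrative assumptions.

```python
import numpy as np

# Hedged sketch of a fully complex forward pass with a single ETF activation.
rng = np.random.default_rng(3)

n_in, n_hid, n_out = 2, 5, 1
W1 = 0.3 * (rng.normal(size=(n_in, n_hid)) + 1j * rng.normal(size=(n_in, n_hid)))
W2 = 0.3 * (rng.normal(size=(n_hid, n_out)) + 1j * rng.normal(size=(n_hid, n_out)))

def fully_complex_mlp(z):
    h = np.tanh(z @ W1)       # tanh is analytic; its singularities are isolated poles
    return h @ W2

z = rng.normal(size=(4, n_in)) + 1j * rng.normal(size=(4, n_in))
print(fully_complex_mlp(z))
```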

10.
Statistical active learning in multilayer perceptrons (cited 2 times: 0 self-citations, 2 by others)
This paper proposes methods for actively generating input locations when gathering training data, aiming to solve problems unique to multilayer perceptrons. One problem is that optimal input locations, when calculated deterministically, are sometimes densely distributed around the same point and cause local minima in backpropagation training. Two probabilistic active learning methods, which utilize the statistical variance of the locations, are proposed to solve this problem: parametric active learning and multipoint-search active learning. Another serious problem in applying active learning to multilayer perceptrons is that the Fisher information matrix can be singular, while many methods, including the proposed ones, assume its regularity. A technique for pruning redundant hidden units is proposed to keep the Fisher information matrix regular; combined with this technique, active learning can be applied stably to multilayer perceptrons. The effectiveness of the proposed methods is demonstrated through computer simulations on simple artificial problems and on a real-world color conversion problem.

11.
12.
Browne A. Neural Computation, 2002, 14(7): 1739-1754
To give an adequate explanation of cognition and to perform certain practical tasks, connectionist systems must be able to extrapolate. This work explores the relationship between input representation and extrapolation, using simulations of multilayer perceptrons trained to model the identity function. The representation is found to have a marked effect on extrapolation.

13.
Although multilayer perceptrons have been shown to be adept at providing good solutions to many problems, they have a major drawback in the very large amount of time needed for training (on the order of CPU days for some of the author's experiments). The paper describes a method of producing a reasonable starting point by using a nearest-neighbor classifier. The method is further expanded into a way of 'programming' the upper layer of any network, assuming the lower layers already exist.
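One hedged reading of 'programming' the upper layer is sketched below: keep whatever lower layers already exist, label the training inputs with a nearest-prototype classifier, and solve the output-layer weights directly against those labels so the network starts from a nearest-neighbor-like decision rule rather than a random one. The toy data, the use of class means as prototypes, and the random lower layer are all assumptions, not the author's construction.

```python
import numpy as np

# Hedged sketch: least-squares "programming" of the upper layer to mimic a
# nearest-prototype classifier, given a fixed (here random) lower layer.
rng = np.random.default_rng(8)

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

# Toy two-class data.
X = np.vstack([rng.normal(loc=-1.0, size=(50, 2)), rng.normal(loc=+1.0, size=(50, 2))])
y = np.array([0] * 50 + [1] * 50)

# Prototypes for the nearest-prototype classifier (class means, for simplicity).
protos = np.array([X[y == 0].mean(axis=0), X[y == 1].mean(axis=0)])

def nearest_label(points):
    d = np.linalg.norm(points[:, None, :] - protos[None, :, :], axis=2)
    return np.argmin(d, axis=1)

# Existing lower layer (assumed to be given; random here for illustration).
W1 = rng.normal(size=(2, 6))
H = np.hstack([sigmoid(X @ W1), np.ones((len(X), 1))])     # hidden activations + bias

# Program the upper layer: least-squares fit to the nearest-prototype labels.
w2, *_ = np.linalg.lstsq(H, nearest_label(X).astype(float), rcond=None)

pred = (H @ w2 > 0.5).astype(int)
print("agreement with the nearest-prototype labels:", np.mean(pred == nearest_label(X)))
```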

14.
Multilayer perceptrons (MLPs) exhibit strange behaviors during learning, caused by singularities in the parameter space. A detailed theoretical or numerical analysis of MLPs is difficult due to the non-integrability of the traditional log-sigmoid activation function, which leads to difficulties in obtaining the averaged learning equations (ALEs). In this paper, the error function is suggested as the activation function of the MLP. By deriving explicit expressions for two important expectations, we obtain the averaged learning equations, which make further analysis of the learning dynamics of MLPs possible. Simulation results also indicate that the ALEs play a significant role in investigating the singular behaviors of MLPs.
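The change is simply to use erf as the hidden activation; under Gaussian inputs, expectations involving erf have closed forms, which is what makes the averaged learning equations tractable. The tiny forward pass below only illustrates the activation choice; sizes and data are assumptions.

```python
import numpy as np
from scipy.special import erf

# Hedged sketch: an MLP forward pass with the error function as hidden activation.
rng = np.random.default_rng(4)

W1 = rng.normal(scale=0.5, size=(3, 4))
w2 = rng.normal(scale=0.5, size=4)

def erf_mlp(x):
    return erf(x @ W1) @ w2

x = rng.normal(size=(5, 3))
print(erf_mlp(x))

# Known property used in such analyses: for zero-mean Gaussian x, E[erf(w . x)] = 0,
# and E[erf(u . x) erf(v . x)] depends only on the inner products u.u, v.v and u.v,
# so the averaged dynamics reduce to a small set of scalar order parameters.
```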

15.
This paper discusses the ability of multilayer perceptrons (MLPs) to model the probability distribution of data in typical pattern recognition and verification problems. It is proven that multilayer perceptrons with sigmoidal units and a number of hidden units less than or equal to the number of inputs are unable to model patterns distributed in typical clusters, since such networks draw open separation surfaces in the pattern space. When more hidden units than inputs are used, the separation surfaces can be closed, but, unfortunately, it is proven that determining whether or not an MLP draws closed separation surfaces in the pattern space is NP-hard. The major conclusion of the paper is somewhat opposite to what is believed and reported in many application papers: MLPs are definitely not adequate for pattern recognition applications requiring reliable rejection and, especially, they are not adequate for pattern verification tasks.

16.
On the initialization and optimization of multilayer perceptrons (cited 1 time: 0 self-citations, 1 by others)
Multilayer perceptrons are now widely used for pattern recognition, although training remains a time-consuming procedure that often converges toward a local optimum. Moreover, as the optimal network size and topology are usually unknown, the search for this optimum requires many networks to be trained. In this paper the authors propose a method for properly initializing the parameters (weights) of a two-layer perceptron, and for identifying, without the need for any error-backpropagation training, the most suitable network size and topology for the problem under investigation. The initialized network can then be optimized with the standard error-backpropagation (EBP) algorithm. The method is applicable to any two-layer perceptron comprising concentric as well as squashing units in its hidden layer; the output units are restricted to squashing units, but direct connections from the input to the output layer are also accommodated. To illustrate the power of the method, results obtained on different classification tasks are compared to similar results obtained with traditional error-backpropagation training starting from a random initial state.

17.
Links between Markov models and multilayer perceptrons (cited 2 times: 0 self-citations, 2 by others)
The statistical use of a particular classic form of connectionist system, the multilayer perceptron (MLP), is described in the context of continuous speech recognition. A discriminant hidden Markov model (HMM) is defined, and it is shown how a particular MLP with contextual and extra feedback input units can be considered a general form of such a Markov model. A link is established between these discriminant HMMs, trained with the Viterbi algorithm, and any other approach based on least-mean-square minimization of an error function (LMSE). It is shown theoretically and experimentally that the outputs of the MLP (when trained with the LMSE or the entropy criterion) approximate the probability distribution over output classes conditioned on the input, i.e., the maximum a posteriori probabilities. Results of a series of speech recognition experiments are reported. The possibility of embedding MLPs into HMMs is described, and relations with other recurrent networks are also explained.
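A practical consequence of the posterior-approximation result is the standard hybrid recipe sketched below: divide the MLP's class posteriors by the class priors to obtain scaled likelihoods, which can stand in for the HMM emission probabilities during Viterbi decoding. The numbers are made up for illustration.

```python
import numpy as np

# Hedged sketch: turning MLP posteriors into scaled likelihoods for an HMM.
posteriors = np.array([0.70, 0.20, 0.10])   # MLP output for one frame (assumed)
priors = np.array([0.50, 0.30, 0.20])       # class priors from the training data (assumed)

scaled_likelihoods = posteriors / priors    # proportional to P(x | class)
log_emissions = np.log(scaled_likelihoods)  # what a Viterbi decoder would consume

print(np.round(scaled_likelihoods, 3))
print(np.round(log_emissions, 3))
```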

18.
The natural gradient learning method is known to have ideal performance for on-line training of multilayer perceptrons. It avoids the plateaus that cause slow convergence of the backpropagation method, and it is Fisher efficient, whereas the conventional method is not. However, implementing the method requires calculating the Fisher information matrix and its inverse, which is very difficult in practice. This article proposes an adaptive method for directly obtaining the inverse of the Fisher information matrix. It generalizes the adaptive Gauss-Newton algorithms and provides a solid theoretical justification for them. Simulations show that the proposed adaptive method works very well for realizing natural gradient learning.
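The flavor of such an adaptive scheme is sketched below: maintain a running estimate of the inverse Fisher matrix and refine it with one rank-one correction per gradient sample, so no explicit inversion is ever performed. The rule has the general form G_inv <- (1+eps)*G_inv - eps*G_inv g g^T G_inv; the Gaussian gradient model and step size are illustrative assumptions.

```python
import numpy as np

# Hedged sketch: adaptive estimation of the inverse Fisher information matrix
# from gradient samples, followed (in a real run) by natural-gradient steps.
rng = np.random.default_rng(5)

fisher = np.diag([1.0, 0.5, 2.0])       # "true" Fisher matrix of the gradient model (assumed)
L = np.linalg.cholesky(fisher)

G_inv = np.eye(3)                        # running estimate of the inverse Fisher matrix
eps = 0.01                               # adaptation rate (assumed)

for _ in range(20000):
    g = L @ rng.normal(size=3)           # gradient sample with E[g g^T] = fisher
    G_inv = (1 + eps) * G_inv - eps * (G_inv @ np.outer(g, g) @ G_inv)

print("estimated inverse Fisher:\n", np.round(G_inv, 2))
print("true inverse Fisher:\n", np.round(np.linalg.inv(fisher), 2))
# A natural-gradient step would then be: w <- w - eta * (G_inv @ grad).
```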

19.
Error backpropagation (EBP) training of a multilayer perceptron (MLP) may require a very large number of training epochs. Although the training time can usually be reduced considerably by adopting an on-line training paradigm, it can still be excessive when large networks have to be trained on large amounts of data. In this paper, a new on-line training algorithm is presented, called equalized EBP (EEBP), which offers improved accuracy, speed, and robustness against badly scaled inputs. A major characteristic of EEBP is its use of weight-specific learning rates whose relative magnitudes are derived from a priori computable properties of the network and the training data.
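The sketch below shows a generic per-weight learning-rate scheme in that spirit, applied for brevity to a single linear unit: each input weight's rate is scaled inversely with the fan-in times the mean square of the signal feeding it, so badly scaled inputs receive comparably sized updates. This is an illustrative stand-in under assumed data, not the authors' exact EEBP formula.

```python
import numpy as np

# Hedged sketch: equalizing on-line updates with per-weight learning rates derived
# from statistics of the training data that are computable before training starts.
rng = np.random.default_rng(9)

# Badly scaled inputs: feature 0 is roughly 1000x larger than feature 1.
X = rng.normal(size=(1000, 2)) * np.array([1000.0, 1.0])
y = 0.002 * X[:, 0] + 0.8 * X[:, 1]                # toy linear target

n_in = X.shape[1]
base_rate = 0.1
mean_sq = np.mean(X ** 2, axis=0)                  # a priori computable data statistic
per_weight_rate = base_rate / (n_in * mean_sq)     # one learning rate per input weight

w = np.zeros(n_in)
for epoch in range(5):
    for x_i, y_i in zip(X, y):                     # on-line (sample-by-sample) training
        err = (w @ x_i) - y_i
        w -= per_weight_rate * err * x_i           # equalized per-weight steps

print("learned weights (target is roughly [0.002, 0.8]):", w)
```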

20.
Multilayer perceptrons (MLPs) with long- and short-term memories (LASTMs) are proposed for adaptive processing. The activation functions of the output neurons of such a network are linear, so the weights in the last layer affect the outputs linearly and are called linear weights. These linear weights constitute the short-term memory, and the other weights the long-term memory. It is proven that virtually any function f(x, theta) with an environmental parameter theta can be approximated to any accuracy by an MLP with LASTMs whose long-term memory is independent of theta. This independence from theta allows the long-term memory to be determined in an a priori training phase, so that only the short-term memory needs to be adjusted online to adapt to the environmental parameter theta. The benefits of using an MLP with LASTMs include less online computation, no poor local extrema to fall into, and much more timely and better adaptation. Numerical examples illustrate that these benefits are realized satisfactorily.
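Because the output units are linear, the online phase reduces to a linear least-squares fit of the output weights, as the sketch below illustrates for a toy family of targets f(x, theta) = sin(theta * x). The frozen random hidden layer, the network size, and the task are assumptions for illustration; in the paper the long-term memory would itself be trained beforehand.

```python
import numpy as np

# Hedged sketch: fixed long-term memory (hidden layer), linear short-term memory
# (output weights) re-fitted by least squares whenever theta changes.
rng = np.random.default_rng(6)

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

n_hid = 30
W1 = rng.normal(scale=1.5, size=(1, n_hid))        # long-term memory (assumed fixed here)
b1 = rng.normal(scale=1.5, size=n_hid)

def hidden(x):
    return sigmoid(x[:, None] @ W1 + b1)

def adapt_short_term(theta, n=200):
    # Online phase: only the linear output weights are re-estimated.
    x = rng.uniform(-3, 3, size=n)
    H = hidden(x)
    w2, *_ = np.linalg.lstsq(H, np.sin(theta * x), rcond=None)
    return w2

for theta in (0.5, 1.0, 2.0):
    w2 = adapt_short_term(theta)
    x_test = np.linspace(-3, 3, 50)
    mse = np.mean((hidden(x_test) @ w2 - np.sin(theta * x_test)) ** 2)
    print(f"theta = {theta}: test MSE = {mse:.4f}")
```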
