Similar documents
20 similar documents found.
1.
Statistical active learning in multilayer perceptrons
Proposes methods for generating input locations actively in gathering training data, aiming at solving problems unique to multilayer perceptrons. One of the problems is that optimum input locations, which are calculated deterministically, sometimes distribute densely around the same point and cause local minima in backpropagation training. Two probabilistic active learning methods, which utilize the statistical variance of locations, are proposed to solve this problem. One is parametric active learning and the other is multipoint-search active learning. Another serious problem in applying active learning to multilayer perceptrons is that the Fisher information matrix can be singular, while many methods, including the proposed ones, assume its regularity. A technique of pruning redundant hidden units is proposed to keep the Fisher information matrix regular. Combined with this technique, active learning can be applied stably to multilayer perceptrons. The effectiveness of the proposed methods is demonstrated through computer simulations on simple artificial problems and a real-world problem of color conversion.
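As a rough illustration of the regularity issue mentioned above, the sketch below builds the Fisher information matrix of a small regression MLP over a set of candidate input locations and inspects its smallest eigenvalue, the quantity that pruning redundant hidden units is meant to keep away from zero. The tanh hidden units, Gaussian noise model, numerical Jacobian, and all names are assumptions for illustration, not the paper's implementation.

```python
import numpy as np

def mlp_output(x, W1, b1, w2, b2):
    """Single-output MLP with tanh hidden units (assumed architecture)."""
    return w2 @ np.tanh(W1 @ x + b1) + b2

def flat_params(W1, b1, w2, b2):
    return np.concatenate([W1.ravel(), b1, w2, [b2]])

def output_jacobian(x, W1, b1, w2, b2, eps=1e-6):
    """Numerical gradient of the network output w.r.t. all parameters."""
    theta = flat_params(W1, b1, w2, b2)
    def unpack(t):
        h, d = W1.shape
        W1n = t[:h * d].reshape(h, d); b1n = t[h * d:h * d + h]
        w2n = t[h * d + h:h * d + 2 * h]; b2n = t[-1]
        return W1n, b1n, w2n, b2n
    g = np.zeros_like(theta)
    for i in range(theta.size):
        tp, tm = theta.copy(), theta.copy()
        tp[i] += eps; tm[i] -= eps
        g[i] = (mlp_output(x, *unpack(tp)) - mlp_output(x, *unpack(tm))) / (2 * eps)
    return g

rng = np.random.default_rng(0)
hidden, dim, noise_var = 3, 2, 0.01
W1 = rng.normal(size=(hidden, dim)); b1 = rng.normal(size=hidden)
w2 = rng.normal(size=hidden);        b2 = 0.0

# Make two hidden units (nearly) redundant: the Fisher matrix then becomes near-singular.
W1[1] = W1[0] + 1e-4; b1[1] = b1[0] + 1e-4

candidates = rng.uniform(-1, 1, size=(50, dim))
F = sum(np.outer(g, g) for g in (output_jacobian(x, W1, b1, w2, b2) for x in candidates)) / noise_var
print("smallest eigenvalue of F:", np.linalg.eigvalsh(F)[0])  # near zero -> prune a redundant unit
```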

2.
Multilayer perceptrons (MLPs) exhibit strange behaviors in the learning process, caused by the singularities existing in the parameter space. A detailed theoretical or numerical analysis of the MLPs is difficult due to the non-integrability of the traditional log-sigmoid activation function, which leads to difficulties in obtaining the averaged learning equations (ALEs). In this paper, the error function is suggested as the activation function of the MLPs. By solving the explicit expressions of two important expectations, we obtain the averaged learning equations, which make further analysis of the learning dynamics in MLPs possible. The simulation results also indicate that the ALEs play a significant role in investigating the singular behaviors of MLPs.
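The sketch below only illustrates the substitution the abstract refers to: a small MLP whose hidden activation is the error function erf(·) instead of the log-sigmoid, trained by plain online gradient descent on Gaussian inputs. It does not derive the averaged learning equations; the toy teacher function and network size are assumptions.

```python
import numpy as np
from scipy.special import erf

rng = np.random.default_rng(1)
d, h = 2, 4                               # input dimension, hidden units
W = rng.normal(scale=0.5, size=(h, d))    # hidden weights
v = rng.normal(scale=0.5, size=h)         # output weights

def forward(x):
    a = W @ x                 # pre-activations
    z = erf(a)                # error-function activation (Gaussian-integrable)
    return a, z, v @ z

def target(x):                # assumed teacher function for the toy problem
    return np.sin(x[0]) + 0.5 * x[1]

eta = 0.05
for step in range(20000):     # plain online (stochastic) gradient descent
    x = rng.normal(size=d)    # Gaussian inputs, as in averaged-dynamics analyses
    a, z, y = forward(x)
    err = y - target(x)
    dv = err * z
    # d/da erf(a) = 2/sqrt(pi) * exp(-a^2)
    dW = np.outer(err * v * (2 / np.sqrt(np.pi)) * np.exp(-a ** 2), x)
    v -= eta * dv
    W -= eta * dW

xs = rng.normal(size=(2000, d))
mse = np.mean([(forward(x)[2] - target(x)) ** 2 for x in xs])
print("test MSE with erf activations:", round(float(mse), 4))
```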

3.
Multilayer perceptrons (MLPs) with long- and short-term memories (LASTMs) are proposed for adaptive processing. The activation functions of the output neurons of such a network are linear, and thus the weights in the last layer affect the outputs of the network linearly and are called linear weights. These linear weights constitute the short-term memory and the other weights the long-term memory. It is proven that virtually any function f(x, theta) with an environmental parameter theta can be approximated to any accuracy by an MLP with LASTMs whose long-term memory is independent of theta. This independence from theta allows the long-term memory to be determined in a priori training and allows the online adjustment of only the short-term memory for adapting to the environmental parameter theta. The benefits of using an MLP with LASTMs include less online computation, no poor local extrema to fall into, and much more timely and better adaptation. Numerical examples illustrate that these benefits are realized satisfactorily.
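A minimal sketch of the adaptation scheme described above, under the assumption that the hidden (long-term) weights are simply fixed in advance: because the output units are linear, adapting to a new environmental parameter theta reduces to a linear least-squares problem for the short-term weights, which has no poor local extrema. The toy function family, network size, and names are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
d, h = 1, 20

# Long-term memory: hidden-layer weights, fixed after a priori training
# (here simply fixed at random values for illustration).
W_long = rng.normal(scale=2.0, size=(h, d))
b_long = rng.normal(scale=2.0, size=h)

def hidden(X):
    """Nonlinear hidden layer; its weights are never touched online."""
    return np.tanh(X @ W_long.T + b_long)

def f(x, theta):
    """Family of functions indexed by an environmental parameter theta."""
    return np.sin(theta * x) + theta

def adapt_short_term(theta, n=200):
    """Online adaptation: solve only the linear output weights (short-term memory)."""
    X = rng.uniform(-1, 1, size=(n, d))
    y = f(X[:, 0], theta)
    H = np.column_stack([hidden(X), np.ones(n)])      # design matrix, linear in the weights
    w_short, *_ = np.linalg.lstsq(H, y, rcond=None)   # convex problem: no poor local extrema
    return w_short

for theta in (0.5, 2.0, 4.0):
    w = adapt_short_term(theta)
    Xt = rng.uniform(-1, 1, size=(500, d))
    pred = np.column_stack([hidden(Xt), np.ones(500)]) @ w
    print(f"theta={theta}: test MSE {np.mean((pred - f(Xt[:, 0], theta)) ** 2):.4f}")
```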

4.
Fast training of multilayer perceptrons
Training a multilayer perceptron by an error backpropagation algorithm is slow and uncertain. This paper describes a new approach which is much faster and more certain than error backpropagation. The proposed approach is based on combined iterative and direct solution methods. In this approach, we use an inverse transformation for linearization of nonlinear output activation functions and direct solution matrix methods for training the weights of the output layer, and gradient descent, the delta rule, and other proposed techniques for training the weights of the hidden layers. The approach has been implemented and tested on many problems. Experimental results, including training times and recognition accuracy, are given. Generally, the approach achieves accuracy as good as or better than perceptrons trained using error backpropagation, and the training process is much faster than the error backpropagation algorithm; it also avoids local minima and paralysis.
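The core trick described above can be sketched as follows: apply the inverse of the output sigmoid to the desired targets so that the output layer becomes a linear system, then solve that system directly. Here the hidden layer is left at random values purely for illustration (in the paper it is trained by the iterative rules mentioned above); the toy data and the target-softening constant are assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def inverse_sigmoid(t, eps=1e-6):
    """Linearization step: map desired outputs back to pre-activation targets."""
    t = np.clip(t, eps, 1 - eps)          # keep the logit finite
    return np.log(t / (1 - t))

# Toy 2-class problem; the hidden layer is assumed already set (random here for illustration).
X = rng.normal(size=(300, 2))
t = (X[:, 0] * X[:, 1] > 0).astype(float)           # XOR-like labels in {0, 1}
W1 = rng.normal(size=(8, 2)); b1 = rng.normal(size=8)
H = np.tanh(X @ W1.T + b1)                           # hidden activations
H1 = np.column_stack([H, np.ones(len(X))])           # add bias column

# Direct (one-shot) solution of the output weights:
# sigmoid(H1 @ w) ~ t  <=>  H1 @ w ~ inverse_sigmoid(t), a linear least-squares problem.
z = inverse_sigmoid(0.9 * t + 0.05)                  # soften 0/1 targets before inverting
w_out, *_ = np.linalg.lstsq(H1, z, rcond=None)

pred = sigmoid(H1 @ w_out) > 0.5
print("training accuracy with directly solved output layer:", np.mean(pred == t))
```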

5.
Multilayer perceptrons are successfully used in an increasing number of nonlinear signal processing applications. The backpropagation learning algorithm, or variations thereof, is the standard method applied to the nonlinear optimization problem of adjusting the weights in the network in order to minimize a given cost function. However, backpropagation as a steepest descent approach is too slow for many applications. In this paper a new learning procedure is presented which is based on a linearization of the nonlinear processing elements and the optimization of the multilayer perceptron layer by layer. In order to limit the introduced linearization error, a penalty term is added to the cost function. The new learning algorithm is applied to the problem of nonlinear prediction of chaotic time series. The proposed algorithm yields results, in both accuracy and convergence rate, that are orders of magnitude superior to conventional backpropagation learning.

6.
The learning process of multilayer perceptron (MLP) neural networks often exhibits singular behaviors and easily becomes trapped in plateau regions, both of which are directly related to the singular regions existing in the MLP parameter space. When the weights of two hidden nodes of an MLP are nearly mutually reciprocal, permutation symmetry leads to learning difficulties. This paper analyzes the learning dynamics of MLPs near the reciprocal singular regions. We first derive analytical expressions for the averaged learning equations, then give the theoretical learning trajectories near the reciprocal singular regions, and obtain the actual learning trajectories near them by numerical methods. Through simulation experiments, the averaged, batch, and online learning dynamics of MLPs are observed and compared.

7.
The attractive possibility of applying layerwise block training algorithms to multilayer perceptrons (MLPs), which offers initial advantages in computational effort, is refined in this article by means of introducing a sensitivity correction factor in the formulation. This results in a clear performance advantage, which we verify in several applications. The reasons for this advantage are discussed and related to implicit relations with second-order techniques, natural gradient formulations through Fisher's information matrix, and sample selection. Extensions to recurrent networks and other research lines are suggested at the close of the article.

8.
We present deterministic nonmonotone learning strategies for multilayer perceptrons (MLPs), i.e., deterministic training algorithms in which error function values are allowed to increase at some epochs. To this end, we argue that the current error function value must satisfy a nonmonotone criterion with respect to the maximum error function value of the M previous epochs, and we propose a subprocedure to dynamically compute M. The nonmonotone strategy can be incorporated in any batch training algorithm and provides fast, stable, and reliable learning. Experimental results in different classes of problems show that this approach improves the convergence speed and success percentage of first-order training algorithms and alleviates the need for fine-tuning problem-dependent heuristic parameters.
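One way to read the criterion is sketched below: a gradient step is accepted whenever the new error does not exceed the maximum error of the last M epochs, which lets the learning rate grow aggressively between rejections. The dynamic computation of M proposed in the paper is not reproduced; the fixed window, rate-adaptation factors, and toy quadratic are assumptions.

```python
import numpy as np
from collections import deque

def train_nonmonotone(grad, E, w0, lr=0.5, M=10, epochs=200):
    """Gradient descent in which E may rise at some epochs, as long as the new value
    does not exceed the maximum of the last M epoch errors (nonmonotone criterion)."""
    w = np.asarray(w0, dtype=float)
    history = deque([E(w)], maxlen=M)       # errors of the M most recent epochs
    for _ in range(epochs):
        trial = w - lr * grad(w)
        if E(trial) <= max(history):        # nonmonotone acceptance test
            w = trial
            lr *= 1.1                       # be more aggressive after an accepted step
        else:
            lr *= 0.5                       # step rejected: shrink the learning rate
        history.append(E(w))
    return w

# Toy ill-conditioned quadratic standing in for a batch error function.
A = np.diag([1.0, 50.0])
E = lambda w: 0.5 * w @ A @ w
grad = lambda w: A @ w
w_star = train_nonmonotone(grad, E, w0=[5.0, 5.0])
print("final error:", E(w_star))
```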

9.
Discusses the ability of multilayer perceptrons (MLPs) to model the probability distribution of data in typical pattern recognition and verification problems. It is proven that multilayer perceptrons with sigmoidal units and a number of hidden units less than or equal to the number of inputs are unable to model patterns distributed in typical clusters, since these networks draw open separation surfaces in the pattern space. When using more hidden units than inputs, the separation surfaces can be closed; unfortunately, it is proven that determining whether or not an MLP draws closed separation surfaces in the pattern space is NP-hard. The major conclusion of the paper is somewhat opposite to what is believed and reported in many application papers: MLPs are definitely not adequate for applications of pattern recognition requiring a reliable rejection and, especially, they are not adequate for pattern verification tasks.

10.
The use of multilayer perceptrons (MLP) with threshold functions (binary step function activations) greatly reduces the complexity of the hardware implementation of neural networks, provides tolerance to noise, and improves the interpretation of the internal representations. In certain cases, such as in learning stationary tasks, it may be sufficient to find appropriate weights for an MLP with threshold activation functions by software simulation and then transfer the weight values to the hardware implementation. Efficient training of these networks is a subject of considerable ongoing research. Methods available in the literature mainly focus on two-state (threshold) nodes and try to train the networks by approximating the gradient of the error function and modifying the gradient descent appropriately, or by progressively altering the shape of the activation functions. In this paper, we propose an evolution-motivated approach, which is eminently suitable for networks with threshold functions, and compare its performance with four other methods. The proposed evolutionary strategy does not need gradient-related information, it is applicable to situations where threshold activations are used from the beginning of the training, as in "on-chip" training, and it is able to train networks with integer weights.
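To make the setting concrete, here is a minimal (1+lambda) evolution strategy that trains a single-hidden-layer network with hard threshold activations and integer weights on XOR, using no gradient information. The mutation scheme, offspring count, and weight range are assumptions and not the paper's exact strategy.

```python
import numpy as np

rng = np.random.default_rng(4)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
t = np.array([0, 1, 1, 0])                       # XOR targets

H, LAM = 2, 30                                   # hidden units, offspring per generation

def forward(theta, X):
    """Threshold (binary step) network with integer weights and biases."""
    W1 = theta[:H * 2].reshape(H, 2); b1 = theta[H * 2:H * 3]
    w2 = theta[H * 3:H * 4];          b2 = theta[-1]
    h = (X @ W1.T + b1 > 0).astype(int)          # hard threshold hidden layer
    return (h @ w2 + b2 > 0).astype(int)

def errors(theta):
    return int(np.sum(forward(theta, X) != t))

theta = rng.integers(-2, 3, size=H * 4 + 1)      # integer weights from the start ("on-chip" style)
best = errors(theta)
for gen in range(500):
    if best == 0:
        break
    # (1+lambda) ES: mutate by adding small random integers, keep the best candidate.
    offspring = theta + rng.integers(-1, 2, size=(LAM, theta.size))
    scores = np.array([errors(o) for o in offspring])
    if scores.min() <= best:
        best, theta = int(scores.min()), offspring[scores.argmin()]

print("misclassified patterns:", best, "weights:", theta)
```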

11.
Fast parallel off-line training of multilayer perceptrons
Various approaches to the parallel implementation of second-order gradient-based multilayer perceptron training algorithms are described. Two main classes of algorithm are defined involving Hessian and conjugate gradient-based methods. The limited- and full-memory Broyden-Fletcher-Goldfarb-Shanno (BFGS) algorithms are selected as representative examples and used to show that the step size and gradient calculations are critical components. For larger problems the matrix calculations in the full-memory algorithm are also significant. Various strategies are considered for parallelization, the best of which is implemented on parallel virtual machine (PVM) and transputer-based architectures. Results from a range of problems are used to demonstrate the performance achievable with each architecture. The transputer implementation is found to give excellent speed-ups but the problem size is limited by memory constraints. The speed-ups achievable with the PVM implementation are much poorer because of inefficient communication, but memory is not a difficulty.

12.
On the initialization and optimization of multilayer perceptrons
Multilayer perceptrons are now widely used for pattern recognition, although the training remains a time-consuming procedure often converging toward a local optimum. Moreover, as the optimum network size and topology are usually unknown, the search for this optimum requires many networks to be trained. In this paper the authors propose a method for properly initializing the parameters (weights) of a two-layer perceptron, and for identifying (without the need for any error-backpropagation training) the most suitable network size and topology for solving the problem under investigation. The initialized network can then be optimized by means of the standard error-backpropagation (EBP) algorithm. The authors' method is applicable to any two-layer perceptron comprising concentric as well as squashing units on its hidden layer. The output units are restricted to squashing units, but direct connections from the input to the output layer are also accommodated. To illustrate the power of the method, results obtained for different classification tasks are compared to similar results obtained using traditional error-backpropagation training starting from a random initial state.

13.
We propose an adaptive improved natural gradient algorithm for blind separation of independent sources. First, inspired by the well-known backpropagation algorithm, we incorporate a momentum term into the natural gradient learning process to accelerate the convergence rate and improve the stability. Then an estimation function for the adaptation of the separation model is obtained to adaptively control a step-size parameter and a momentum factor. The proposed natural gradient algorithm with variable step-size parameter and variable momentum factor is therefore particularly well suited to blind source separation in a time-varying environment, such as an abruptly changing mixing matrix or signal power. The expected improvement in the convergence speed, stability, and tracking ability of the proposed algorithm is demonstrated by extensive simulation results in both time-invariant and time-varying environments. The ability of the proposed algorithm to separate extremely weak or badly scaled sources is also verified. In addition, simulation results show that the proposed algorithm is suitable for separating mixtures of many sources (e.g., the number of sources is 10) in the complete case.
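A minimal sketch of the baseline update the abstract builds on: the natural-gradient separation rule ΔW = η(I − φ(y)yᵀ)W, here with a fixed momentum term standing in for the adaptive step-size and momentum control proposed in the paper (the adaptation law itself is not reproduced; the sources, nonlinearity, and constants are assumptions).

```python
import numpy as np

rng = np.random.default_rng(5)
n, T = 3, 20000

S = rng.laplace(size=(n, T))                 # super-Gaussian sources
A = rng.normal(size=(n, n))                  # unknown mixing matrix
X = A @ S                                    # observed mixtures

W = np.eye(n)                                # separation matrix
dW_prev = np.zeros((n, n))
eta, beta = 0.01, 0.5                        # fixed step size and momentum factor (assumed)

phi = np.tanh                                # score-like nonlinearity for super-Gaussian sources

for t in range(T):
    x = X[:, t:t + 1]                        # one sample, shape (n, 1)
    y = W @ x
    # Natural-gradient update of the separation matrix, plus a momentum term.
    dW = eta * (np.eye(n) - phi(y) @ y.T) @ W + beta * dW_prev
    W += dW
    dW_prev = dW

# In the ideal case W @ A approaches a scaled permutation matrix.
P = W @ A
print(np.round(P / np.abs(P).max(axis=1, keepdims=True), 2))
```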

14.
The error backpropagation (EBP) training of a multilayer perceptron (MLP) may require a very large number of training epochs. Although the training time can usually be reduced considerably by adopting an on-line training paradigm, it can still be excessive when large networks have to be trained on lots of data. In this paper, a new on-line training algorithm is presented. It is called equalized EBP (EEBP), and it offers improved accuracy, speed, and robustness against badly scaled inputs. A major characteristic of EEBP is its utilization of weight specific learning rates whose relative magnitudes are derived from a priori computable properties of the network and the training data.
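The published EEBP equalization rule is not spelled out in the abstract; the sketch below only illustrates the general idea of weight-specific learning rates computed from a priori properties of the training data, here by scaling each weight's rate inversely with the mean square of its input. This particular scaling, the single linear unit, and the toy data are assumptions, not the EEBP derivation.

```python
import numpy as np

rng = np.random.default_rng(6)

# Badly scaled inputs: the second feature is 100x larger than the first.
X = rng.normal(size=(1000, 2)) * np.array([1.0, 100.0])
t = 0.3 * X[:, 0] + 0.002 * X[:, 1] + 0.1

# A priori computable property of the data: per-input mean squares.
mean_sq = np.mean(X ** 2, axis=0)

base_lr = 0.05
per_weight_lr = base_lr / mean_sq            # equalized, weight-specific learning rates
bias_lr = base_lr

w = np.zeros(2); b = 0.0
for epoch in range(20):
    for x, y in zip(X, t):                   # on-line training, one pattern at a time
        err = (w @ x + b) - y
        w -= per_weight_lr * err * x         # each weight uses its own rate
        b -= bias_lr * err

print("weights:", np.round(w, 4), "bias:", round(b, 3))
```

With a single uniform learning rate of 0.05 the update for the second weight would diverge on this data, which is the kind of sensitivity to badly scaled inputs that per-weight rates are meant to remove.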

15.
Multilayer perceptrons trained with the backpropagation algorithm are tested in detection and classification tasks and are compared to optimal algorithms resulting from likelihood ratio tests. The focus is on the problem of detecting and classifying one of M orthogonal signals in a Gaussian noise environment, since both the Bayesian detector and classifier are known for this problem and can provide a measure for the performance evaluation of the neural networks. Two basic situations are considered: detection and classification. For the detection part, it was observed that for the signal-known-exactly case (M=1), the performance of the neural detector converges to the performance of the ideal Bayesian decision processor, while for a higher degree of uncertainty (i.e., for a larger M), the performance of the multilayer perceptron is inferior to that of the optimal detector. For the classification case, the probability of error of the neural network is comparable to the minimum Bayesian error, which can be numerically calculated. Adding noise during the training stage of the network does not affect the performance of the neural detector; however, there is an indication that the presence of noise in the learning process of the neural classifier results in degraded classification performance.

16.
This paper studies the classification mechanisms of multilayer perceptrons (MLPs) with sigmoid activation functions (SAFs). The viewpoint is presented that, in the input space, the hyperplanes determined by the hidden basis functions with values of 0 do not play the role of decision boundaries, and such hyperplanes do not necessarily pass through the marginal regions between different classes. For solving an n-class problem, a single-hidden-layer perceptron with at least ⌈log2(n-1)⌉ hidden nodes is needed. The final number of hidden neurons is still related to the sample distribution shapes and regions, but not to the number of samples and input dimensions. As a result, an empirical formula for optimally selecting the initial number of hidden nodes is proposed. The ranks of the response matrices of hidden layers should be taken as the main basis for pruning or growing the existing hidden neurons. A structure-fixed perceptron ought to learn more than one round from different starting weight points for one classification task, and only the group of weights and biases that has the best generalization performance should be retained. Finally, three examples are given to verify the above viewpoints.
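A quick numerical reading of the hidden-node lower bound, assuming the (partly garbled) expression is the ceiling of log2(n-1):

```python
import math

# Lower bound on hidden nodes for an n-class single-hidden-layer perceptron,
# read here as ceil(log2(n - 1)); the exact published form should be checked
# against the original paper.
for n in (2, 3, 4, 5, 8, 16, 100):
    print(f"n = {n:3d} classes -> at least {max(1, math.ceil(math.log2(n - 1)))} hidden nodes")
```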

17.
Stability analysis of recurrent multilayer perceptrons: an LMI approach
Recurrent multilayer perceptrons (RMLPs) are widely used in engineering, but their stability has received relatively little study. This paper proposes a new neural network model, the standard neural network model (SNNM). Through a state-space expansion method, the RMLP is transformed into an SNNM, and the stability analysis of the SNNM can then be cast as the solution of a set of linear matrix inequalities (LMIs). The LMIs are solved with the MATLAB LMI Toolbox, which determines the Lyapunov stability of the RMLP; the influence of nonzero thresholds on stability is also taken into account. The method is also applicable to the stability analysis of other types of recurrent neural networks (RNNs).
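The paper's specific SNNM stability conditions are not given in the abstract; as a generic stand-in for the machinery it relies on, the sketch below certifies Lyapunov stability of a simple linear recurrence x_{k+1} = A x_k by solving the discrete Lyapunov equation AᵀPA − P = −Q and checking that P is positive definite, which is the equality counterpart of the LMI feasibility problem AᵀPA − P ≺ 0, P ≻ 0 that an LMI toolbox would solve. The state matrix A is made up.

```python
import numpy as np
from scipy.linalg import solve_discrete_lyapunov

# Example state matrix (made up); for an RMLP-derived SNNM this matrix would come
# from the state-space expansion described in the paper.
A = np.array([[0.5, 0.2],
              [-0.3, 0.7]])

# Solve A^T P A - P = -Q for P, the equality counterpart of the LMI A^T P A - P < 0.
Q = np.eye(2)
P = solve_discrete_lyapunov(A.T, Q)

eigs = np.linalg.eigvalsh(P)
print("P =\n", np.round(P, 3))
print("P positive definite (=> Lyapunov stable):", bool(np.all(eigs > 0)))
```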

18.
The relationship between backpropagation and extended Kalman filtering for training multilayer perceptrons is examined. These two techniques are compared theoretically and empirically using sensor imagery. Backpropagation is a technique from neural networks for assigning weights in a multilayer perceptron. An extended Kalman filter can also be used for this purpose. A brief review of the multilayer perceptron and these two training methods is provided. Then, it is shown that backpropagation is a degenerate form of the extended Kalman filter. The training rules are compared in two examples: an image classification problem using laser radar Doppler imagery and a target detection problem using absolute range images. In both examples, the backpropagation training algorithm is shown to be three orders of magnitude less costly than the extended Kalman filter algorithm in terms of the number of floating-point operations.
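As an illustration of the comparison, below is a minimal global extended Kalman filter weight update for a one-hidden-layer network, with the output Jacobian obtained numerically. The network size, noise covariances, and toy task are assumptions; the cost gap relative to backpropagation comes mainly from the covariance-matrix update, which is quadratic in the number of weights per sample.

```python
import numpy as np

rng = np.random.default_rng(7)
d, h = 1, 5
n_w = h * d + h + h + 1                          # total number of weights

def forward(w, x):
    W1 = w[:h * d].reshape(h, d); b1 = w[h * d:h * d + h]
    w2 = w[h * d + h:h * d + 2 * h]; b2 = w[-1]
    return float(w2 @ np.tanh(W1 @ x + b1) + b2)

def jacobian(w, x, eps=1e-6):
    """Numerical row Jacobian dy/dw, shape (1, n_w)."""
    J = np.zeros((1, n_w))
    for i in range(n_w):
        wp, wm = w.copy(), w.copy()
        wp[i] += eps; wm[i] -= eps
        J[0, i] = (forward(wp, x) - forward(wm, x)) / (2 * eps)
    return J

# EKF state: the weight vector and its error covariance.
w = rng.normal(scale=0.3, size=n_w)
P = np.eye(n_w) * 10.0
Q = np.eye(n_w) * 1e-5                           # process noise (keeps P from collapsing)
R = np.array([[0.01]])                           # measurement noise

target = lambda x: np.sin(3 * x[0])

for step in range(2000):
    x = rng.uniform(-1, 1, size=d)
    y, t = forward(w, x), target(x)
    H = jacobian(w, x)                           # 1 x n_w
    S = H @ P @ H.T + R                          # innovation covariance, 1 x 1
    K = P @ H.T / S[0, 0]                        # Kalman gain, n_w x 1
    w = w + K[:, 0] * (t - y)                    # weight update
    P = P - K @ H @ P + Q                        # covariance update, O(n_w^2) per sample

errs = [forward(w, np.array([x])) - np.sin(3 * x) for x in np.linspace(-1, 1, 200)]
print("EKF-trained MLP, test RMSE:", round(float(np.sqrt(np.mean(np.square(errs)))), 4))
```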

19.
The dilation-erosion-linear perceptron is a hybrid morphological neuron which has recently been proposed in the literature to solve some prediction problems. However, a drawback of such a model arises when building mappings for tasks with complex input-output nonlinear relationships, as found in effort estimation problems. To overcome this limitation, we present a particular class of hybrid multilayer perceptrons, called the multilayer dilation-erosion-linear perceptron (MDELP), to deal with software development effort estimation problems. Each processing unit of the proposed model is composed of a mix between a hybrid morphological operator (given by a balanced combination of dilation and erosion operators) and a linear operator. Following Pessoa and Maragos's ideas, we propose a gradient descent-based learning process to train the proposed model. In addition, we conduct an experimental analysis using relevant datasets of software development effort estimation, and the achieved results are discussed and compared, according to MMRE and PRED25 measures, to those obtained by classical and state-of-the-art models presented in the literature.
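The abstract does not spell out the unit's exact algebraic form; the sketch below is one possible reading, in which a convex combination of a dilation term max_i(x_i + a_i) and an erosion term min_i(x_i + b_i) is itself mixed with an ordinary linear term. The parameter names and the mixing scheme are assumptions, not the published MDELP definition.

```python
import numpy as np

def dilation(x, a):
    """Gray-scale dilation of input x by structuring element a."""
    return np.max(x + a)

def erosion(x, b):
    """Gray-scale erosion of input x by structuring element b."""
    return np.min(x + b)

def del_unit(x, a, b, w, c, alpha, lam):
    """One possible dilation-erosion-linear processing unit:
    a convex mix of morphological and linear parts (assumed form)."""
    morph = alpha * dilation(x, a) + (1.0 - alpha) * erosion(x, b)   # balanced dilation/erosion
    linear = w @ x + c                                               # ordinary linear operator
    return lam * morph + (1.0 - lam) * linear

x = np.array([0.2, 0.7, 0.1])
a = np.array([0.1, -0.2, 0.0])
b = np.array([0.0, 0.3, -0.1])
w = np.array([0.5, -0.4, 0.2])
print("unit output:", round(del_unit(x, a, b, w, c=0.05, alpha=0.6, lam=0.7), 4))
```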

20.
Nonlinear transformation is one of the major obstacles to analyzing the properties of multilayer perceptrons. In this letter, we prove that the correlation coefficient between two jointly Gaussian random variables decreases when each of them is transformed under continuous nonlinear transformations, which can be approximated by piecewise linear functions. When the inputs or the weights of a multilayer perceptron are perturbed randomly, the weighted sums to the hidden neurons are asymptotically jointly Gaussian random variables. Since sigmoidal transformation can be approximated piecewise linearly, the correlations among the weighted sums decrease under sigmoidal transformations. Based on this result, we can say that sigmoidal transformation used as the transfer function of the multilayer perceptron reduces redundancy in the information contents of the hidden neurons.
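A quick numerical check of the stated effect: draw jointly Gaussian pairs with correlation rho, pass both through tanh as the sigmoidal transformation, and compare the correlation before and after.

```python
import numpy as np

rng = np.random.default_rng(8)
n = 200_000

for rho in (0.3, 0.6, 0.9):
    cov = np.array([[1.0, rho], [rho, 1.0]])
    u, v = rng.multivariate_normal([0.0, 0.0], cov, size=n).T
    r_before = np.corrcoef(u, v)[0, 1]
    r_after = np.corrcoef(np.tanh(u), np.tanh(v))[0, 1]   # sigmoidal transformation
    print(f"rho = {rho}: corr before {r_before:.3f}, after tanh {r_after:.3f}")
```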

