20 similar documents retrieved (search time: 15 ms)
1.
Statistical active learning in multilayer perceptrons   (Cited by: 2; self-citations: 0; citations by others: 2)
Proposes methods for actively generating input locations when gathering training data, aiming at problems unique to multilayer perceptrons. One such problem is that deterministically calculated optimum input locations sometimes cluster densely around the same point and cause local minima in backpropagation training. Two probabilistic active learning methods that exploit the statistical variance of locations are proposed to solve this problem: one is parametric active learning and the other is multipoint-search active learning. Another serious problem in applying active learning to multilayer perceptrons is that the Fisher information matrix can be singular, whereas many methods, including the proposed ones, assume its regularity. A technique for pruning redundant hidden units is proposed to keep the Fisher information matrix regular. Combined with this technique, active learning can be applied stably to multilayer perceptrons. The effectiveness of the proposed methods is demonstrated through computer simulations on simple artificial problems and on a real-world color-conversion problem.
2.
Weili Guo Haikun Wei Junsheng Zhao Kanjian Zhang 《Neural computing & applications》2014,25(3-4):825-832
Multilayer perceptrons (MLPs) exhibit anomalous behavior during learning, caused by singularities in the parameter space. A detailed theoretical or numerical analysis of MLPs is difficult because the traditional log-sigmoid activation function is not integrable in closed form, which makes the averaged learning equations (ALEs) hard to obtain. In this paper, the error function is suggested as the activation function of the MLP. By deriving explicit expressions for two key expectations, we obtain the averaged learning equations, which enable further analysis of the learning dynamics of MLPs. The simulation results also indicate that the ALEs play a significant role in investigating the singular behavior of MLPs.
3.
Multilayer perceptrons (MLPs) with long- and short-term memories (LASTMs) are proposed for adaptive processing. The activation functions of the output neurons of such a network are linear, and thus the weights in the last layer affect the outputs of the network linearly and are called linear weights. These linear weights constitute the short-term memory, and the other weights the long-term memory. It is proven that virtually any function f(x, theta) with an environmental parameter theta can be approximated to any accuracy by an MLP with LASTMs whose long-term memory is independent of theta. This independence from theta allows the long-term memory to be determined in a priori training and allows the online adjustment of only the short-term memory for adapting to the environmental parameter theta. The benefits of using an MLP with LASTMs include less online computation, no poor local extrema to fall into, and much more timely and better adaptation. Numerical examples illustrate that these benefits are realized satisfactorily.
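As a rough illustration of the adaptation scheme summarized in the abstract above (not the paper's actual procedure), the sketch below freezes the hidden-layer weights, standing in for the a priori trained long-term memory, and adapts only the linear output weights online with recursive least squares. The network sizes, the forgetting factor, and the toy target mapping are all assumptions.

```python
import numpy as np

def hidden(x, W1, b1):
    """Fixed nonlinear hidden layer (stand-in for the long-term memory)."""
    return np.tanh(W1 @ x + b1)

rng = np.random.default_rng(0)
n_in, n_hid = 3, 10
W1 = rng.standard_normal((n_hid, n_in))   # placeholder long-term memory
b1 = rng.standard_normal(n_hid)

# Linear output weights = short-term memory, adapted online by RLS.
w = np.zeros(n_hid)
P = np.eye(n_hid) * 1e3        # inverse-correlation-matrix estimate
lam = 0.99                     # forgetting factor (assumed value)

def target(x):
    return np.sin(x).sum()     # toy stand-in for f(x, theta) at a fixed theta

for t in range(500):
    x = rng.standard_normal(n_in)
    phi = hidden(x, W1, b1)
    d = target(x)
    k = P @ phi / (lam + phi @ P @ phi)     # RLS gain
    w = w + k * (d - w @ phi)               # adjust short-term memory only
    P = (P - np.outer(k, phi @ P)) / lam
```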
4.
An accelerated learning algorithm for multilayer perceptrons: optimization layer by layer   (Cited by: 14; self-citations: 0; citations by others: 14)
Multilayer perceptrons are successfully used in an increasing number of nonlinear signal processing applications. The backpropagation learning algorithm, or variations thereof, is the standard method applied to the nonlinear optimization problem of adjusting the weights in the network in order to minimize a given cost function. However, backpropagation as a steepest descent approach is too slow for many applications. In this paper a new learning procedure is presented which is based on a linearization of the nonlinear processing elements and the optimization of the multilayer perceptron layer by layer. In order to limit the introduced linearization error, a penalty term is added to the cost function. The new learning algorithm is applied to the problem of nonlinear prediction of chaotic time series. The proposed algorithm yields results whose accuracy and convergence rate are orders of magnitude better than those of conventional backpropagation learning.
5.
Fast training of multilayer perceptrons   (Cited by: 5; self-citations: 0; citations by others: 5)
Training a multilayer perceptron by an error backpropagation algorithm is slow and uncertain. This paper describes a new approach which is much faster and more reliable than error backpropagation. The proposed approach is based on combined iterative and direct solution methods. In this approach, an inverse transformation is used to linearize the nonlinear output activation functions; direct matrix solution methods are used to train the weights of the output layer; and gradient descent, the delta rule, and other proposed techniques are used to train the weights of the hidden layers. The approach has been implemented and tested on many problems. Experimental results, including training times and recognition accuracy, are given. Generally, the approach achieves accuracy as good as or better than perceptrons trained using error backpropagation, and the training process is much faster than the error backpropagation algorithm and also avoids local minima and paralysis.
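A minimal sketch of the linearization idea described above, not the paper's full method (the hidden-layer training steps are omitted, and the network size and data are illustrative only): targets in (0, 1) are passed through the inverse of a logistic output activation, after which the output-layer weights can be obtained by a direct least-squares solve.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((200, 4))                        # inputs
T = 1.0 / (1.0 + np.exp(-(X @ rng.standard_normal(4))))  # toy targets in (0, 1)

# Hidden layer with (here untrained) weights; bias column appended.
W1 = rng.standard_normal((4, 8))
H = np.tanh(X @ W1)
H = np.hstack([H, np.ones((H.shape[0], 1))])

# Inverse transformation: map desired outputs back through the logistic
# function so that the output layer becomes a linear least-squares problem.
eps = 1e-6
Tc = np.clip(T, eps, 1 - eps)
Z = np.log(Tc / (1 - Tc))

# Direct matrix solution for the output-layer weights.
W2, *_ = np.linalg.lstsq(H, Z, rcond=None)

Y = 1.0 / (1.0 + np.exp(-(H @ W2)))                      # network outputs
print("training MSE:", np.mean((Y - T) ** 2))
```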
6.
The attractive possibility of applying layerwise block training algorithms to multilayer perceptrons (MLPs), which offers initial advantages in computational effort, is refined in this article by introducing a sensitivity correction factor in the formulation. This results in a clear performance advantage, which we verify in several applications. The reasons for this advantage are discussed and related to implicit relations with second-order techniques, natural gradient formulations through the Fisher information matrix, and sample selection. Extensions to recurrent networks and other research lines are suggested at the close of the article.
7.
Plagianakos V.P. Magoulas G.D. Vrahatis M.N. 《Neural Networks, IEEE Transactions on》2002,13(6):1268-1284
We present deterministic nonmonotone learning strategies for multilayer perceptrons (MLPs), i.e., deterministic training algorithms in which error function values are allowed to increase at some epochs. To this end, we argue that the current error function value must satisfy a nonmonotone criterion with respect to the maximum error function value of the M previous epochs, and we propose a subprocedure to dynamically compute M. The nonmonotone strategy can be incorporated in any batch training algorithm and provides fast, stable, and reliable learning. Experimental results in different classes of problems show that this approach improves the convergence speed and success percentage of first-order training algorithms and alleviates the need for fine-tuning problem-dependent heuristic parameters.
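The sketch below illustrates the nonmonotone acceptance idea on a generic differentiable error function: a step is accepted if the new error does not exceed the maximum error of the last M epochs (minus a small sufficient-decrease term). The paper computes M dynamically via a subprocedure; here M is fixed, and the test function, constants, and step-halving rule are assumptions rather than the paper's choices.

```python
import numpy as np

def error(w):                       # stand-in error function
    return np.sum((w - 1.0) ** 2) + 0.1 * np.sum(w ** 4)

def grad(w):
    return 2.0 * (w - 1.0) + 0.4 * w ** 3

w = np.zeros(5)
M, delta = 10, 1e-4                 # fixed nonmonotone window and tolerance (assumed)
history = [error(w)]                # error values of previous epochs

for epoch in range(100):
    g = grad(w)
    lr = 0.5
    ref = max(history[-M:])         # max error over the last M epochs
    # Nonmonotone criterion: the error may rise above the previous epoch's
    # value, but not above the reference value of the last M epochs.
    while error(w - lr * g) > ref - delta * lr * (g @ g):
        lr *= 0.5
        if lr < 1e-12:
            break
    w = w - lr * g
    history.append(error(w))

print("final error:", history[-1])
```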
8.
Gori M. Scarselli F. 《IEEE transactions on pattern analysis and machine intelligence》1998,20(11):1121-1132
Discusses the ability of multilayer perceptrons (MLPs) to model the probability distribution of data in typical pattern recognition and verification problems. It is proven that multilayer perceptrons with sigmoidal units and no more hidden units than inputs are unable to model patterns distributed in typical clusters, since these networks draw open separation surfaces in the pattern space. When more hidden units than inputs are used, the separation surfaces can be closed; unfortunately, it is proven that determining whether or not an MLP draws closed separation surfaces in the pattern space is NP-hard. The major conclusion of the paper is somewhat opposite to what is believed and reported in many application papers: MLPs are definitely not adequate for pattern recognition applications requiring reliable rejection and, especially, they are not adequate for pattern verification tasks.
9.
The use of multilayer perceptrons (MLPs) with threshold functions (binary step activations) greatly reduces the complexity of the hardware implementation of neural networks, provides tolerance to noise, and improves the interpretability of the internal representations. In certain cases, such as learning stationary tasks, it may be sufficient to find appropriate weights for an MLP with threshold activation functions by software simulation and then transfer the weight values to the hardware implementation. Efficient training of these networks is a subject of considerable ongoing research. Methods available in the literature mainly focus on two-state (threshold) nodes and try to train the networks either by approximating the gradient of the error function and appropriately modifying gradient descent, or by progressively altering the shape of the activation functions. In this paper, we propose an evolution-motivated approach, which is eminently suitable for networks with threshold functions, and compare its performance with four other methods. The proposed evolutionary strategy does not need gradient-related information, is applicable when threshold activations are used from the beginning of training, as in “on-chip” training, and is able to train networks with integer weights.
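A minimal (1+1)-style evolutionary sketch in the spirit described above, not the paper's strategy: a tiny MLP with binary step activations and integer weights is trained on XOR by mutating the weight vector with small random integer perturbations and keeping a mutant only if it does not increase the error. The network size, mutation range, and acceptance rule are all arbitrary choices.

```python
import numpy as np

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 1, 1, 0])                      # XOR targets

def forward(w, x):
    """2-2-1 MLP with threshold (binary step) activations, integer weights."""
    W1 = w[:4].reshape(2, 2); b1 = w[4:6]
    W2 = w[6:8];              b2 = w[8]
    h = (W1 @ x + b1 > 0).astype(int)
    return int(W2 @ h + b2 > 0)

def errors(w):
    return sum(forward(w, x) != t for x, t in zip(X, y))

rng = np.random.default_rng(2)
w = rng.integers(-2, 3, size=9)                 # integer weights and biases

for step in range(5000):
    mutant = w + rng.integers(-1, 2, size=9)    # small integer mutation
    if errors(mutant) <= errors(w):             # gradient-free selection
        w = mutant
    if errors(w) == 0:
        break

print("misclassifications:", errors(w), "weights:", w)
```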
10.
Fast parallel off-line training of multilayer perceptrons   (Cited by: 2; self-citations: 0; citations by others: 2)
Various approaches to the parallel implementation of second-order gradient-based multilayer perceptron training algorithms are described. Two main classes of algorithm are defined, involving Hessian-based and conjugate-gradient-based methods. The limited- and full-memory Broyden-Fletcher-Goldfarb-Shanno (BFGS) algorithms are selected as representative examples and used to show that the step-size and gradient calculations are critical components. For larger problems, the matrix calculations in the full-memory algorithm are also significant. Various strategies are considered for parallelization, the best of which is implemented on Parallel Virtual Machine (PVM) and transputer-based architectures. Results from a range of problems are used to demonstrate the performance achievable with each architecture. The transputer implementation is found to give excellent speed-ups, but the problem size is limited by memory constraints. The speed-ups achievable with the PVM implementation are much poorer because of inefficient communication, but memory is not a difficulty.
11.
On the initialization and optimization of multilayer perceptrons   (Cited by: 1; self-citations: 0; citations by others: 1)
Multilayer perceptrons are now widely used for pattern recognition, although training remains a time-consuming procedure that often converges to a local optimum. Moreover, as the optimum network size and topology are usually unknown, the search for this optimum requires many networks to be trained. In this paper the authors propose a method for properly initializing the parameters (weights) of a two-layer perceptron, and for identifying (without the need for any error-backpropagation training) the most suitable network size and topology for the problem under investigation. The initialized network can then be optimized by means of the standard error-backpropagation (EBP) algorithm. The authors' method is applicable to any two-layer perceptron comprising concentric as well as squashing units on its hidden layer. The output units are restricted to squashing units, but direct connections from the input to the output layer are also accommodated. To illustrate the power of the method, results obtained for different classification tasks are compared to similar results obtained using traditional error-backpropagation training starting from a random initial state.
12.
We propose an adaptive improved natural gradient algorithm for blind separation of independent sources. First, inspired by the well-known backpropagation algorithm, we incorporate a momentum term into the natural gradient learning process to accelerate the convergence rate and improve the stability. Then an estimation function for the adaptation of the separation model is obtained to adaptively control a step-size parameter and a momentum factor. The proposed natural gradient algorithm with variable step-size parameter and variable momentum factor is therefore particularly well suited to blind source separation in a time-varying environment, such as an abruptly changing mixing matrix or signal power. The expected improvement in the convergence speed, stability, and tracking ability of the proposed algorithm is demonstrated by extensive simulation results in both time-invariant and time-varying environments. The ability of the proposed algorithm to separate extremely weak or badly scaled sources is also verified. In addition, simulation results show that the proposed algorithm is suitable for separating mixtures of many sources (e.g., the number of sources is 10) in the complete case.
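As a rough sketch of the core update underlying the algorithm described above (the paper's estimation functions for the adaptive step size and momentum factor are omitted; the fixed step size, momentum factor, tanh nonlinearity, and toy sources below are all assumptions), the natural gradient update ΔW ∝ (I − f(y)yᵀ)W is applied sample by sample with a momentum term:

```python
import numpy as np

rng = np.random.default_rng(3)
n, T = 2, 20000
S = np.vstack([rng.laplace(size=T), rng.laplace(size=T)])  # toy super-Gaussian sources
A = np.array([[1.0, 0.6], [0.4, 1.0]])                      # mixing matrix
X = A @ S                                                    # observed mixtures

W = np.eye(n)                  # separation matrix
dW = np.zeros((n, n))          # momentum buffer
eta, mom = 0.001, 0.5          # fixed step size and momentum factor (assumed)

for t in range(T):
    y = W @ X[:, t]
    # Natural gradient direction for the separation model,
    # with tanh as the score nonlinearity.
    g = (np.eye(n) - np.outer(np.tanh(y), y)) @ W
    dW = mom * dW + eta * g    # momentum-accelerated natural gradient step
    W = W + dW

# If separation succeeded, W @ A is close to a scaled permutation matrix.
print("global matrix W @ A:\n", W @ A)
```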
13.
An equalized error backpropagation algorithm for the on-line training of multilayer perceptrons   (Cited by: 1; self-citations: 0; citations by others: 1)
The error backpropagation (EBP) training of a multilayer perceptron (MLP) may require a very large number of training epochs. Although the training time can usually be reduced considerably by adopting an on-line training paradigm, it can still be excessive when large networks have to be trained on large amounts of data. In this paper, a new on-line training algorithm is presented. It is called equalized EBP (EEBP), and it offers improved accuracy, speed, and robustness against badly scaled inputs. A major characteristic of EEBP is its use of weight-specific learning rates whose relative magnitudes are derived from a priori computable properties of the network and the training data.
14.
Michalopoulou Z.H. Nolte L.W. Alexandrou D. 《Neural Networks, IEEE Transactions on》1995,6(2):381-386
Multilayer perceptrons trained with the backpropagation algorithm are tested in detection and classification tasks and are compared to optimal algorithms resulting from likelihood ratio tests. The focus is on the problem of one of M orthogonal signals in a Gaussian noise environment, since both the Bayesian detector and the Bayesian classifier are known for this problem and provide a reference for evaluating the performance of the neural networks. Two basic situations are considered: detection and classification. For detection, it was observed that in the signal-known-exactly case (M=1) the performance of the neural detector converges to that of the ideal Bayesian decision processor, while for a higher degree of uncertainty (i.e., a larger M) the performance of the multilayer perceptron is inferior to that of the optimal detector. For classification, the probability of error of the neural network is comparable to the minimum Bayesian error, which can be calculated numerically. Adding noise during the training stage of the network does not affect the performance of the neural detector; however, there is an indication that the presence of noise in the learning process of the neural classifier results in degraded classification performance.
15.
This paper studies the classification mechanisms of multilayer perceptrons (MLPs) with sigmoid activation functions (SAFs). The viewpoint is presented that, in the input space, the hyperplanes determined by the hidden basis functions taking the value 0 do not play the role of decision boundaries, and such hyperplanes do not necessarily pass through the marginal regions between different classes. For solving an n-class problem, a single-hidden-layer perceptron with at least ⌈log2(n-1)⌉ hidden nodes is needed. The final number of hidden neurons is still related to the shapes and regions of the sample distribution, but not to the number of samples or the input dimension. Accordingly, an empirical formula for selecting the initial number of hidden nodes is proposed. The ranks of the response matrices of the hidden layers should be taken as the main basis for pruning or growing the existing hidden neurons. A fixed-structure perceptron ought to be trained more than once from different starting weights for a given classification task, and only the group of weights and biases with the best generalization performance should be retained. Finally, three examples are given to verify the above viewpoints.
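As a small illustration of the rank-based criterion mentioned above (not the paper's full pruning procedure; the network size, data, and deliberately duplicated units are placeholders), the rank of the hidden-layer response matrix over the training samples can be compared with the number of hidden units to flag redundant neurons:

```python
import numpy as np

rng = np.random.default_rng(4)
X = rng.standard_normal((100, 3))            # training samples
n_hidden = 12
W1 = rng.standard_normal((3, n_hidden))
W1[:, 6:] = W1[:, :6]                        # deliberately duplicated hidden units

H = 1.0 / (1.0 + np.exp(-(X @ W1)))          # hidden-layer response matrix
r = np.linalg.matrix_rank(H, tol=1e-8)

print(f"hidden units: {n_hidden}, response-matrix rank: {r}")
if r < n_hidden:
    print(f"about {n_hidden - r} hidden units appear redundant (pruning candidates)")
```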
16.
Ruck D.W. Rogers S.K. Kabrisky M. Maybeck P.S. Oxley M.E. 《IEEE transactions on pattern analysis and machine intelligence》1992,14(6):686-691
The relationship between backpropagation and extended Kalman filtering for training multilayer perceptrons is examined. These two techniques are compared theoretically and empirically using sensor imagery. Backpropagation is a technique from neural networks for assigning weights in a multilayer perceptron. An extended Kalman filter can also be used for this purpose. A brief review of the multilayer perceptron and these two training methods is provided. Then, it is shown that backpropagation is a degenerate form of the extended Kalman filter. The training rules are compared in two examples: an image classification problem using laser radar Doppler imagery and a target detection problem using absolute range images. In both examples, the backpropagation training algorithm is shown to be three orders of magnitude less costly than the extended Kalman filter algorithm in terms of the number of floating-point operations.
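A compact sketch of global extended-Kalman-filter training for a tiny single-output MLP, for contrast with backpropagation; the Jacobian is computed numerically to keep the example short, and the noise covariances, network size, and toy data are assumed values, not those used in the paper.

```python
import numpy as np

def net(w, x, n_in=2, n_hid=3):
    """Tiny 2-3-1 MLP with all weights packed in one vector w."""
    i = 0
    W1 = w[i:i + n_hid * n_in].reshape(n_hid, n_in); i += n_hid * n_in
    b1 = w[i:i + n_hid]; i += n_hid
    W2 = w[i:i + n_hid]; i += n_hid
    b2 = w[i]
    return W2 @ np.tanh(W1 @ x + b1) + b2

def jacobian(w, x, eps=1e-6):
    """Numerical d(net)/d(w); one row, since the output is scalar."""
    J = np.zeros_like(w)
    for k in range(w.size):
        wp = w.copy(); wp[k] += eps
        wm = w.copy(); wm[k] -= eps
        J[k] = (net(wp, x) - net(wm, x)) / (2 * eps)
    return J

rng = np.random.default_rng(5)
n_w = 3 * 2 + 3 + 3 + 1
w = 0.1 * rng.standard_normal(n_w)
P = np.eye(n_w) * 10.0          # weight (state) covariance
R = 0.01                        # measurement noise variance (assumed)

for step in range(2000):
    x = rng.uniform(-1, 1, 2)
    d = np.sin(x[0]) * np.cos(x[1])          # toy target function
    H = jacobian(w, x)                       # linearization of the network
    S = H @ P @ H + R                        # innovation variance (scalar)
    K = (P @ H) / S                          # Kalman gain
    w = w + K * (d - net(w, x))              # EKF weight update
    P = P - np.outer(K, H @ P)               # covariance update

print("output vs target:", net(w, np.array([0.5, -0.3])), np.sin(0.5) * np.cos(-0.3))
```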
17.
Linear-least-squares initialization of multilayer perceptrons through backpropagation of the desired response   (Cited by: 1; self-citations: 0; citations by others: 1)
Erdogmus D. Fontenla-Romero O. Principe J.C. Alonso-Betanzos A. Castillo E. 《Neural Networks, IEEE Transactions on》2005,16(2):325-337
Training multilayer neural networks is typically carried out using descent techniques such as the gradient-based backpropagation (BP) of error or quasi-Newton approaches, including the Levenberg-Marquardt algorithm. This is basically because there are no analytical methods for finding the optimal weights, so iterative local or global optimization techniques are necessary. The success of iterative optimization procedures is strictly dependent on the initial conditions. In this paper, therefore, we devise a principled novel method of backpropagating the desired response through the layers of a multilayer perceptron (MLP), which enables these neural networks to be accurately initialized in the minimum mean-square-error sense using the analytic linear least-squares solution. The generated solution can be used as an initial condition for standard iterative optimization algorithms. However, simulations demonstrate that in most cases the performance achieved through the proposed initialization scheme leaves little room for further improvement in the mean-square error (MSE) over the training set. In addition, the performance of the network optimized with the proposed approach also generalizes well to testing data. A rigorous derivation of the initialization algorithm is presented, and its high performance is verified on a number of benchmark training problems including chaotic time-series prediction, classification, and nonlinear system identification with MLPs.
18.
Sang-Hoon Oh Youngjik Lee 《Neural Networks, IEEE Transactions on》1994,5(3):508-510
Nonlinear transformation is one of the major obstacles to analyzing the properties of multilayer perceptrons. In this letter, we prove that the correlation coefficient between two jointly Gaussian random variables decreases when each of them is transformed under continuous nonlinear transformations, which can be approximated by piecewise linear functions. When the inputs or the weights of a multilayer perceptron are perturbed randomly, the weighted sums to the hidden neurons are asymptotically jointly Gaussian random variables. Since sigmoidal transformation can be approximated piecewise linearly, the correlations among the weighted sums decrease under sigmoidal transformations. Based on this result, we can say that sigmoidal transformation used as the transfer function of the multilayer perceptron reduces redundancy in the information contents of the hidden neurons.
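The result above is easy to probe numerically. The sketch below (the correlation value, sample size, and choice of tanh as the sigmoidal transformation are arbitrary) draws jointly Gaussian pairs and compares the empirical correlation coefficient before and after the transformation:

```python
import numpy as np

rng = np.random.default_rng(6)
rho = 0.8                                   # correlation of the Gaussian pair
cov = np.array([[1.0, rho], [rho, 1.0]])
z = rng.multivariate_normal([0.0, 0.0], cov, size=200_000)

r_before = np.corrcoef(z[:, 0], z[:, 1])[0, 1]
s = np.tanh(z)                              # sigmoidal transformation
r_after = np.corrcoef(s[:, 0], s[:, 1])[0, 1]

print(f"correlation before: {r_before:.4f}, after tanh: {r_after:.4f}")
# Expected: |r_after| < |r_before|, consistent with the stated result.
```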
19.
Specification of training sets and the number of hidden neurons for multilayer perceptrons   (Cited by: 1; self-citations: 0; citations by others: 1)
This work concerns the selection of input-output pairs for improved training of multilayer perceptrons, in the context of approximation of univariate real functions. A criterion for the choice of the number of neurons in the hidden layer is also provided. The main idea is based on the fact that Chebyshev polynomials can provide approximations to bounded functions up to a prescribed tolerance, and, in turn, a polynomial of a certain order can be fitted with a three-layer perceptron with a prescribed number of hidden neurons. The results are applied to a sensor identification example.
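As a small illustration of the first half of the idea above (finding the polynomial order needed to approximate a bounded univariate function to a prescribed tolerance), Chebyshev fits of increasing degree can be tested against the tolerance; the example function and tolerance are placeholders, and the paper's mapping from polynomial order to the number of hidden neurons and its training-set selection rule are not reproduced here.

```python
import numpy as np
from numpy.polynomial import chebyshev as C

def f(x):
    return np.exp(-x) * np.sin(3 * x)        # example bounded target on [-1, 1]

x = np.linspace(-1.0, 1.0, 2001)
tol = 1e-3                                    # prescribed approximation tolerance

for deg in range(1, 50):
    coef = C.chebfit(x, f(x), deg)            # least-squares Chebyshev fit
    err = np.max(np.abs(C.chebval(x, coef) - f(x)))
    if err <= tol:
        break

print(f"degree {deg} achieves max error {err:.2e} <= {tol}")
# The paper relates a polynomial of this order to a three-layer perceptron
# with a corresponding prescribed number of hidden neurons.
```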
20.
The probability that a crisp logical rule applied to imprecise input data is true may be computed using a fuzzy membership function (MF). All reasonable assumptions about input uncertainty distributions lead to MFs of sigmoidal shape. Convolution of several inputs with uniform uncertainty leads to bell-shaped, Gaussian-like uncertainty functions. Relations between input uncertainties and fuzzy rules are systematically explored, and several new types of MFs are discovered. Multilayer perceptron (MLP) networks are shown to be a particular implementation of hierarchical sets of fuzzy threshold logic rules based on sigmoidal MFs; they are equivalent to crisp logical networks applied to input data with uncertainty. Leaving the fuzziness on the input side makes the networks, or the rule systems, easier to understand. Practical applications of these ideas are presented for the analysis of questionnaire data and gene expression data.