Similar Articles (20 results)
1.
Discusses the ability of multilayer perceptrons (MLPs) to model the probability distribution of data in typical pattern recognition and verification problems. It is proven that multilayer perceptrons with sigmoidal units and a number of hidden units less than or equal to the number of inputs are unable to model patterns distributed in typical clusters, since these networks draw open separation surfaces in the pattern space. When using more hidden units than inputs, the separation surfaces can be closed; unfortunately, it is proven that determining whether or not an MLP draws closed separation surfaces in the pattern space is NP-hard. The major conclusion of the paper is somewhat opposite to what is believed and reported in many application papers: MLPs are definitely not adequate for pattern recognition applications requiring reliable rejection and, especially, they are not adequate for pattern verification tasks.

2.
This work concerns the selection of input-output pairs for improved training of multilayer perceptrons, in the context of approximation of univariate real functions. A criterion for the choice of the number of neurons in the hidden layer is also provided. The main idea is based on the fact that Chebyshev polynomials can provide approximations to bounded functions up to a prescribed tolerance, and, in turn, a polynomial of a certain order can be fitted with a three-layer perceptron with a prescribed number of hidden neurons. The results are applied to a sensor identification example.

3.
Statistical active learning in multilayer perceptrons (total citations: 2, self-citations: 0, citations by others: 2)
Proposes methods for generating input locations actively in gathering training data, aiming at solving problems unique to multilayer perceptrons. One of the problems is that optimum input locations, which are calculated deterministically, sometimes distribute densely around the same point and cause local minima in backpropagation training. Two probabilistic active learning methods, which utilize the statistical variance of locations, are proposed to solve this problem. One is parametric active learning and the other is multipoint-search active learning. Another serious problem in applying active learning to multilayer perceptrons is that the Fisher information matrix can be singular, while many methods, including the proposed ones, assume its regularity. A technique of pruning redundant hidden units is proposed to keep the Fisher information matrix regular. Combined with this technique, active learning can be applied stably to multilayer perceptrons. The effectiveness of the proposed methods is demonstrated through computer simulations on simple artificial problems and a real-world problem of color conversion.

4.
A structure composed of local linear perceptrons for approximating global class discriminants is investigated. Such local linear models may be combined in a cooperative or competitive way. In the cooperative model, a weighted sum of the outputs of the local perceptrons is computed, where the weight is a function of the distance between the input and the position of the local perceptron. In the competitive model, the cost function dictates a mixture model where only one of the local perceptrons gives an output. Learning of the local models' positions and of the linear mappings they implement is coupled, and both are supervised. We show that this is preferable to the uncoupled case, where the positions are trained in an unsupervised manner before the separate, supervised training of the mappings. We use goodness criteria based on the cross-entropy and give learning equations for both the cooperative and competitive cases. The coupled and uncoupled versions of the cooperative and competitive approaches are compared among themselves and with multilayer perceptrons of sigmoidal hidden units and radial basis function (RBF) networks of Gaussian units on the recognition of handwritten digits. The criteria of comparison are generalization accuracy, learning time, and the number of free parameters. We conclude that even on such a high-dimensional problem, such local models are promising. They generalize much better than RBFs and use much less memory. Compared with multilayer perceptrons, local models learn much faster and generalize as well, and sometimes better, with a comparable number of parameters.
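The cooperative combination described in this abstract can be sketched numerically. The dimensions, positions, local linear maps, and the softmax-of-distances weighting below are illustrative stand-ins, not the paper's trained parameters:

```python
import numpy as np

rng = np.random.default_rng(0)

# hypothetical setup: M local linear perceptrons in d dimensions
M, d = 4, 2
positions = rng.standard_normal((M, d))   # position of each local perceptron
W = rng.standard_normal((M, d))           # local linear maps
b = rng.standard_normal(M)                # local biases

def cooperative_output(x, tau=1.0):
    # weight each local model by a softmax of its negative squared
    # distance to the input -- one possible distance-based weighting
    d2 = ((positions - x) ** 2).sum(axis=1)
    g = np.exp(-d2 / tau)
    g /= g.sum()                          # weights sum to one
    local = W @ x + b                     # output of every local perceptron
    return float(g @ local)               # cooperative weighted sum

y = cooperative_output(np.array([0.5, -0.2]))
print(y)
```

In the competitive variant, the softmax weights would instead select a single local model (e.g. the nearest one) to produce the output.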

5.
Research on the Internal Decision Patterns of Multilayer Perceptron Networks (total citations: 5, self-citations: 0, citations by others: 5)
The study of the internal behavior of artificial neural networks (ANNs) is of great significance to understanding the working mechanisms of biological neural systems, to ANN theory, and to ANN applications. Building on the authors' earlier work, this paper gives an explicit interpretation of the internal behavior of multilayer perceptron networks applied to pattern recognition, classification, function approximation, and parameter estimation. Taking the single-hidden-layer structure as the canonical case, the hidden-neuron outputs are defined as the "(positive or negative) internal components" of the network output, and the hidden-layer weight distribution is defined as the "internal decision pattern" by which the network solves a problem. Examples of applying this theoretical analysis are given.

6.
Two-Phase Construction of Multilayer Perceptrons Using Information Theory (total citations: 2, self-citations: 0, citations by others: 2)
This brief presents a two-phase construction approach for pruning both input and hidden units of multilayer perceptrons (MLPs) based on mutual information (MI). First, all features of input vectors are ranked according to their relevance to target outputs through a forward strategy. The salient input units of an MLP are thus determined according to the order of the ranking result and by considering their contributions to the network's performance. Then, the irrelevant features of input vectors can be identified and eliminated. Second, the redundant hidden units are removed from the trained MLP one after another according to a novel relevance measure. Compared with related work, the proposed strategy exhibits better performance. Moreover, experimental results show that the proposed method is comparable or even superior to support vector machines (SVMs) and support vector regression (SVR). Finally, the advantages of the MI-based method are investigated in comparison with the sensitivity analysis (SA)-based method.
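A minimal sketch of the first phase's idea, ranking input features by their mutual information with the target. The histogram MI estimator and the synthetic features below are assumptions for illustration, not the brief's actual procedure:

```python
import numpy as np

def mutual_info(x, y, bins=8):
    # plug-in (histogram) estimate of I(X;Y) in nats
    pxy, _, _ = np.histogram2d(x, y, bins=bins)
    pxy /= pxy.sum()
    px = pxy.sum(axis=1, keepdims=True)   # marginal of x
    py = pxy.sum(axis=0, keepdims=True)   # marginal of y
    nz = pxy > 0                          # skip empty cells to avoid log(0)
    return float((pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])).sum())

rng = np.random.default_rng(1)
n = 5000
labels = rng.integers(0, 2, n).astype(float)      # binary target
features = np.column_stack([
    labels + 0.3 * rng.standard_normal(n),        # strongly relevant
    0.5 * labels + rng.standard_normal(n),        # weakly relevant
    rng.standard_normal(n),                       # irrelevant noise
])

scores = [mutual_info(features[:, k], labels) for k in range(3)]
ranking = np.argsort(scores)[::-1]                # most relevant first
print(ranking)
```

Features falling at the bottom of such a ranking, and contributing little to network performance, are the candidates for elimination.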

7.
In this paper, we address the problem of robustness in multilayer perceptrons. We present the main theoretical results in the case of linear neural networks with one hidden layer, in order to go beyond the empirical study we previously made. We show that robustness can be greatly improved, even without decreasing performance in normal use. Finally, we show how this behavior, clearly demonstrated in the linear case, approximates the behavior of nonlinear networks.

8.
This paper proposes a new error function at hidden layers to speed up the training of multilayer perceptrons (MLPs). With this new hidden error function, the layer-by-layer (LBL) algorithm approximately converges to the error backpropagation algorithm with optimum learning rates. In particular, the optimum learning rate for a hidden weight vector appears approximately as the product of two optimum factors, one for minimizing the new hidden error function and the other for assigning hidden targets. The effectiveness of the proposed error function was demonstrated on handwritten digit recognition and isolated-word recognition tasks. Very fast learning convergence was obtained for MLPs without the stalling problem experienced with conventional LBL algorithms.

9.
The learning process of multilayer perceptron neural networks (MLPs) often exhibits singular behavior and easily becomes trapped in plateaus, both of which are directly related to the singular regions that exist in the MLP parameter space. When the weights of two hidden nodes of an MLP are close to being mutual negatives, permutation symmetry leads to learning difficulties. This paper analyzes the learning dynamics near these sign-inverse singular regions. An analytical expression for the averaged learning equations is first derived; theoretical learning trajectories near the singular region are then given, and the actual learning trajectories near it are obtained numerically. Through simulation experiments, the average, batch, and online learning dynamics of MLPs are observed and comparatively analyzed.

10.
This paper presents a part-of-speech tagging method based on a min-max modular neural-network model. The method has three main steps. First, a large-scale tagging problem is decomposed into a number of relatively smaller and simpler subproblems according to the class relations among a given training corpus. Secondly, all of the subproblems are learned by smaller network modules in parallel. Finally, following two simple module combination laws, all of the trained network modules are integrated into a modular parallel tagging system that produces solutions to the original tagging problem. The proposed method has several advantages over existing tagging systems based on multilayer perceptrons. (1) Training times can be drastically reduced and desired learning accuracy can be easily achieved; (2) the method can scale up to larger tagging problems; (3) the tagging system has quick response and facilitates hardware implementation. In order to demonstrate the effectiveness of the proposed method, we perform simulations on two different language corpora: a Thai corpus and a Chinese corpus, which have 29,028 and 45,595 ambiguous words, respectively. We also compare our method with several existing tagging models including hidden Markov models, multilayer perceptrons and neuro-taggers. The results show that both the learning accuracy and generalization performance of the proposed tagging model are better than statistical models and multilayer perceptrons, and they are comparable to the most successful tagging models.
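The decompose-train-combine scheme can be illustrated with a toy min-max modular classifier on synthetic data. The least-squares pairwise discriminants below are hypothetical stand-ins for the paper's trained network modules:

```python
import numpy as np

rng = np.random.default_rng(0)

# three well-separated 2-D clusters as a toy three-class problem
centers = np.array([[0.0, 0.0], [5.0, 0.0], [0.0, 5.0]])
X = np.vstack([c + 0.3 * rng.standard_normal((20, 2)) for c in centers])
y = np.repeat([0, 1, 2], 20)
classes = [0, 1, 2]

def train_pairwise(i, j):
    # stand-in for a small network module: least-squares linear
    # discriminant for class i (target +1) versus class j (target -1)
    mask = (y == i) | (y == j)
    A = np.hstack([X[mask], np.ones((mask.sum(), 1))])
    t = np.where(y[mask] == i, 1.0, -1.0)
    w, *_ = np.linalg.lstsq(A, t, rcond=None)
    return lambda Z: np.hstack([Z, np.ones((len(Z), 1))]) @ w

modules = {(i, j): train_pairwise(i, j)
           for i in classes for j in classes if i != j}

def m3_predict(Z):
    # MIN combination within a class, MAX combination across classes
    scores = [np.min(np.stack([modules[(i, j)](Z)
                               for j in classes if j != i]), axis=0)
              for i in classes]
    return np.argmax(np.stack(scores), axis=0)

accuracy = float((m3_predict(X) == y).mean())
print(accuracy)
```

Each pairwise module sees only two classes' data, so the modules can be trained independently and in parallel, which is the source of the training-time advantage the abstract claims.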

11.
Multilayer perceptrons are widely used in classification problems. For multilayer perceptrons built from hyperplane threshold neurons and used for classification, this paper derives an analytical expression for the maximum number of regions into which the input-layer neurons can partition the input space; this measure largely characterizes the classification capability of the perceptron's input layer. The paper also discusses the constraint relationship between the number of hidden-layer neurons and the number of input-layer neurons, and obtains a tighter upper bound on the number of hidden-layer neurons. When the dimension of the classification space is much smaller than the number of input-layer neurons, the upper bound obtained here is smaller than existing results.

12.
A new multilayer incremental neural network (MINN) architecture and its performance in classification of biomedical images are discussed. The MINN consists of an input layer, two hidden layers and an output layer. The first stage, between the input and first hidden layer, consists of perceptrons. The number of perceptrons and their weights are determined by defining a fitness function which is maximized by a genetic algorithm (GA). The second stage involves feature vectors which are the codewords obtained automatically after learning the first stage. The last stage consists of OR gates which combine the nodes of the second hidden layer representing the same class. The comparative performance results of the MINN and the backpropagation (BP) network indicate that the MINN results in faster learning, a much simpler network, and equal or better classification performance.

13.
Fault diagnosis of analog circuits is a key problem in the theory of circuit networks and has been investigated by many researchers in recent decades. In this paper, an active filter circuit is used as the circuit under test (CUT) and is simulated in both fault-free and faulty conditions. A modular neural network model is proposed in this paper for soft fault diagnosis of the CUT. To optimize the structure of neural network modules in the proposed scheme, particle swarm optimization (PSO) algorithm is used to determine the number of hidden layer nodes of neural network modules. In addition, the output weight optimization–hidden weight optimization (OWO-HWO) training algorithm is employed, instead of conventional output weight optimization–backpropagation (OWO-BP) algorithm, to improve convergence speed in training of the neural network modules in proposed modular model. The performance of the proposed method is compared to that of monolithic multilayer perceptrons (MLPs) trained by OWO-BP and OWO-HWO algorithms, K-nearest neighbor (KNN) classifier and a related system with the same CUT. Experimental results show that the PSO-optimized modular neural network model which is trained by the OWO-HWO algorithm offers higher correct fault location rate in analog circuit fault diagnosis application as compared to the classic and monolithic investigated neural models.

14.
Nonlinear transformation is one of the major obstacles to analyzing the properties of multilayer perceptrons. In this letter, we prove that the correlation coefficient between two jointly Gaussian random variables decreases when each of them is transformed under continuous nonlinear transformations, which can be approximated by piecewise linear functions. When the inputs or the weights of a multilayer perceptron are perturbed randomly, the weighted sums to the hidden neurons are asymptotically jointly Gaussian random variables. Since sigmoidal transformation can be approximated piecewise linearly, the correlations among the weighted sums decrease under sigmoidal transformations. Based on this result, we can say that sigmoidal transformation used as the transfer function of the multilayer perceptron reduces redundancy in the information contents of the hidden neurons.
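The central claim, that the correlation between jointly Gaussian variables shrinks when each is passed through a sigmoidal function, is easy to check numerically. This is a simulation sketch of the phenomenon, not the letter's proof:

```python
import numpy as np

rng = np.random.default_rng(0)
n, rho = 200_000, 0.8

# construct a jointly Gaussian pair with correlation rho
z1 = rng.standard_normal(n)
z2 = rho * z1 + np.sqrt(1.0 - rho ** 2) * rng.standard_normal(n)

before = float(np.corrcoef(z1, z2)[0, 1])
# apply the same sigmoidal (tanh) transformation to each variable
after = float(np.corrcoef(np.tanh(z1), np.tanh(z2))[0, 1])
print(before, after)
```

With these settings the sample correlation drops below its pre-transformation value, consistent with the letter's redundancy-reduction interpretation.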

15.
We investigate the problem of learning two-layer neural nets of nonoverlapping perceptrons where each input unit is connected to one and only one hidden unit. We first show that this restricted problem with no overlap at all between the receptive fields of the hidden units is as hard as the general problem (with total overlap) if the learner uses examples only. However, if membership queries are allowed, the restricted problem is indeed easier to solve. We give a learning algorithm that uses examples and membership queries to PAC-learn the intersection of K nonoverlapping perceptrons, regardless of whether the instance space is Boolean, discrete, or continuous. An extension of this algorithm is proven to PAC-learn two-layer nets with K nonoverlapping perceptrons. The simulations performed indicate that both algorithms are fast and efficient.

16.
Bounds on the number of hidden neurons in multilayer perceptrons (total citations: 4, self-citations: 0, citations by others: 4)
Fundamental issues concerning the capability of multilayer perceptrons with one hidden layer are investigated. The studies are focused on realizations of functions which map from a finite subset of E^n into E^d. Real-valued and binary-valued functions are considered. In particular, a least upper bound is derived for the number of hidden neurons needed to realize an arbitrary function which maps from a finite subset of E^n into E^d. A nontrivial lower bound is also obtained for realizations of injective functions. This result can be applied in studies of pattern recognition and database retrieval. An upper bound is given for realizing binary-valued functions that are related to pattern-classification problems.

17.
M. J. Neurocomputing 2008, 71(7-9): 1321-1329
The Bayesian information criterion (BIC) is widely used by the neural-network community for model selection tasks, although its convergence properties are not always theoretically established. In this paper we focus on estimating the number of components in a mixture of multilayer perceptrons and proving the convergence of the BIC criterion in this framework. The penalized marginal likelihood for mixture models and hidden Markov models introduced by Keribin [Consistent estimation of the order of mixture models, Sankhya Indian J. Stat. 62 (2000) 49–66] and, respectively, Gassiat [Likelihood ratio inequalities with applications to various mixtures, Ann. Inst. Henri Poincare 38 (2002) 897–906] is extended to mixtures of multilayer perceptrons, for which a penalized-likelihood criterion is proposed. We prove its convergence under hypotheses which essentially involve the bracketing entropy of the generalized score-function class, and illustrate it with some numerical examples.

18.
This paper presents the use of a neural network and a decision tree, which is evolved by genetic programming (GP), in thalassaemia classification. The aim is to differentiate between thalassaemic patients, persons with thalassaemia trait and normal subjects by inspecting characteristics of red blood cells, reticulocytes and platelets. A structured representation on genetic algorithms for non-linear function fitting, or STROGANOFF, is the chosen architecture for the genetic programming implementation. For comparison, multilayer perceptrons are explored for classification via a neural network. The classification results indicate that the performance of the GP-based decision tree is approximately equal to that of the multilayer perceptron with one hidden layer. But the multilayer perceptron with two hidden layers, which is proven to have the most suitable architecture among networks with different numbers of hidden layers, outperforms the GP-based decision tree. Nonetheless, the structure of the decision tree reveals that some input features have no effect on the classification performance. The results confirm that the classification accuracy of the multilayer perceptron with two hidden layers can still be maintained after the removal of the redundant input features. Detailed analysis of the classification errors of the multilayer perceptron with two hidden layers, in which a reduced feature set is used as the network input, is also included. The analysis reveals that classification ambiguity and misclassification between persons with minor thalassaemia trait and normal subjects is the main cause of classification errors. These results suggest that a combination of a multilayer perceptron with blood cell analysis may give rise to a guideline/hint for further investigation of thalassaemia classification.

19.
Amari S, Park H, Ozeki T. Neural Computation 2006, 18(5): 1007-1065
The parameter spaces of hierarchical systems such as multilayer perceptrons include singularities due to the symmetry and degeneration of hidden units. A parameter space forms a geometrical manifold, called the neuromanifold in the case of neural networks. Such a model is identified with a statistical model, and a Riemannian metric is given by the Fisher information matrix. However, the matrix degenerates at singularities. Such a singular structure is ubiquitous not only in multilayer perceptrons but also in gaussian mixture probability densities, ARMA time-series models, and many other cases. The standard statistical paradigm of the Cramér-Rao theorem does not hold, and the singularity gives rise to strange behaviors in parameter estimation, hypothesis testing, Bayesian inference, model selection, and in particular, the dynamics of learning from examples. Prevailing theories so far have not paid much attention to the problem caused by singularity, relying only on ordinary statistical theories developed for regular (nonsingular) models. Only recently have researchers remarked on the effects of singularity, and theories are now being developed. This article gives an overview of the phenomena caused by the singularities of statistical manifolds related to multilayer perceptrons and gaussian mixtures. We demonstrate our recent results on these problems. Simple toy models are also used to show explicit solutions. We explain that the maximum likelihood estimator is no longer subject to the gaussian distribution even asymptotically, because the Fisher information matrix degenerates; that model selection criteria such as AIC, BIC, and MDL fail to hold in these models; that a smooth Bayesian prior becomes singular in such models; and that the trajectories of the dynamics of learning are strongly affected by the singularity, causing plateaus or slow manifolds in the parameter space.
The natural gradient method is shown to perform well because it takes the singular geometrical structure into account. The generalization error and the training error are studied in some examples.
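The degeneracy of the Fisher information matrix at such singularities can be seen in a toy two-hidden-unit model. The least-squares FIM approximation below is an illustrative sketch under assumed parameter values, not the article's derivation:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(50)   # sample inputs

def fim(w1, w2, v1, v2):
    # Fisher information (up to the noise variance) of the toy model
    # f(x) = v1*tanh(w1*x) + v2*tanh(w2*x), built from the output Jacobian
    J = np.column_stack([
        v1 * x * (1.0 - np.tanh(w1 * x) ** 2),   # d f / d w1
        v2 * x * (1.0 - np.tanh(w2 * x) ** 2),   # d f / d w2
        np.tanh(w1 * x),                         # d f / d v1
        np.tanh(w2 * x),                         # d f / d v2
    ])
    return J.T @ J / len(x)

rank_regular = np.linalg.matrix_rank(fim(0.5, 1.5, 1.0, 1.0))
rank_singular = np.linalg.matrix_rank(fim(1.0, 1.0, 1.0, 1.0))  # units coincide
print(rank_regular, rank_singular)
```

When the two hidden units coincide (w1 = w2), the Jacobian columns become linearly dependent and the matrix loses rank, which is exactly the degeneration behind the plateaus and the failure of AIC/BIC/MDL discussed above.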

20.
This paper presents learning multilayer Potts perceptrons (MLPotts) for data-driven function approximation. A Potts perceptron is composed of a receptive field and a K-state transfer function that is generalized from the sigmoid-like transfer functions of traditional perceptrons. An MLPotts network is organized to perform translation from a high-dimensional input to the sum of multiple postnonlinear projections, each with its own postnonlinearity realized by a weighted K-state transfer function. MLPotts networks span a function space that theoretically covers the network functions of multilayer perceptrons. Compared with traditional perceptrons, weighted Potts perceptrons realize more flexible postnonlinear functions for nonlinear mappings. Numerical simulations show that MLPotts learning by the Levenberg–Marquardt (LM) method significantly improves traditional supervised learning of multilayer perceptrons for data-driven function approximation.
