Related Literature
20 related documents found.
1.
A covariance matrix self-adaptation evolution strategy (CMSA-ES) was compared with several metaheuristic techniques for multilayer perceptron (MLP)-based function approximation and classification. Function approximation was based on simulations of several 2D functions, and classification analysis was based on nine cancer DNA microarray data sets. Connection weight learning by MLPs was carried out using genetic algorithms (GA-MLP), covariance matrix self-adaptation evolution strategies (CMSA-ES-MLP), back-propagation gradient-based learning (MLP), particle swarm optimization (PSO-MLP), and ant colony optimization (ACO-MLP). During function approximation runs, the input-side activation functions evaluated included linear, logistic, tanh, Hermite, Laguerre, exponential, and radial basis functions, while the output-side function was always linear. For classification, the input-side activation function was always logistic, while the output-side function was always regularized softmax. Self-organizing maps (SOM) and unsupervised neural gas (NG) were used to reduce the dimensions of the original gene expression input features used in classification. Results indicate that for function approximation, use of Hermite polynomials as activation functions at hidden nodes with CMSA-ES-MLP connection weight learning resulted in the greatest fitness levels. On average, the most elite chromosomes were observed for MLP (${\rm MSE}=0.4977$), CMSA-ES-MLP (0.6484), PSO-MLP (0.7472), ACO-MLP (1.3471), and GA-MLP (1.4845). For classification analysis, the overall average performance of the classifiers used was 92.64% (CMSA-ES-MLP), 92.22% (PSO-MLP), 91.30% (ACO-MLP), 89.36% (MLP), and 60.72% (GA-MLP). We have shown that a reliable approach to function approximation can be achieved through application of MLP connection weight learning when the assumed function is unknown. In this scenario, the MLP architecture itself defines the equation used for solving the unknown parameters relating input and output target values. A major drawback of implementing CMSA-ES in an MLP is that when the number of MLP weights is large, the $\mathcal{O}(N^3)$ Cholesky factorization becomes a performance bottleneck. As an alternative, feature reduction using SOM and NG can greatly enhance the performance of CMSA-ES-MLP by reducing $N$. Future research into speeding up Cholesky factorization for CMSA-ES will be helpful in overcoming time complexity problems related to a large number of connection weights.
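Below is a minimal sketch, in the spirit of this abstract, of an MLP hidden layer that uses Hermite polynomials as input-side activation functions with a linear output side. The network sizes, the per-node polynomial orders, and the use of the physicists' Hermite recurrence are illustrative assumptions, not the paper's exact setup.

```python
import numpy as np

def hermite(n, x):
    """Physicists' Hermite polynomial H_n(x) via the recurrence
    H_0 = 1, H_1 = 2x, H_k = 2x*H_{k-1} - 2(k-1)*H_{k-2}."""
    if n == 0:
        return np.ones_like(x)
    h_prev, h = np.ones_like(x), 2.0 * x
    for k in range(2, n + 1):
        h_prev, h = h, 2.0 * x * h - 2.0 * (k - 1) * h_prev
    return h

def mlp_forward(x, W1, b1, w2, b2, orders):
    """Forward pass: hidden node j applies a Hermite polynomial of
    order orders[j] to its net input; the output side is linear,
    matching the function-approximation setup in the abstract."""
    net = W1 @ x + b1                       # hidden net inputs
    h = np.array([hermite(orders[j], net[j]) for j in range(len(net))])
    return w2 @ h + b2                      # linear output node

rng = np.random.default_rng(0)
x = rng.normal(size=3)
W1, b1 = rng.normal(size=(5, 3)), rng.normal(size=5)
w2, b2 = rng.normal(size=5), 0.0
print(mlp_forward(x, W1, b1, w2, b2, orders=[1, 2, 3, 2, 1]))
```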

2.
Dynamical properties of a neural auto-associative memory with two-stage neurons are investigated theoretically. The two-stage neuron is a model whose output is determined by a two-stage nonlinear function of the internal field of the neuron (the internal field is a weighted sum of the outputs of the other neurons). The model is general, including nonmonotonic neurons as well as monotonic ones. Recent studies on associative memory revealed the superiority of nonmonotonic neurons over monotonic ones. The present paper supplies theoretical verification of the high performance of nonmonotonic neurons and proves that the capacity of the auto-associative memory with two-stage neurons is $O(n/\sqrt{\log n})$, in contrast to $O(n/\log n)$ for simple threshold neurons. There is also a discussion of recall processes, in which the radius of the basin of attraction of memorized patterns is clarified. An intuitive explanation of why nonmonotonic neurons improve performance is also provided, by showing the correspondence between the recall processes of the two-stage-neuron net and orthogonal learning.
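To make "nonmonotonic two-stage neuron" concrete, here is one illustrative composition of two nonlinear stages whose overall response rises and then falls. The specific pair of functions is an assumption; the paper only requires the output to be a two-stage nonlinear function of the internal field.

```python
import numpy as np

def two_stage(u, kappa=2.0):
    """Two-stage output g(f(u)): an inner monotonic squashing stage
    followed by an outer stage that suppresses large |u|, so the overall
    response is nonmonotonic (rises near 0, decays for large |u|).
    This particular pair is illustrative, not the paper's exact choice."""
    f = np.tanh(u)                                # stage 1: monotonic
    return f * (1.0 - np.tanh(kappa * u) ** 2)    # stage 2: nonmonotonic overall

u = np.linspace(-3, 3, 7)
print(two_stage(u))   # peaks at moderate |u|, near zero for large |u|
```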

3.
Z. Zhu, H. He. Information Sciences, 2007, 177(5): 1180-1192
A new self-organizing learning array (SOLAR) system has been implemented in software. It is an information-theory-based learning machine capable of handling a wide variety of classification problems. It has self-reconfigurable processing cells (neurons) and an evolvable system structure. Entropy-based learning is performed locally at each neuron, where the neural functions and connections that correspond to minimum entropy are adaptively learned. By choosing connections for each neuron, the system sets up its wiring and completes its self-organization. SOLAR classifies input data based on weighted statistical information from all neurons. Unlike artificial neural networks, its multi-layer structure scales well to large systems capable of solving complex pattern recognition and classification tasks. This paper shows its application in economic and financial fields. A reference to influence diagrams is also discussed. Several prediction and classification cases are studied, and the results are compared with existing methods.

4.
The computational power of a neuron lies in the spatial grouping of synapses belonging to a dendritic tree. Attempts to give a mathematical representation to this grouping process of synapses continue to be a fascinating field of work for researchers in the neural network community. In the literature, we generally find neuron models that comprise summation, radial basis, or product aggregation functions as the basic unit of a feedforward multilayer neural network. All these models and their corresponding networks have their own merits and demerits: the MLP constructs a global approximation to the input–output mapping, while an RBF network, using exponentially decaying localized non-linearities, constructs a local approximation. In this paper, we propose two novel compensatory-type aggregation functions for artificial neurons. They produce the net potential as a linear or non-linear composition of the basic summation and radial basis operations over a set of input signals. The neuron models based on these aggregation functions ensure faster convergence and better training and prediction accuracy. The learning and generalization capabilities of these neurons have been tested on various classification and functional mapping problems. These neurons have also shown excellent generalization ability on two-dimensional transformations.
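The two compensatory aggregations can be sketched as a blend of the summation part and the radial-basis part of the net potential. The convex blend, the Gaussian basis, and the trainable mixing parameter gamma below are assumptions consistent with the abstract's "linear or non-linear composition", not the paper's exact formulas.

```python
import numpy as np

def compensatory_linear(x, w, c, sigma, gamma):
    """Net potential as a convex (linear) blend of the summation part
    and the radial-basis part; gamma would be trained with w, c, sigma."""
    s = w @ x                                             # summation aggregation
    r = np.exp(-np.sum((x - c) ** 2) / (2 * sigma ** 2))  # radial-basis aggregation
    return gamma * s + (1.0 - gamma) * r

def compensatory_nonlinear(x, w, c, sigma, gamma):
    """Product-style 'non-linear composition' of the same two parts
    (assumes the summation part s is positive)."""
    s = w @ x
    r = np.exp(-np.sum((x - c) ** 2) / (2 * sigma ** 2))
    return (s ** gamma) * (r ** (1.0 - gamma))

x = np.array([0.4, 0.7])
w, c, sigma = np.array([0.9, 1.1]), np.array([0.5, 0.5]), 0.8
print(compensatory_linear(x, w, c, sigma, gamma=0.6))
print(compensatory_nonlinear(x, w, c, sigma, gamma=0.6))
```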

5.
In the conventional backpropagation (BP) learning algorithm used for training the connection weights of an artificial neural network (ANN), a sigmoidal activation function with a fixed slope is used. This limitation leads to slower training of the network, because only the weights of the different layers are adjusted by the conventional BP algorithm. To accelerate convergence during the training phase of the ANN, in addition to the weight updates, the slope of the sigmoid function associated with each artificial neuron can also be adjusted using a newly developed learning rule. To achieve this objective, new BP learning rules for slope adjustment of the activation function associated with the neurons are derived in this paper. The combined rules, for both connection weights and sigmoid slopes, are then applied to the ANN structure to achieve faster training. In addition, two benchmark problems, classification and nonlinear system identification, are solved using the trained ANN. The results of simulation-based experiments demonstrate that, in general, the proposed BP learning rules for slope and weight adjustment provide superior convergence during the training phase, as well as improved root mean square error and mean absolute deviation for classification and nonlinear system identification problems.
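Since the abstract spells out the idea (adapt the sigmoid slope alongside the weights), the gradient of a slope-parameterized sigmoid can be written down directly. The single-neuron update below follows from that derivative; the paper's full multilayer rules may differ in detail.

```python
import numpy as np

def sigmoid(net, lam):
    return 1.0 / (1.0 + np.exp(-lam * net))

def train_step(x, target, w, lam, eta=0.1):
    """One gradient step adapting both weights and sigmoid slope.
    For y = sigmoid(lam * net) with net = w @ x:
      dy/dw_i = lam * y(1-y) * x_i    and    dy/dlam = net * y(1-y)."""
    net = w @ x
    y = sigmoid(net, lam)
    err = target - y
    grad = y * (1.0 - y)                  # common sigmoid-derivative factor
    w = w + eta * err * grad * lam * x    # usual weight update
    lam = lam + eta * err * grad * net    # additional slope update
    return w, lam

w, lam = np.array([0.2, -0.1]), 1.0
for _ in range(200):
    w, lam = train_step(np.array([0.5, 0.8]), 0.9, w, lam)
print(w, lam)   # both the weights and the slope adapt toward the target
```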

6.
Castillo, P. A., Carpio, J., Merelo, J. J., Prieto, A., Rivas, V., Romero, G. Neural Processing Letters, 2000, 12(2): 115-128
This paper proposes a new version of a method (G-Prop, genetic backpropagation) that attempts to solve the problem of finding appropriate initial weights and learning parameters for a single-hidden-layer multilayer perceptron (MLP) by combining an evolutionary algorithm (EA) and backpropagation (BP). The EA selects the MLP initial weights and the learning rate, and changes the number of neurons in the hidden layer, through the application of specific genetic operators, one of which is BP training. The EA works on the initial weights and structure of the MLP, which is then trained using QuickProp; thus G-Prop combines the advantages of the global search performed by the EA over the MLP parameter space with the local search of the BP algorithm. The application of the G-Prop algorithm to several real-world and benchmark problems shows that MLPs evolved using G-Prop are smaller and achieve a higher level of generalization than those produced by other perceptron training algorithms, such as QuickPropagation or RPROP, and by other evolutionary algorithms. It also shows some improvement over previous versions of the algorithm.

7.
Multilayer perceptron (MLP) networks trained using backpropagation can be slow to converge in many instances. The primary reason for slow learning is the global nature of backpropagation. Another reason is the fact that a neuron in an MLP network functions as a hyperplane separator and is therefore inefficient when applied to classification problems in which decision boundaries are nonlinear. This paper presents a data representational approach that addresses these problems while operating within the framework of the familiar backpropagation model. We examine the use of receptors with overlapping receptive fields as a preprocessing technique for encoding inputs to MLP networks. The proposed data representation scheme, termed ensemble encoding, is shown to promote local learning and to provide enhanced nonlinear separability. Simulation results for well-known problems in classification and time-series prediction indicate that the use of ensemble encoding can significantly reduce the time required to train MLP networks. Since the choice of representation for input data is independent of the learning algorithm and the functional form employed in the MLP model, nonlinear preprocessing of network inputs may be an attractive alternative for many MLP network applications.
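A common way to realize overlapping receptive fields, used here as an assumption about what ensemble encoding looks like in practice, is a bank of Gaussian receptors spanning the input range; the paper's exact receptor profile and spacing may differ.

```python
import numpy as np

def ensemble_encode(x, n_receptors=5, lo=0.0, hi=1.0, width=None):
    """Encode a scalar input as the activations of overlapping Gaussian
    receptive fields, one value per receptor.  Nearby inputs share
    active receptors, which is what promotes local learning in the MLP."""
    centers = np.linspace(lo, hi, n_receptors)
    if width is None:
        width = (hi - lo) / (n_receptors - 1)   # neighbours overlap
    return np.exp(-((x - centers) ** 2) / (2 * width ** 2))

# A scalar becomes a distributed pattern over five receptors.
print(ensemble_encode(0.3))   # -> approx [0.49 0.98 0.73 0.20 0.02]
```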

8.
A novel structure for radial basis function (RBF) networks is proposed. In this structure, unlike the traditional RBF network, we place weights between the input and hidden layers. These weights, which take values around unity, are multiplication factors for the input vector and perform a linear mapping. This increases the number of free parameters of the network, but since these weights are trainable, the overall performance of the network improves significantly. Because of the new weight vector, we call this structure the weighted RBF (WRBF) network. A weight adjustment formula is derived by applying the gradient descent algorithm. Two classification problems were used to evaluate the performance of the new RBF network: letter classification on a UCI data set with 16 features (a difficult problem) and digit recognition on the HODA data set with 64 features (an easy problem). WRBF was compared with the classic RBF network and the MLP, and our experiments show that WRBF outperforms both significantly. For example, with 200 hidden neurons, WRBF achieved a recognition rate of 92.78% on the UCI data set, while RBF and MLP achieved 83.13% and 89.25%, respectively. On the HODA data set, WRBF reached a 97.94% recognition rate, whereas RBF achieved 97.14% and MLP 97.63%.
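The abstract specifies the WRBF structure precisely enough to sketch its forward pass: input weights near unity rescale the input vector before the usual RBF layer. The layer sizes below mirror the UCI letter experiment; the Gaussian basis and initialization are otherwise assumptions.

```python
import numpy as np

def wrbf_forward(x, input_w, centers, sigmas, output_w):
    """Forward pass of the proposed WRBF: the extra input weights
    (values around unity) linearly rescale the input vector before
    the standard Gaussian RBF hidden layer and linear output layer."""
    z = input_w * x                              # the added weighted mapping
    d2 = np.sum((z - centers) ** 2, axis=1)      # squared distances to centers
    h = np.exp(-d2 / (2 * sigmas ** 2))          # Gaussian hidden activations
    return output_w @ h                          # linear output layer

rng = np.random.default_rng(0)
x = rng.random(16)                                # e.g. the 16 UCI letter features
input_w = 1.0 + 0.05 * rng.standard_normal(16)    # initialized around unity
centers = rng.random((200, 16))                   # 200 hidden neurons, as in the abstract
sigmas = np.full(200, 0.5)
output_w = rng.standard_normal((26, 200))         # 26 letter classes
print(wrbf_forward(x, input_w, centers, sigmas, output_w).shape)  # (26,)
```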

9.
Research on multi-aggregation process neural networks and their learning algorithms
To address information-processing problems in which the system inputs are multivariate process functions and multidimensional process signals, this paper proposes a multi-aggregation process neuron and a multi-aggregation process neural network model. Both the inputs and the connection weights of a multi-aggregation process neuron may be multivariate process functions; its aggregation operations include a spatially weighted aggregation over multiple input functions and an accumulation of multidimensional process effects, so the neuron can simultaneously reflect the joint influence of several multivariate process input signals over a multidimensional space and the accumulated result of the process effect. A multi-aggregation process neural network is a network composed of multi-aggregation process neurons and other types of neurons arranged according to a given structural relationship. Depending on whether the output is a multivariate process function, a general model of the feedforward multi-aggregation process neural network and a model whose inputs and outputs are both process functions are established; these can directly map and model the input-output relationships of multivariate process signals. A learning algorithm combining gradient descent based on multivariate basis-function expansion with numerical computation is given, and simulation results show that the model and algorithm are well suited to the classification of multivariate process signals and the simulation of multidimensional dynamic processes.
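A minimal numerical sketch of the multi-aggregation process neuron described above: the spatial aggregation is a weighted sum of the input functions at each instant, and the process effect is accumulated by integrating over time. The trapezoidal quadrature below stands in for the paper's basis-expansion numerics, and the shapes and activation are assumptions.

```python
import numpy as np

def multi_agg_process_neuron(t, X, W, f=np.tanh):
    """X[i] and W[i] are the i-th input and weight *functions* sampled
    on the time grid t.  Spatial aggregation: weighted sum over inputs
    at each instant.  Process effect: accumulated by integrating over t."""
    spatial = np.sum(W * X, axis=0)        # sum_i w_i(t) * x_i(t)
    accumulated = np.trapz(spatial, t)     # accumulate the process effect
    return f(accumulated)

t = np.linspace(0.0, 1.0, 101)
X = np.vstack([np.sin(2 * np.pi * t), np.cos(2 * np.pi * t)])  # two process inputs
W = np.vstack([t, 1.0 - t])                                    # process-valued weights
print(multi_agg_process_neuron(t, X, W))
```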

10.
Exploiting the fact that spiking neurons encode information in precisely timed multi-spike trains, a new supervised learning algorithm for multilayer spiking neural networks based on convolution is proposed. The algorithm applies kernel convolution to convert discrete spike trains into continuous functions; within a multilayer feedforward spiking network structure, a gradient-descent learning rule expressed in terms of the kernel convolution is derived and used to adjust the synaptic weights of the neuron connections. In the experiments, the algorithm is first shown to learn target spike trains, and is then applied to classify the Iris data set. The results show that the algorithm can learn complex spatiotemporal patterns of spike trains and achieves high classification accuracy on nonlinear pattern-classification problems.
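The kernel-convolution step the abstract describes can be sketched directly: each discrete spike train becomes a continuous function by summing a kernel placed at every spike time, and a distance between convolved trains is what the gradient rule drives down. The Gaussian kernel and time scale are assumptions; the paper's kernel may differ.

```python
import numpy as np

def spikes_to_function(spike_times, t, tau=5.0):
    """Convolve a discrete spike train with a kernel to obtain a
    continuous function of time -- the core conversion in the abstract."""
    s = np.zeros_like(t)
    for tf in spike_times:
        s += np.exp(-((t - tf) ** 2) / (2 * tau ** 2))
    return s

t = np.arange(0.0, 100.0, 0.1)
target = spikes_to_function([10.0, 40.0, 70.0], t)
actual = spikes_to_function([12.0, 45.0, 66.0], t)
# Squared L2 distance between the convolved trains; its gradient with
# respect to the synaptic weights would drive the learning rule.
print(np.trapz((target - actual) ** 2, t))
```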

11.
Implementing linearly nonseparable Boolean functions (non-LSBF) has been an important yet challenging task, due to the extremely high complexity of this kind of function and the exponentially increasing fraction of non-LSBF among all Boolean functions as the number of input variables increases. In this paper, an algorithm named the DNA-like learning and decomposing algorithm (DNA-like LDA) is proposed, which is capable of effectively implementing non-LSBF. The novel algorithm first trains the DNA-like offset sequence and decomposes the non-LSBF into logical XOR operations over a sequence of LSBF, and then determines the weight-threshold values of the multilayer perceptron (MLP) that performs both the decomposition into LSBF and the mapping from the hidden neurons to the output neuron. The algorithm is validated on two typical examples: the problem of approximating the circular region and the well-known $n$-bit parity Boolean function (PBF).

12.
Prior knowledge of the input–output problems often leads to supervised learning restrictions that can hamper the multi-layered perceptron’s (MLP) capacity to find an optimal solution. Restrictions such as fixing weights and modifying input variables may influence the potential convergence of the back-propagation algorithm. This paper will show mathematically how to handle such constraints in order to obtain a modified version of the traditional MLP capable of solving targeted problems. More specifically, it will be shown that fixing particular weights according to prior information as well as transforming incoming inputs can enable the user to limit the MLP search to a desired type of solution. The ensuing modifications pertaining to the learning algorithm will be established. Moreover, four supervised improvements will offer insight on how to control the convergence of the weights towards an optimal solution. Finally, applications involving packing and covering problems will be used to illustrate the potential and performance of this modified MLP.

13.
An important issue in the design and implementation of a neural network is the sensitivity of its output to input and weight perturbations. In this paper, we discuss the sensitivity of the most popular and general feedforward neural network, the multilayer perceptron (MLP). The sensitivity is defined as the mathematical expectation of the output errors of the MLP due to input and weight perturbations, taken with respect to all input and weight values in a given continuous interval. The sensitivity of a single neuron is discussed first, and an approximate analytical expression is derived as a function of the absolute values of the input and weight perturbations. An algorithm is then given to compute the sensitivity of the entire MLP. As intuitively expected, the sensitivity increases with the input and weight perturbations, but the increase has an upper bound that is determined by the structural configuration of the MLP, namely the number of neurons per layer and the number of layers. There exists an optimal number of neurons in a layer that yields the highest sensitivity value. The effect of the number of layers is quite unexpected: the sensitivity of a neural network may decrease at first and then remain almost constant as the number of layers increases.
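The definition of sensitivity used here (expected output error under bounded input and weight perturbations) can be made concrete for a single neuron with a Monte-Carlo estimate. The paper derives an analytical expression instead; the sampling below, and the tanh activation and uniform input/perturbation intervals, are assumptions used only to illustrate the definition.

```python
import numpy as np

def neuron_sensitivity(w, n_samples=100_000, dx=0.05, dw=0.05, rng=None):
    """Monte-Carlo estimate of a single neuron's sensitivity: the
    expected magnitude of the output deviation when inputs and weights
    are perturbed within bounded intervals, averaged over inputs drawn
    from a given interval."""
    rng = rng or np.random.default_rng(0)
    n = len(w)
    x = rng.uniform(-1.0, 1.0, size=(n_samples, n))   # inputs in [-1, 1]
    px = rng.uniform(-dx, dx, size=(n_samples, n))    # input perturbations
    pw = rng.uniform(-dw, dw, size=(n_samples, n))    # weight perturbations
    y = np.tanh(x @ w)
    y_pert = np.tanh((x + px) @ (w + pw))
    return np.mean(np.abs(y_pert - y))                # expected output error

w = np.array([0.5, -0.3, 0.8])
print(neuron_sensitivity(w))
```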

14.
Model compression is required when large models are used, for example for a classification task, under transmission, space, time, or computing constraints. Multilayer perceptron (MLP) models have traditionally been used as classifiers; depending on the problem, they may need a large number of parameters (neuron functions, weights, and biases) to obtain acceptable performance. This work proposes a technique to compress an array of MLPs into the weights of a Volterra neural network (Volterra-NN) while maintaining classification performance. It is shown that several MLP topologies compress well into first-, second-, and third-order Volterra-NN outputs. The results show that these outputs can be used to build an array of Volterra-NNs that needs significantly fewer parameters than the original array of MLPs while retaining the same high accuracy. The Volterra-NN compression capabilities were tested on a face recognition problem, with experimental results presented on two well-known face databases: ORL and FERET.
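A truncated Volterra series up to third order, the compact model whose kernels take the place of the MLP weights, evaluates as follows. How the kernels are derived from the trained MLPs is not reproduced here; this only shows the form of the compressed model's output, with all names and sizes illustrative.

```python
import numpy as np

def volterra_output(x, w0, w1, w2, w3=None):
    """Truncated Volterra series: constant, linear, quadratic, and
    (optionally) cubic kernel terms applied to the input vector x."""
    y = w0 + w1 @ x                                   # 0th + 1st order
    y += x @ w2 @ x                                   # 2nd-order kernel
    if w3 is not None:
        y += np.einsum('i,j,k,ijk->', x, x, x, w3)    # 3rd-order kernel
    return y

n = 4
rng = np.random.default_rng(0)
x = rng.standard_normal(n)
print(volterra_output(x, 0.1, rng.standard_normal(n),
                      rng.standard_normal((n, n)),
                      rng.standard_normal((n, n, n))))
```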

15.
An attempt has been made to improve the performance of deep learning with the multilayer perceptron (MLP). Tuning the learning rate, or finding an optimal learning rate, in an MLP is a major challenge: depending on its value, classification accuracy can vary drastically. This issue is taken up as a challenge in this paper, which proposes a new approach combining an adaptive learning rate with the concept of the Laplacian score for varying the weights. The learning rate is taken as a function of a parameter that is itself updated on the basis of the error gradient over mini-batches, and the Laplacian score of each neuron is further used to update its incoming weights. This removes the bottleneck of finding an optimal learning rate in deep learning with MLPs. On benchmark data sets, this approach is observed to increase classification accuracy compared with the benchmark levels achieved by well-known deep learning methods.
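A heavily hedged sketch of the learning-rate idea only: the rate is a function of a parameter that is itself driven by the mini-batch error gradient. The functional form below is an assumption, and the Laplacian-score weight update is deliberately omitted because the abstract does not specify it.

```python
import numpy as np

def adaptive_lr_step(w, grad, p, beta=0.9, base=0.01):
    """One mini-batch step: p tracks the gradient magnitude, and the
    learning rate is a function of p (assumed form: large gradients
    shrink the step).  The paper's exact rule may differ."""
    p = beta * p + (1.0 - beta) * np.linalg.norm(grad)  # gradient-driven parameter
    lr = base / (1.0 + p)                               # learning rate as f(p)
    return w - lr * grad, p

w, p = np.array([0.5, -0.2]), 0.0
w, p = adaptive_lr_step(w, grad=np.array([0.3, -0.1]), p=p)
print(w, p)
```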

16.
17.
By analyzing the shortcomings of intrusion detection models in wide current use, a new intrusion detection system model based on a spline weight-function neural network is proposed. The network topology is simple, and the number of neurons required for training is independent of the number of samples. After training, the weight functions are cubic splines rather than the constants of traditional methods. The model overcomes problems of traditional intrusion detection systems such as local minima, slow convergence, and sensitivity to initial values.

18.
Sensitivity analysis on a neural network is mainly investigated after the network has been designed and trained; very few studies have considered it a critical issue prior to network design. Piche's statistical method (1992, 1995) is useful for multilayer perceptron (MLP) design, but it imposes severe limitations on both input and weight perturbations. This paper attempts to generalize Piche's method by deriving a universal expression of MLP sensitivity for antisymmetric squashing activation functions, without any restriction on input and output perturbations. Experimental results based on a three-layer MLP with 30 nodes per layer agree closely with our theoretical investigations. The effects of network design parameters such as the number of layers, the number of neurons per layer, and the chosen activation function are analyzed, providing useful information for network design decision-making. Based on the sensitivity analysis of the MLP, we present a network design method that, for a given application, determines the network structure and estimates the permitted weight range for network training.

19.
In this paper, we present a fast-learning, fully complex-valued extreme learning machine classifier, referred to as the 'Circular Complex-valued Extreme Learning Machine (CC-ELM)', for handling real-valued classification problems. CC-ELM is a single-hidden-layer network with non-linear input and hidden layers and a linear output layer. A circular transformation with a translational/rotational bias term, which performs a one-to-one transformation of real-valued features to the complex plane, is used as the activation function for the input neurons. The neurons in the hidden layer employ a fully complex-valued Gaussian-like ('sech') activation function. The input parameters of CC-ELM are chosen randomly and the output weights are computed analytically. This paper also presents an analytical proof that the decision boundaries of a single complex-valued neuron at the hidden and output layers of CC-ELM consist of two hyper-surfaces that intersect orthogonally. These orthogonal boundaries and the input circular transformation help CC-ELM perform real-valued classification tasks efficiently. Performance of CC-ELM is evaluated on a set of benchmark real-valued classification problems from the University of California, Irvine machine learning repository. Finally, the performance of CC-ELM is compared with existing methods on two practical problems, viz., acoustic emission signal classification and mammogram classification. These results show that CC-ELM performs better than existing real-valued and complex-valued classifiers, especially when the data sets are highly unbalanced.
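A sketch of the CC-ELM pipeline under stated assumptions: the circular map below (each real feature sent to the unit circle) is only a plausible stand-in for the paper's transformation with translational/rotational bias, while the random hidden parameters, the 'sech' activation, and the analytically solved output weights follow the abstract's description of ELM-style training.

```python
import numpy as np

rng = np.random.default_rng(0)

def circular_map(X, a=0.5, b=0.1):
    """Assumed one-to-one map of real features onto the complex plane:
    each scaled feature becomes a point on the unit circle.  The paper's
    actual circular transformation may use a different formula."""
    return np.exp(1j * (a * X + b))

def cc_elm_train(X, T, n_hidden=20):
    """ELM-style training: random complex hidden parameters, complex
    'sech' hidden activations, and output weights solved analytically
    by least squares."""
    Z = circular_map(X)
    W = rng.standard_normal((n_hidden, X.shape[1])) \
        + 1j * rng.standard_normal((n_hidden, X.shape[1]))
    b = rng.standard_normal(n_hidden) + 1j * rng.standard_normal(n_hidden)
    H = 1.0 / np.cosh(Z @ W.T + b)        # sech activation, complex-valued
    beta = np.linalg.pinv(H) @ T          # analytic output weights
    return W, b, beta

X = rng.random((100, 8))                                # 100 samples, 8 real features
T = rng.integers(0, 2, size=(100, 3)).astype(float)     # 3-class 0/1 coding
W, b, beta = cc_elm_train(X, T)
print(beta.shape)                                       # (20, 3)
```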

20.
The advances in the biophysics of computation and in neurocomputing models have brought to the foreground the importance of the dendritic structure of the neuron. These structures are assumed to be the basic computational units of the neuron, capable of realizing various mathematical operations. Well-structured higher-order neurons have shown improved computational power and generalization ability. However, these models are difficult to train because of a combinatorial explosion of higher-order terms as the number of inputs to the neuron increases. In this paper we present a neural network using a new neuron architecture, the generalized mean neuron (GMN) model. This neuron model consists of an aggregation function based on the generalized mean of all the inputs applied to it. The resulting neuron model has the same number of parameters as the existing multilayer perceptron (MLP) model but improved computational power. The capability of this model has been tested on classification and time-series prediction problems.
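The generalized-mean aggregation can be sketched directly: a weighted power mean of the inputs replaces the usual weighted sum, so the same parameter count (one weight per input) covers a family of aggregations. The normalization, activation, and positive-input restriction below are assumptions.

```python
import numpy as np

def generalized_mean_neuron(x, w, r, f=np.tanh):
    """Generalized-mean aggregation: a weighted power mean of the inputs.
    r = 1 recovers the arithmetic mean, r = -1 the harmonic mean, and
    r -> 0 approaches the geometric mean.  Inputs are taken positive so
    that fractional powers are well defined."""
    w = w / np.sum(w)                        # normalize to a convex combination
    net = np.sum(w * x ** r) ** (1.0 / r)    # weighted power mean
    return f(net)

x = np.array([0.2, 0.5, 0.9])
w = np.array([1.0, 2.0, 1.0])
for r in (1.0, 2.0, -1.0):
    print(r, generalized_mean_neuron(x, w, r))
```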

