Similar Documents
20 similar documents retrieved.
1.
Evolutionary selection extreme learning machine optimization for regression
Neural network regression models can approximate unknown datasets with low error. As an important method for global regression, the extreme learning machine (ELM) is a typical learning method for single-hidden-layer feedforward networks, owing to its good generalization performance and fast implementation. The randomness of the input weights lets the nonlinear combination achieve arbitrary function approximation. In this paper, we seek an alternative mechanism for the input connections, with an idea derived from evolutionary algorithms. After predefining the number L of hidden nodes, we generate original ELM models. Each hidden node is treated as a gene. The hidden nodes are ranked, and the larger-weight nodes are reassigned to the updated ELM. The L/2 trivial hidden nodes are placed in a candidate reservoir. Then L/2 new hidden nodes are generated and combined with hidden nodes from this reservoir; a second ranking chooses among them, and fitness-proportional selection picks L/2 hidden nodes to recombine the evolutionary-selection ELM. The entire algorithm can be applied to large-scale dataset regression. Verification shows that the regression performance is better than that of the traditional ELM and the Bayesian ELM at lower cost.
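The core ELM training step this line of work builds on — random input weights, then an analytic Moore–Penrose solution for the output weights — can be sketched in a few lines. This is a generic illustration, not the authors' evolutionary-selection code; the tanh activation, hidden-layer size, and toy sine-regression data are arbitrary choices.

```python
import numpy as np

def elm_fit(X, y, L=20, seed=0):
    """Basic ELM: random input weights/biases, analytic output weights."""
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((X.shape[1], L))   # random input weights
    b = rng.standard_normal(L)                 # random hidden biases
    H = np.tanh(X @ W + b)                     # hidden-layer output matrix
    beta = np.linalg.pinv(H) @ y               # Moore-Penrose least squares
    return W, b, beta

def elm_predict(X, W, b, beta):
    return np.tanh(X @ W + b) @ beta

# toy regression target: y = sin(x)
X = np.linspace(-3, 3, 200).reshape(-1, 1)
y = np.sin(X).ravel()
W, b, beta = elm_fit(X, y, L=20)
err = np.mean((elm_predict(X, W, b, beta) - y) ** 2)
```

The "evolutionary" variants described above keep this analytic output-weight step and only change how the random hidden nodes are generated, ranked, and replaced.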

2.
In this paper, the visual quality recognition of nonwovens is treated as a pattern recognition problem and solved by a joint approach combining wavelet energy signatures, a Bayesian neural network, and outlier detection. In this research, 625 nonwoven images of 5 different grades, 125 per grade, are decomposed at 4 levels with the wavelet base sym6; then two energy signatures, norm-1 L1 and norm-2 L2, are calculated from the wavelet coefficients of each high-frequency subband to train and test the Bayesian neural network. To detect outliers in the training set, the scaled outlier probability of the training set and the outlier probability of each sample are introduced. Committees of networks and the evidence criterion are employed to select the 'most suitable' model from a set of candidate networks with different numbers of hidden neurons. However, in our research with finite industrial data, we take both the evidence criterion and the actual performance into account to determine the structure of the Bayesian neural network. When the nonwoven images are decomposed at level 4, with 500 samples used to train a Bayesian neural network with 3 hidden neurons, the average recognition accuracy on the test set is 99.2%. Experimental results on the 625 nonwoven images indicate that the wavelet energy signatures are expressive and powerful in characterizing the texture of nonwoven images and that the robust Bayesian neural network has excellent recognition performance.
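The two energy signatures named above (norm-1 and norm-2 of each subband's coefficients) are straightforward to compute once the subbands exist. Below is a minimal sketch using a hand-written one-level 2-D Haar transform as a stand-in for the level-4 sym6 decomposition the authors actually used; the normalization by coefficient count is an assumption.

```python
import numpy as np

def energy_signatures(subband):
    """norm-1 and norm-2 energy signatures of one subband, normalized
    by the number of coefficients (normalization is an assumption)."""
    c = np.asarray(subband, dtype=float).ravel()
    e1 = np.sum(np.abs(c)) / c.size
    e2 = np.sqrt(np.sum(c ** 2) / c.size)
    return e1, e2

# one-level 2-D Haar transform of a toy "image", written out by hand
img = np.arange(16, dtype=float).reshape(4, 4)
a = (img[0::2] + img[1::2]) / 2            # row averages
d = (img[0::2] - img[1::2]) / 2            # row details
HL = (a[:, 0::2] - a[:, 1::2]) / 2         # high-frequency subbands only,
LH = (d[:, 0::2] + d[:, 1::2]) / 2         # as in the abstract (the LL
HH = (d[:, 0::2] - d[:, 1::2]) / 2         # approximation is not used)
features = [s for band in (HL, LH, HH) for s in energy_signatures(band)]
```

With 4 decomposition levels and 3 detail subbands per level, this yields the 4 × 3 × 2 = 24-dimensional feature vector family the abstract describes.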

3.
The present study aims at developing an artificial neural network (ANN) to predict the compressive strength of concrete. A data set containing a total of 72 concrete samples was used in the study. The concrete mixture parameters were two distinct w/c ratios (0.63 and 0.70), three different types of cement, and three different cure conditions. Compressive strengths were measured at 3, 7, 28 and 90 days. Two different ANN models were developed: one with 4 inputs, a single hidden layer of 9 neurons, and 1 output; the other with two hidden layers of 5 and 6 neurons. For the training of the developed models, 60 experimental data sets obtained prior to the process were used. The 12 experimental data sets not used in the training stage were utilized to test the ANN models. The researchers have reached the conclusion that the ANN provides a good alternative to existing compressive strength prediction methods, where different cements, ages and cure conditions were used as input parameters.

4.
This paper presents a performance enhancement scheme for the recently developed extreme learning machine (ELM) for multi-category sparse data classification problems. ELM is a single hidden layer neural network with good generalization capabilities and extremely fast learning capacity. In ELM, the input weights are randomly chosen and the output weights are analytically calculated. The generalization performance of the ELM algorithm for sparse data classification depends critically on three free parameters: the number of hidden neurons, the input weights, and the bias values, which need to be optimally chosen. Selecting these parameters for the best performance of ELM involves a complex optimization problem. In this paper, we present a new real-coded genetic algorithm approach called 'RCGA-ELM' to select the optimal number of hidden neurons, input weights and bias values for better performance. Two new genetic operators, called the 'network based operator' and the 'weight based operator', are proposed to find a compact network with higher generalization performance. We also present an alternative and less computationally intensive approach called 'sparse-ELM', which searches for the best parameters of ELM using K-fold validation. A multi-class human cancer classification problem using micro-array gene expression data (which is sparse) is used for evaluating the performance of the two schemes. Results indicate that the proposed RCGA-ELM and sparse-ELM significantly improve ELM performance for sparse multi-category classification problems.

5.
A sequential orthogonal approach to the building and training of a single hidden layer neural network is presented in this paper. The Sequential Learning Neural Network (SLNN) model proposed by Zhang and Morris [1] is used to tackle the common problem encountered by the conventional Feed Forward Neural Network (FFNN) of determining the network structure: the number of hidden layers and the number of hidden neurons in each layer. The procedure starts with a single hidden neuron and sequentially adds hidden neurons until the model error is sufficiently small. The classical Gram–Schmidt orthogonalization method is used at each step to form a set of orthogonal bases for the space spanned by the output vectors of the hidden neurons. In this approach it is possible to determine the necessary number of hidden neurons. However, for the problems investigated in this paper, a single hidden neuron is sufficient to achieve the desired accuracy. The neural network architecture has been trained and tested on two practical civil engineering problems: soil classification, and the prediction of strength and workability of high performance concrete.
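The sequential Gram–Schmidt idea — add one hidden neuron at a time, orthogonalize its output vector against the ones already accepted, and project the residual error onto the new direction — can be sketched as below. The random candidate generation, tanh activation, and stopping tolerances are illustrative assumptions, not the SLNN specifics of Zhang and Morris.

```python
import numpy as np

rng = np.random.default_rng(1)
X = np.linspace(0.0, 1.0, 100)
y = np.sin(2 * np.pi * X)                   # toy target function

residual = y.copy()
basis = []                                  # orthonormal hidden-output basis
for _ in range(50):                         # add neurons one at a time
    if np.linalg.norm(residual) < 1e-2:     # stop when model error is small
        break
    w, b = rng.standard_normal(), rng.standard_normal()
    h = np.tanh(w * X + b)                  # candidate neuron's output vector
    for q in basis:                         # Gram-Schmidt: strip components
        h = h - (q @ h) * q                 # already spanned by the basis
    n = np.linalg.norm(h)
    if n < 1e-8:                            # candidate adds nothing new
        continue
    q = h / n
    residual = residual - (q @ residual) * q
    basis.append(q)
```

Because each accepted neuron contributes an orthogonal direction, the error norm is non-increasing at every step, which is what makes the stopping criterion well behaved.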

6.
In this research, a three-layer artificial neural network (ANN) model was developed for precise estimation of the Cr(III) sorption rate, which varied from 17% to 99% on commercial resins, based on 38 experimental data points. The ANN was trained using sorption data obtained at different pH values (2–7), Amberjet 1200H and Diaion CR11 dosages (0.01–0.1 g), initial metal concentrations (4.6–31.7 ppm), contact times (5–240 min), and a temperature of 25°C. A feed-forward back-propagation network with one hidden layer was used, with different training algorithms (trainscg, trainlm, traingdm, traincgp, and trainrp) and different transfer functions (logsig, tansig, and purelin) for the hidden layer, and the purelin transfer function for the output layer. Each model trained for cross-validation was compared with the data that were not used in training. The trainlm algorithm and purelin transfer function with five neurons fitted the training and cross-validation data well. After the best coefficient of determination and mean squared error values were found for the current network, the optimal result was sought by varying the number of neurons in the hidden layer from 1 to 20.

7.
Considering the uncertainty of hidden neurons, choosing significant hidden nodes, known as model selection, plays an important role in applications of extreme learning machines (ELMs). How to define and measure this uncertainty is a key issue of model selection for ELM. From the information geometry point of view, this paper presents a new model selection method of ELM for regression problems based on the Riemannian metric. First, this paper proves theoretically that the uncertainty can be characterized by a form of Riemannian metric. As a result, a new uncertainty evaluation of ELM is proposed by averaging the Riemannian metric over all hidden neurons. Finally, the hidden nodes are added to the network one by one, and at each step a multi-objective optimization algorithm is used to select optimal input weights by simultaneously minimizing this uncertainty evaluation and the norm of the output weights, in order to obtain better generalization performance. Experiments on five UCI regression data sets and a cylindrical shell vibration data set demonstrate that the proposed method generally obtains lower generalization error than the original ELM, evolutionary ELM, ELM with model selection, and the multi-dimensional support vector machine. Moreover, the proposed algorithm generally needs fewer hidden neurons and less computational time than the traditional approaches, which is very favorable in engineering applications.

8.
This work develops an algorithm for the visual quality recognition of nonwoven materials, in which image analysis and a neural network are involved in the feature extraction and pattern recognition stages, respectively. During the feature extraction stage, each image is decomposed into four levels using the 9-7 bi-orthogonal wavelet base. Then the wavelet coefficients in each subband are independently modeled by the generalized Gaussian density (GGD) model to calculate the scale and shape parameters with a maximum likelihood (ML) estimator as texture features. In the recognition stage, a robust Bayesian neural network is employed to classify the 625 nonwoven samples into five visual quality grades, i.e., 125 samples per grade. Finally, we carry out outlier detection on the training set using the outlier probability and select the most suitable model structure and parameters from 40 Bayesian neural networks using Occam's razor. When 18 relevant textural features are extracted for each sample based on the GGD model, the average recognition accuracy of the test set ranges from 88% to 98.4% depending on the number of hidden neurons in the Bayesian neural network.

9.
This paper studies the classification mechanisms of multilayer perceptrons (MLPs) with sigmoid activation functions (SAFs). The viewpoint is presented that, in the input space, the hyperplanes determined by the hidden basis functions with value 0 do not play the role of decision boundaries, and such hyperplanes do not necessarily pass through the marginal regions between different classes. For solving an n-class problem, a single-hidden-layer perceptron with at least ⌈log2(n−1)⌉ hidden nodes is needed. The final number of hidden neurons is related to the sample distribution shapes and regions, but not to the number of samples or input dimensions. As a result, an empirical formula for optimally selecting the initial number of hidden nodes is proposed. The ranks of the response matrices of hidden layers should be taken as the main basis for pruning or growing the existing hidden neurons. A structure-fixed perceptron ought to be trained more than once from different starting weights for one classification task, and only the group of weights and biases with the best generalization performance should be kept. Finally, three examples are given to verify the above viewpoints.
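Taking the bound as ⌈log2(n−1)⌉ hidden nodes (the formula is mis-encoded in the source, so this reading is an assumption), the minimum hidden-node count for a few class counts works out as follows:

```python
import math

def min_hidden_nodes(n_classes):
    """Hedged reading of the abstract's bound: ceil(log2(n-1)), at least 1."""
    return max(1, math.ceil(math.log2(n_classes - 1)))

bounds = {n: min_hidden_nodes(n) for n in (2, 3, 5, 9, 17)}
```

Under this reading the required hidden-layer width grows only logarithmically with the number of classes, which matches the abstract's claim that the count is driven by class structure rather than sample count or input dimension.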

10.
As a novel learning algorithm for single-hidden-layer feedforward neural networks, extreme learning machines (ELMs) have been a promising tool for regression and classification applications. However, it is not trivial for ELMs to find the proper number of hidden neurons due to the nonoptimal input weights and hidden biases. In this paper, a new model selection method for ELM based on multi-objective optimization is proposed to obtain compact networks with good generalization ability. First, a new leave-one-out (LOO) error bound of ELM is derived, which can be calculated with negligible computational cost once ELM training is finished. Furthermore, the hidden nodes are added to the network one by one, and at each step a multi-objective optimization algorithm is used to select optimal input weights by simultaneously minimizing this LOO bound and the norm of the output weights in order to avoid over-fitting. Experiments on five UCI regression data sets demonstrate that the proposed algorithm generally obtains better generalization performance with more compact networks than the conventional gradient-based back-propagation method, the original ELM and the evolutionary ELM.
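A cheap LOO estimate of this kind is typically built on the PRESS identity for linear least squares: since the ELM output layer is linear in the hidden-layer matrix H, each leave-one-out residual equals the training residual divided by (1 − hat_ii), with no retraining. The sketch below illustrates that identity; it is a generic construction, not necessarily the exact bound derived in the paper.

```python
import numpy as np

def elm_loo_press(H, y):
    """PRESS leave-one-out MSE for a linear-in-H output layer."""
    P = H @ np.linalg.pinv(H)          # hat matrix H (H^T H)^-1 H^T
    e = y - P @ y                      # ordinary training residuals
    hat = np.diag(P)
    return np.mean((e / (1.0 - hat)) ** 2)

rng = np.random.default_rng(0)
X = rng.standard_normal((60, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.standard_normal(60)
W = rng.standard_normal((3, 10))       # random ELM input weights
H = np.tanh(X @ W)                     # hidden-layer output matrix
press = elm_loo_press(H, y)
```

Because the hat matrix is a byproduct of solving for the output weights, this estimate really does cost almost nothing once training is done, which is the property the abstract emphasizes.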

11.
In this paper, we present a fast-learning fully complex-valued extreme learning machine classifier, referred to as the 'Circular Complex-valued Extreme Learning Machine (CC-ELM)', for handling real-valued classification problems. CC-ELM is a single hidden layer network with non-linear input and hidden layers and a linear output layer. A circular transformation with a translational/rotational bias term, which performs a one-to-one mapping of real-valued features to the complex plane, is used as the activation function for the input neurons. The neurons in the hidden layer employ a fully complex-valued Gaussian-like ('sech') activation function. The input parameters of CC-ELM are chosen randomly and the output weights are computed analytically. This paper also presents an analytical proof that the decision boundaries of a single complex-valued neuron at the hidden and output layers of CC-ELM consist of two hyper-surfaces that intersect orthogonally. These orthogonal boundaries and the input circular transformation help CC-ELM perform real-valued classification tasks efficiently. The performance of CC-ELM is evaluated using a set of benchmark real-valued classification problems from the University of California, Irvine machine learning repository. Finally, the performance of CC-ELM is compared with existing methods on two practical problems, viz., an acoustic emission signal classification problem and a mammogram classification problem. These results show that CC-ELM performs better than existing real-valued and complex-valued classifiers, especially when the data sets are highly unbalanced.

12.
Variational Bayesian extreme learning machine
Extreme learning machine (ELM) randomly generates the parameters of hidden nodes and then analytically determines the output weights with fast learning speed. The ill-posedness of the hidden-node parameter matrix directly causes unstable performance, and the automatic selection of hidden nodes is critical to maintaining the high efficiency of ELM. Focusing on these two problems, this paper proposes the variational Bayesian extreme learning machine (VBELM). First, a Bayesian probabilistic model is introduced into ELM, where the Bayesian prior distribution avoids the ill-posed problem of the hidden node matrix. Then, variational approximate inference is employed in the Bayesian model to compute the posterior distribution and the independent variational hyperparameters, which can be used to select the hidden nodes automatically. Theoretical analysis and experimental results show that VBELM has more stable performance with more compact architectures, provides probabilistic predictions in contrast to traditional point predictions, and supplies a hyperparameter criterion for hidden node selection.

13.
We present a texture analysis methodology that combines uncommitted machine-learning techniques and partial least squares (PLS) in a fully automatic framework. Our approach introduces a robust PLS-based dimensionality reduction (DR) step to specifically address outliers and high-dimensional feature sets. The texture analysis framework was applied to the diagnosis of knee osteoarthritis (OA). To classify between healthy subjects and OA patients, a generic bank of texture features was extracted from magnetic resonance images of tibial knee bone. The features were used as input to the DR algorithm, which first applied a PLS regression to rank the features and then defined the best number of features to retain in the model by an iterative learning phase. Outliers in the dataset, which could inflate the number of selected features, were eliminated in a pre-processing step. To cope with the limited number of samples, the data were evaluated using Monte Carlo cross-validation (CV). The developed DR method demonstrated consistency in selecting a relatively homogeneous set of features across the CV iterations. For each CV group, a median of 19% of the original features was selected, and considering all CV groups, the method selected 36% of the original features available. The diagnosis evaluation reached a generalization area under the ROC curve of 0.92, which was higher than established cartilage-based markers known to relate to OA diagnosis.

14.
This article discusses the use of design of computer experiments (DOCE) (i.e., experiments run with a computer model to find how a set of inputs affects a set of outputs) to obtain a force–displacement meta-model (i.e., a mathematical equation that summarizes and aids in analyzing the input–output data of a DOCE) of compliant mechanisms (CMs). The procedure discussed produces a force–displacement meta-model, or closed analytic vector function, that aims to control CMs in real time. In our work, the factorial and space-filling DOCE meta-model of CMs is supported by finite element analysis (FEA). The protocol discussed is used to model the HexFlex mechanism functioning under quasi-static conditions. The HexFlex is a parallel CM for nano-manipulation that allows six degrees of freedom (x, y, z, θx, θy, θz) of its moving platform. In the multi-linear model fit of the HexFlex, the products or interactions proved to be negligible, yielding a linear model (i.e., linear in the inputs) for the operating range. The accuracy of the meta-model was calculated by conducting a set of computer experiments with a random uniform distribution of the input forces. Three error criteria were recorded, comparing the meta-model prediction with the results of the FEA experiments: (1) the maximum of the absolute value of the error, (2) the relative error, and (3) the root mean square error. The maximum errors of our model are lower than high-precision manufacturing tolerances and are also lower than those reported by other researchers who have tried to fit meta-models to the HexFlex mechanism.
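Fitting a linear force–displacement meta-model to DOCE samples, and scoring it with two of the three error criteria listed above, reduces to a least-squares problem. The compliance matrix and sample counts below are synthetic placeholders, not HexFlex data:

```python
import numpy as np

rng = np.random.default_rng(0)
F = rng.uniform(-1.0, 1.0, size=(40, 6))       # sampled 6-D force vectors
C_true = 1e-3 * rng.standard_normal((6, 6))    # hypothetical compliance matrix
U = F @ C_true.T                               # resulting displacements

C_fit, *_ = np.linalg.lstsq(F, U, rcond=None)  # linear meta-model fit
pred = F @ C_fit
max_abs_err = np.max(np.abs(pred - U))         # criterion (1)
rmse = np.sqrt(np.mean((pred - U) ** 2))       # criterion (3)
```

With interactions negligible, as the abstract reports, a single matrix of fitted coefficients is the entire meta-model, which is what makes real-time evaluation feasible.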

15.
Applied Soft Computing, 2007, 7(3): 1112–1120
In this paper, an artificial neural network (ANN) model is proposed to predict the first lactation 305-day milk yield (FLMY305) using partial lactation records pertaining to the Karan Fries (KF) crossbred dairy cattle. A scientifically determined optimum dataset of representative breeding traits of the cattle is used to develop the model. Several training algorithms, viz., (i) the gradient descent algorithm with adaptive learning rate; (ii) the Fletcher–Reeves conjugate gradient algorithm; (iii) the Polak–Ribiére conjugate gradient algorithm; (iv) the Powell–Beale conjugate gradient algorithm; (v) the quasi-Newton algorithm with Broyden–Fletcher–Goldfarb–Shanno (BFGS) update; and (vi) the Levenberg–Marquardt algorithm with Bayesian regularization, along with various network architectural parameters, i.e., data partitioning strategy, initial synaptic weights, number of hidden layers, number of neurons in each hidden layer, activation functions, regularization factor, etc., are experimentally investigated to arrive at the best model for predicting the FLMY305. Also, a multiple linear regression (MLR) model is developed for the milk-yield prediction. The performances of the ANN and MLR models are compared to assess the relative prediction capability of the former. It emerges from this study that the performance of the ANN model is slightly superior to that of the conventional regression model. Hence, it is recommended that ANNs can potentially be used as an alternative technique to predict FLMY305 in KF cattle.

16.
Virtual reality (VR) devices have recently become popular; however, research on input methods for VR devices is lacking. The main input methods of current commercial devices fall into two categories: manual selection using a controller and gaze selection. This study aims to derive the optimal input method and timing for VR devices by analyzing the performance of these input methods. A study is conducted in which participants wear a VR headset and select an activated input button from a 3 × 3 array on a VR device, with two button sizes and two input methods. The manual selection method exhibits a shorter task completion time but a greater number of errors than the gaze selection method. For the gaze selection method, the task completion time and target search time were shortest when the gaze timing was 1 s. Further, the button size was statistically significant only for the manual selection method. The results of this study can be used as a reference in future VR user experience and product design.

17.
Multilayer perceptrons are widely used in classification problems. For multilayer perceptrons composed of hyperplane threshold neurons used for classification, this paper derives an analytical expression for the maximum number of regions into which the input-layer neurons can partition the input space. This index largely characterizes the classification capability of the perceptron's input layer. The paper also discusses the constraint relationship between the number of hidden-layer neurons and the number of input-layer neurons, obtaining a tighter upper bound on the number of hidden-layer neurons. When the dimension of the classification space is much smaller than the number of input-layer neurons, the upper bound obtained here is smaller than existing results.
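The region count referred to above is presumably related to the classical hyperplane-arrangement bound: N hyperplanes in general position split d-dimensional space into at most Σᵢ₌₀..d C(N, i) regions. A minimal sketch (the connection to the paper's exact expression is an assumption):

```python
from math import comb

def max_regions(n_hyperplanes, dim):
    """Max regions n hyperplanes in general position cut R^dim into:
    sum_{i=0}^{dim} C(n, i)."""
    return sum(comb(n_hyperplanes, i) for i in range(dim + 1))

# e.g. 3 lines partition the plane into at most 7 regions
regions = max_regions(3, 2)
```

When the dimension is much smaller than the number of hyperplanes, this sum is far below 2^N, which is consistent with the abstract's claim of a tighter hidden-neuron bound in that regime.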

18.
Abstract: A multilayer perceptron is known to be capable of approximating any smooth function to any desired accuracy if it has a sufficient number of hidden neurons. But its training, based on the gradient method, is usually a time consuming procedure that may converge toward a local minimum, and furthermore its performance is greatly influenced by the number of hidden neurons and their initial weights. Usually these crucial parameters are determined based on the trial and error procedure, requiring much experience on the designer's part.
In this paper, a constructive design method (CDM) has been proposed for a two-layer perceptron that can approximate a class of smooth functions whose feature vector classes are linearly separable. Based on the analysis of a given data set sampled from the target function, feature vectors that can characterize the function well are extracted and used to determine the number of hidden neurons and the initial weights of the network. But when the classes of the feature vectors are not linearly separable, the network may not be trained easily, mainly due to interference among the hyperplanes generated by hidden neurons. Next, to compensate for this interference, a refined version of the modular neural network (MNN) has been proposed where each network module is created by CDM. After the input space has been partitioned into many local regions, a two-layer perceptron constructed by CDM is assigned to each local region. By doing this, the feature vector classes are more likely to become linearly separable in each local region and, as a result, the function may be approximated with greatly improved accuracy by the MNN. An example simulation illustrates the improvement in learning speed using a smaller number of neurons.

19.
Extreme learning machine (ELM) [G.-B. Huang, Q.-Y. Zhu, C.-K. Siew, Extreme learning machine: a new learning scheme of feedforward neural networks, in: Proceedings of the International Joint Conference on Neural Networks (IJCNN2004), Budapest, Hungary, 25-29 July 2004], a novel learning algorithm much faster than traditional gradient-based learning algorithms, was proposed recently for single-hidden-layer feedforward neural networks (SLFNs). However, ELM may need a larger number of hidden neurons due to the random determination of the input weights and hidden biases. In this paper, a hybrid learning algorithm is proposed which uses a differential evolutionary algorithm to select the input weights and the Moore–Penrose (MP) generalized inverse to analytically determine the output weights. Experimental results show that this approach achieves good generalization performance with much more compact networks.

20.
Huang et al. (2004) have proposed an on-line sequential ELM (OS-ELM) that enables the extreme learning machine (ELM) to train data one-by-one as well as chunk-by-chunk. OS-ELM is based on a recursive least-squares-type algorithm that uses a constant forgetting factor. In OS-ELM, the parameters of the hidden nodes are randomly selected and the output weights are determined from the sequentially arriving data. However, OS-ELM with a constant forgetting factor cannot provide satisfactory performance in time-varying or nonstationary environments. Therefore, we propose an algorithm for OS-ELM with an adaptive forgetting factor that maintains good performance in such environments. The proposed algorithm has the following advantages: (1) the adaptive forgetting factor requires minimal additional complexity of O(N), where N is the number of hidden neurons, and (2) the algorithm with the adaptive forgetting factor is comparable to the conventional OS-ELM with an optimal forgetting factor.
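The recursive least-squares core of OS-ELM with a forgetting factor λ can be sketched as a rank-one update of the inverse covariance and output weights. This is a generic RLS illustration with an arbitrary fixed λ = 0.98 and synthetic noiseless data, not the authors' adaptive-λ scheme:

```python
import numpy as np

def rls_update(P, beta, h, y_t, lam=0.98):
    """One recursive least-squares step with forgetting factor lam."""
    h = h.reshape(-1, 1)
    g = P @ h / (lam + (h.T @ P @ h).item())   # gain vector
    e = y_t - (h.T @ beta).item()              # a-priori prediction error
    beta = beta + g * e                        # output-weight update
    P = (P - g @ h.T @ P) / lam                # covariance update, forgetting
    return P, beta

rng = np.random.default_rng(0)
L = 5
P = 100.0 * np.eye(L)                          # large initial covariance
beta = np.zeros((L, 1))
true_w = rng.standard_normal((L, 1))           # synthetic stationary target
for _ in range(300):
    h = rng.standard_normal(L)                 # stand-in hidden-layer output
    y_t = (h @ true_w).item()                  # noiseless observation
    P, beta = rls_update(P, beta, h, y_t)
err = float(np.linalg.norm(beta - true_w))
```

The adaptive-λ idea above replaces the constant `lam` with one recomputed per sample, so the effective data window shrinks when the environment drifts; the O(N) overhead claim refers to that per-sample recomputation.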


Copyright © 北京勤云科技发展有限公司 (Beijing Qinyun Technology Development Co., Ltd.)  京ICP备09084417号