Similar Documents
20 similar documents found (search took 108 ms).
1.
Alternating minimization and Boltzmann machine learning
Training a Boltzmann machine with hidden units is appropriately treated in information geometry using the information divergence and the technique of alternating minimization. The resulting algorithm is shown to be closely related to gradient descent Boltzmann machine learning rules, and the close relationship of both to the EM algorithm is described. An iterative proportional fitting procedure for training machines without hidden units is described and incorporated into the alternating minimization algorithm.
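As a point of reference for the gradient-descent learning rules mentioned above, here is a minimal numpy sketch of the classic Boltzmann weight update, in which the weight change is proportional to the difference between clamped and free-running pairwise correlations. The function and variable names are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def boltzmann_weight_update(clamped_states, free_states, W, lr=0.01):
    """One step of the classic Boltzmann learning rule:
    dW = lr * (<s s^T>_clamped - <s s^T>_free).
    Both state arrays are (n_samples, n_units) 0/1 unit states, sampled
    with the visible units clamped to data vs. running free."""
    pos = clamped_states.T @ clamped_states / len(clamped_states)
    neg = free_states.T @ free_states / len(free_states)
    W += lr * (pos - neg)
    np.fill_diagonal(W, 0.0)  # Boltzmann machines have no self-connections
    return W
```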

2.
周立军, 刘凯, 吕海燕. 《计算机应用》 2018, 38(7): 1872-1876
To address the feature homogenization problem in the unsupervised training of restricted Boltzmann machines (RBM) and the inability of existing sparse restricted Boltzmann machines (SRBM) to adapt their sparsity, a competitive-learning-based sparsity mechanism for RBMs is proposed. First, a distance measure based on the cosine of the angle between a neuron's weight vector and the input vector is designed to evaluate their similarity. Then, during training, the best-matching hidden unit under this distance measure is selected for each sample. Next, a sparsity penalty for the other hidden units is computed from the activation state of the best-matching unit. Finally, the parameters are updated and, following the deep-model training procedure, the competitive sparsity is applied to the construction of a deep Boltzmann machine (DBM). Handwritten digit recognition experiments show that, compared with a sum-of-squared-error regularization term, a DBM built with this sparsity mechanism improves classification accuracy by 0.74% and average sparsity by 5.6%, without requiring a hand-set sparsity parameter. The mechanism can therefore improve the training efficiency of unsupervised models such as RBMs and be applied to the construction of deep models.
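A minimal sketch of the competitive selection step described above, assuming hidden units are scored by the cosine of the angle between their weight vectors and the input; all names are illustrative, not the paper's code.

```python
import numpy as np

def competitive_sparsity_penalty(W, v, h_probs):
    """Score each hidden unit (a column of W) by its cosine similarity to the
    input v, treat the best-matching unit as the winner, and derive a sparsity
    penalty for the remaining units scaled by the winner's activation."""
    cos = (W.T @ v) / (np.linalg.norm(W, axis=0) * np.linalg.norm(v) + 1e-12)
    winner = np.argmax(cos)               # best-matching hidden unit
    penalty = h_probs[winner] * h_probs   # penalize the non-winners
    penalty[winner] = 0.0                 # the winner is not penalized
    return winner, penalty
```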

3.
In this paper, we present two learning mechanisms for artificial neural networks (ANNs) that can be applied to solve classification problems with binary outputs. These mechanisms are used to reduce the number of hidden units of an ANN when trained by the cascade-correlation learning algorithm (CAS). Since CAS adds hidden units incrementally as learning proceeds, it is difficult to predict the number of hidden units required when convergence is reached, and learning must be restarted when the number of hidden units is larger than expected. Our key idea is to provide alternatives in the learning process and to select the best alternative dynamically based on run-time information. Mixed-mode learning (MM), our first algorithm, provides alternative output matrices so that learning is extended to find one of many one-to-many mappings instead of a unique one-to-one mapping. Since this transformation relaxes the objective of learning, the number of learning epochs can be reduced, which in turn leads to a smaller number of hidden units required for convergence. Population-based learning for ANNs (PLAN), our second algorithm, maintains alternative network configurations and selects at run time promising networks to train, based on the error information obtained and the time remaining. This dynamic scheduling avoids training possibly unpromising ANNs to completion before exploring new ones. We show the performance of these two mechanisms by applying them to the two-spiral problem, a two-region classification problem, and the Pima Indian diabetes diagnosis problem.

4.
We present a new learning algorithm for Boltzmann machines that contain many layers of hidden variables. Data-dependent statistics are estimated using a variational approximation that tends to focus on a single mode, and data-independent statistics are estimated using persistent Markov chains. The use of two quite different techniques for estimating the two types of statistic that enter into the gradient of the log likelihood makes it practical to learn Boltzmann machines with multiple hidden layers and millions of parameters. The learning can be made more efficient by using a layer-by-layer pretraining phase that initializes the weights sensibly. The pretraining also allows the variational inference to be initialized sensibly with a single bottom-up pass. We present results on the MNIST and NORB data sets showing that deep Boltzmann machines learn very good generative models of handwritten digits and 3D objects. We also show that the features discovered by deep Boltzmann machines are a very effective way to initialize the hidden layers of feedforward neural nets, which are then discriminatively fine-tuned.
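A single-layer sketch of the two-estimator scheme described above, assuming mean-field probabilities for the data-dependent term and a persistent Gibbs chain for the data-independent term; names, hyperparameters, and the omission of biases are simplifying assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def two_estimator_update(W, v_data, v_chain, lr=0.005, gibbs_steps=5, rng=None):
    """Data-dependent (positive) statistics from deterministic mean-field
    hidden probabilities (iterated to a fixed point across layers in a full
    DBM); data-independent (negative) statistics from a persistent chain."""
    rng = rng or np.random.default_rng(0)
    h_mf = sigmoid(v_data @ W)                      # mean-field probabilities
    pos = v_data.T @ h_mf / len(v_data)
    v = v_chain                                     # persistent chain state
    for _ in range(gibbs_steps):
        p_h = sigmoid(v @ W)
        h = (rng.random(p_h.shape) < p_h).astype(float)
        p_v = sigmoid(h @ W.T)
        v = (rng.random(p_v.shape) < p_v).astype(float)
    neg = v.T @ sigmoid(v @ W) / len(v)
    W += lr * (pos - neg)
    return W, v  # keep the chain state for the next update
```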

5.
Gradient calculations for dynamic recurrent neural networks: a survey
This paper surveys learning algorithms for recurrent neural networks with hidden units and puts the various techniques into a common framework. The author discusses fixed-point learning algorithms, namely recurrent backpropagation and deterministic Boltzmann machines, and non-fixed-point algorithms, namely backpropagation through time, Elman's history cutoff, and Jordan's output feedback architecture. Forward propagation, an on-line technique that uses adjoint equations, and variations thereof, are also discussed. In many cases, the unified presentation leads to generalizations of various sorts. The author discusses the advantages and disadvantages of temporally continuous neural networks in contrast to clocked ones, continues with some "tricks of the trade" for training, using, and simulating continuous-time and recurrent neural networks, presents some simulations, and at the end addresses issues of computational complexity and learning speed.
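Of the non-fixed-point algorithms surveyed above, backpropagation through time is the most common; a minimal sketch for a vanilla tanh RNN with an identity readout and squared-error loss is given below. All names are illustrative assumptions.

```python
import numpy as np

def bptt_gradients(W_h, W_x, xs, ys, h0):
    """Backpropagation through time: forward pass storing hidden states,
    then a backward pass accumulating gradients across all time steps.
    xs, ys: lists of input/target vectors; loss = 0.5 * sum ||h_t - y_t||^2."""
    hs, h = [h0], h0
    for x in xs:                        # forward pass
        h = np.tanh(W_h @ h + W_x @ x)
        hs.append(h)
    dW_h, dW_x = np.zeros_like(W_h), np.zeros_like(W_x)
    dh_next = np.zeros_like(h0)
    for t in reversed(range(len(xs))):  # backward pass through time
        dh = (hs[t + 1] - ys[t]) + dh_next   # local loss grad + carried grad
        dz = dh * (1.0 - hs[t + 1] ** 2)     # through the tanh nonlinearity
        dW_h += np.outer(dz, hs[t])
        dW_x += np.outer(dz, xs[t])
        dh_next = W_h.T @ dz
    return dW_h, dW_x
```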

6.
Ning Wang, Meng Joo Er, Xianyao Meng. 《Neurocomputing》 2009, 72(16-18): 3818
In this paper, we present a fast and accurate online self-organizing scheme for parsimonious fuzzy neural networks (FAOS-PFNN), in which a novel structure learning algorithm incorporating a pruning strategy into new growth criteria is developed. By virtue of the combined growing and pruning strategy, the proposed procedure not only speeds up the online learning process but also yields a more parsimonious fuzzy neural network while achieving comparable performance and accuracy. The FAOS-PFNN starts with no hidden neurons and parsimoniously generates new hidden units according to the proposed growth criteria as learning proceeds. In the parameter learning phase, all the free parameters of the hidden units, whether newly created or already existing, are updated by the extended Kalman filter (EKF) method. The FAOS-PFNN paradigm is compared with other popular approaches like resource allocation network (RAN), RAN via the extended Kalman filter (RANEKF), minimal resource allocation network (MRAN), adaptive-network-based fuzzy inference system (ANFIS), orthogonal least squares (OLS), RBF-AFS, dynamic fuzzy neural networks (DFNN), generalized DFNN (GDFNN), generalized GAP-RBF (GGAP-RBF), online sequential extreme learning machine (OS-ELM) and self-organizing fuzzy neural network (SOFNN) on various benchmark problems in function approximation, nonlinear dynamic system identification, chaotic time-series prediction and real-world regression. Simulation results demonstrate that the proposed FAOS-PFNN algorithm achieves faster learning and a more compact network structure with comparably high approximation and generalization accuracy.
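A generic extended-Kalman-filter parameter update of the kind the abstract refers to; this is a sketch of the standard EKF recursion for a scalar output, not the FAOS-PFNN code, and all names are assumptions.

```python
import numpy as np

def ekf_param_update(theta, P, H, err, R=1.0, Q=1e-6):
    """theta: free-parameter vector; P: parameter covariance;
    H: Jacobian d(output)/d(theta) as a row vector;
    err: scalar output error for the current sample."""
    H = H.reshape(1, -1)
    S = H @ P @ H.T + R                           # innovation covariance
    K = (P @ H.T) / S                             # Kalman gain
    theta = theta + (K * err).ravel()             # parameter correction
    P = P - K @ H @ P + Q * np.eye(len(theta))    # covariance update
    return theta, P
```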

7.
The subspace restricted Boltzmann machine (subspaceRBM) is a third-order Boltzmann machine in which multiplicative interactions couple one visible unit and two hidden units. There are two kinds of hidden units, namely gate units and subspace units. The subspace units reflect variations of a pattern in the data, and the gate unit is responsible for activating the subspace units; additionally, the gate unit can be seen as a pooling feature. We evaluate the behavior of the subspaceRBM through experiments on the MNIST digit recognition task and the Caltech 101 Silhouettes image corpus, measuring cross-entropy reconstruction error and classification error.
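An illustrative energy function for such a third-order model, constructed from the description above: each gate unit switches on a block of subspace units, which interact multiplicatively with the visible vector. The parameter names and exact form are assumptions based on the abstract, not the paper's notation.

```python
import numpy as np

def subspace_rbm_energy(v, g, s, W, b, c, D):
    """v: visible vector (n_visible,); g: gate units (n_gates,);
    s: subspace units (n_gates, n_subspace); W: third-order weight tensor
    (n_gates, n_subspace, n_visible); b, c, D: visible, gate, subspace biases."""
    # third-order visible-gate-subspace interaction term
    interaction = np.einsum('jkn,n,jk,j->', W, v, s, g)
    return -(interaction + b @ v + c @ g + np.sum(D * s * g[:, None]))
```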

8.
Deep belief networks (DBN) are generative neural network models with many layers of hidden explanatory factors, recently introduced by Hinton, Osindero, and Teh (2006) along with a greedy layer-wise unsupervised learning algorithm. The building block of a DBN is a probabilistic model called a restricted Boltzmann machine (RBM), used to represent one layer of the model. Restricted Boltzmann machines are interesting because inference is easy in them and because they have been successfully used as building blocks for training deeper models. We first prove that adding hidden units yields strictly improved modeling power, while a second theorem shows that RBMs are universal approximators of discrete distributions. We then study the question of whether DBNs with more layers are strictly more powerful in terms of representational power. This suggests a new and less greedy criterion for training RBMs within DBNs.

9.
张健, 丁世飞, 丁玲, 张成龙. 《软件学报》 2021, 32(12): 3802-3813
The restricted Boltzmann machine (RBM) is a probabilistic undirected graphical model. Traditional RBMs assume binary hidden units; binary units keep the computation and sampling procedures simple, but binarization loses information in feature extraction and data reconstruction based on the hidden units. Making the visible and hidden units real-valued while keeping training effective is therefore a key open problem in RBM research. To solve it, the binary units are extended to real-valued units that model the data and extract features. Specifically, auxiliary units are added between the visible and hidden layers, and a graph regularization term is introduced into the energy function. With the binary auxiliary units and the graph regularization term, data on the manifold are mapped with higher probability to a parameterized truncated Gaussian distribution, while data far from the manifold are mapped with higher probability to Gaussian noise. The hidden units of the model can thus be represented as real-valued samples from a parameterized truncated Gaussian distribution or from Gaussian noise. The model is called the restricted Boltzmann machine with auxiliary units (ARBM). Its effectiveness is analyzed theoretically, a corresponding deep model is constructed, and experiments verify the model's effectiveness on image reconstruction and image generation tasks.

10.
Many real scenarios in machine learning are dynamic in nature, and learning in such environments represents an important challenge for learning systems. In this context, the model used for learning should work in real time and have the ability to act and react by itself, adjusting its controlling parameters, and even its structure, depending on the requirements of the process. In a previous work, the authors presented an online learning algorithm for two-layer feedforward neural networks that includes a factor weighting the errors committed on each sample; the method is effective in dynamic as well as stationary environments. Building on the method's incremental character, we raise the possibility of adapting the network topology to the learning needs. In this paper, we demonstrate and justify the suitability of the online learning algorithm for working with adaptive structures without significantly degrading its performance. The theoretical basis for the method is given, and its performance is illustrated by applying it to different system identification problems. The results confirm that the proposed method is able to add units to its hidden layer during the learning process without high performance degradation.
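A minimal sketch of an error-weighted online step for a two-layer network, in the spirit of the weighting factor described above; here lam is an assumed per-sample weight (e.g., an exponential forgetting factor), and all names are illustrative.

```python
import numpy as np

def weighted_online_update(W1, W2, x, y, lam, lr=0.05):
    """One online gradient step in which the factor lam scales the error
    committed on the current sample before backpropagation."""
    h = np.tanh(W1 @ x)          # hidden layer
    out = W2 @ h                 # linear output layer
    err = lam * (out - y)        # weighted sample error
    dW2 = np.outer(err, h)
    dW1 = np.outer((W2.T @ err) * (1.0 - h ** 2), x)
    W2 -= lr * dW2
    W1 -= lr * dW1
    return W1, W2
```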

11.
There is no exact method to determine the optimal topology of a multi-layer neural network for a given problem; usually the designer selects a topology and then trains the network. Since determining the optimal topology of a neural network is NP-hard, most existing algorithms are approximate. They can be classified into four main groups: pruning algorithms, constructive algorithms, hybrid algorithms and evolutionary algorithms. These algorithms can produce near-optimal solutions, but most use hill climbing and may get stuck at local minima. In this article, we first introduce a learning automaton and study its behaviour, and then present an algorithm based on it, called the survival algorithm, for determining the number of hidden units of three-layer neural networks. The survival algorithm uses learning automata as a global search method to increase the probability of obtaining the optimal topology. It treats the optimization of network topology as object partitioning rather than as search or parameter optimization, as in existing algorithms. In the survival algorithm, training begins with a large network, and a near-optimal topology is then obtained by adding and deleting hidden units. The algorithm has been tested on a number of problems, and simulations show that the networks generated are near optimal.

12.
Stacking restricted Boltzmann machines (RBM) to create deep networks, such as deep belief networks (DBN) and deep Boltzmann machines (DBM), has become one of the most important research directions in deep learning. DBMs and DBNs provide state-of-the-art results in many fields such as image recognition, but they show no better learning ability than the RBM when dealing with data containing irrelevant patterns. Point-wise gated restricted Boltzmann machines (pgRBM) can effectively find task-relevant patterns in such data and thus achieve satisfactory classification results. To address the limitations of the DBN and the DBM in processing data containing irrelevant patterns, we introduce the pgRBM into both models and present point-wise gated deep belief networks (pgDBN) and point-wise gated deep Boltzmann machines (pgDBM). The pgDBN and the pgDBM both use the pgRBM instead of the RBM to pre-train the weights connecting the networks' visible and hidden layers, applying the pgRBM to learn a task-relevant data subset for the traditional networks. The paper then shows that dropout and weight-uncertainty methods can be developed to prevent overfitting in pgRBM, pgDBN, and pgDBM networks. Experimental results on the MNIST variation datasets show that the pgDBN and the pgDBM are effective deep neural networks for learning from data containing irrelevant patterns.

13.
《Advanced Robotics》 2013, 27(10): 1215-1229
Reinforcement learning is a scheme for unsupervised learning in which robots are expected to acquire behavioral skills through self-exploration based on reward signals. There are difficulties, however, in applying conventional reinforcement learning algorithms to the motion control tasks of a robot, because most algorithms are concerned with discrete state spaces and assume complete observability of the state. Real-world environments are often partially observable, so robots have to estimate unobservable hidden states. This paper proposes a method that solves these two problems by combining a reinforcement learning algorithm with a learning algorithm for a continuous-time recurrent neural network (CTRNN). The CTRNN can learn spatio-temporal structures in a continuous time and space domain and can preserve the contextual flow by self-organizing an appropriate internal memory structure, which enables the robot to deal with the hidden-state problem. We carried out an experiment on the pendulum swing-up task without rotational speed information. Using the proposed algorithm, the task is accomplished in several hundred trials. In addition, it is shown that the information about the rotational speed of the pendulum, which is treated as a hidden state, is estimated and encoded in the activation of a context neuron.
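For reference, one Euler-integration step of standard CTRNN dynamics, the model family the abstract combines with reinforcement learning; this is a generic textbook sketch, and the paper's exact formulation may differ.

```python
import numpy as np

def ctrnn_step(y, I, W, tau, dt=0.01):
    """Integrate tau * dy/dt = -y + W @ sigmoid(y) + I by one Euler step.
    y: neuron potentials; I: external inputs; W: recurrent weights;
    tau: per-neuron time constants."""
    sig = 1.0 / (1.0 + np.exp(-y))
    dydt = (-y + W @ sig + I) / tau
    return y + dt * dydt
```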

14.
A lattice Boltzmann (LB) framework to solve fluid flow control and optimisation problems numerically is presented. Problems are formulated on a mesoscopic basis. In a side condition, the dynamics of a Newtonian fluid is described by a family of simplified Boltzmann-like equations, namely BGK–Boltzmann equations, which are linked to an incompressible Navier–Stokes equation. It is proposed to solve the resulting non-linear optimisation problem by a line search algorithm. The needed derivatives are obtained by deriving the adjoint equations, referred to as adjoint BGK–Boltzmann equations. The primal equations are discretised by standard lattice Boltzmann methods (LBM), while for the adjoint equations a novel discretisation strategy is introduced. The approach follows the main ideas behind LBM and is therefore referred to as the adjoint lattice Boltzmann method (ALBM). The corresponding algorithm retains most of the basic features of LB algorithms. In particular, it enables a highly efficient parallel implementation and thus the solution of large-scale fluid flow control and optimisation problems. The overall solution strategy, the derivation of a prototype adjoint BGK–Boltzmann equation, the novel ALBM and its parallel realisation, as well as its validation, are discussed in detail in this article. Numerical and performance results are presented for a series of steady-state distributed control problems with up to approximately 1.6 million unknown control parameters, obtained on a high-performance computer with up to 256 processing units.
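A textbook sketch of the primal scheme the article builds its adjoint on: one collide-and-stream step of a D2Q9 BGK lattice Boltzmann method with periodic boundaries (no forcing, no walls). This is a generic illustration, not the article's implementation.

```python
import numpy as np

# D2Q9 lattice: discrete velocities and their weights
c = np.array([[0,0],[1,0],[0,1],[-1,0],[0,-1],[1,1],[-1,1],[-1,-1],[1,-1]])
w = np.array([4/9] + [1/9]*4 + [1/36]*4)

def bgk_step(f, tau=0.6):
    """One BGK collide-and-stream step; f: (9, nx, ny) distributions.
    Initialize e.g. with f = w[:, None, None] * np.ones((9, 64, 64))."""
    rho = f.sum(axis=0)                            # macroscopic density
    u = np.einsum('qi,qxy->ixy', c, f) / rho       # macroscopic velocity
    cu = np.einsum('qi,ixy->qxy', c, u)            # c_q . u per direction
    usq = (u ** 2).sum(axis=0)
    feq = w[:, None, None] * rho * (1 + 3*cu + 4.5*cu**2 - 1.5*usq)
    f = f - (f - feq) / tau                        # BGK collision
    for q in range(9):                             # periodic streaming
        f[q] = np.roll(f[q], shift=(c[q, 0], c[q, 1]), axis=(0, 1))
    return f
```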

15.
The extreme learning machine (ELM) is widely used in complex industrial problems; in particular, the online sequential extreme learning machine (OS-ELM) performs well in online industrial modeling. However, OS-ELM requires batch samples to be pre-trained to obtain the initial weights, which may reduce the timeliness of the samples. This paper proposes a novel model for online process regression prediction, called the recurrent extreme learning machine (Recurrent-ELM). The hidden-layer nodes of the Recurrent-ELM are recurrently connected, so the hidden layer receives information both from the current input layer and from the previous hidden layer. Moreover, the weights and biases of the proposed model are obtained by analysis rather than at random. Six regression applications are used to verify the designed Recurrent-ELM against the extreme learning machine (ELM), the fast learning network (FLN), the online sequential extreme learning machine (OS-ELM), and an ensemble of online sequential extreme learning machines (EOS-ELM); the experimental results show that the Recurrent-ELM has better generalization and stability on several samples. In addition, to further test its performance, we employ the Recurrent-ELM in the combustion modeling of a 330 MW coal-fired boiler, comparing it with FLN, SVR and OS-ELM. The results show that the Recurrent-ELM has better accuracy and generalization ability, and the model has potential application value in practice.
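For context, the plain ELM baseline the abstract compares against: random, fixed hidden-layer weights with output weights solved in closed form by least squares. The Recurrent-ELM differs by adding hidden-to-hidden connections and analytic (non-random) weights; the sketch below and its names are illustrative assumptions.

```python
import numpy as np

def elm_train(X, Y, n_hidden=50, seed=0):
    """Fit a basic ELM: X (N, d) inputs, Y (N, m) targets."""
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((X.shape[1], n_hidden))  # fixed random weights
    b = rng.standard_normal(n_hidden)                # fixed random biases
    H = np.tanh(X @ W + b)                           # hidden-layer outputs
    beta, *_ = np.linalg.lstsq(H, Y, rcond=None)     # closed-form solve
    return W, b, beta

def elm_predict(X, W, b, beta):
    return np.tanh(X @ W + b) @ beta
```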

16.
Reinforcement learning is an important method for solving adaptive problems and is widely applied to learning control in continuous state spaces, but it suffers from low efficiency and slow convergence. Building on a back-propagation (BP) neural network and combining it with eligibility traces, an algorithm is proposed that realizes multi-step updates in the reinforcement learning process. It solves the problem of back-propagating the local gradients of the output layer to the hidden-layer nodes, enabling fast updates of the hidden-layer weights, and an algorithmic description is provided. An improved residual method is also proposed: during network training, the weights of each layer are combined with linearly optimized weighting, obtaining both the learning speed of gradient descent and the convergence of the residual-gradient method; applied to the hidden-layer weight updates, it improves the convergence of the value function. The algorithm is verified and analyzed in a simulated inverted-pendulum balancing experiment. The results show that after a short learning period the method successfully controls the inverted pendulum, significantly improving learning efficiency.
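A minimal sketch of the multi-step update idea above: TD(lambda) with eligibility traces maintained over the weights of a small value network, so that the TD error of each step updates the hidden-layer weights of earlier steps as well. Names and hyperparameters are assumptions, not the paper's code.

```python
import numpy as np

def td_lambda_nn_step(W1, W2, e1, e2, x, x_next, r,
                      gamma=0.99, lam=0.9, lr=0.01):
    """One TD(lambda) step for a value network v(x) = W2 @ tanh(W1 @ x).
    e1, e2: eligibility traces with the same shapes as W1, W2."""
    h = np.tanh(W1 @ x)
    v = float(W2 @ h)                            # current state value
    v_next = float(W2 @ np.tanh(W1 @ x_next))    # next state value
    delta = r + gamma * v_next - v               # TD error
    gW2 = h                                      # dv/dW2
    gW1 = np.outer(W2 * (1.0 - h ** 2), x)       # dv/dW1 via backprop
    e2 = gamma * lam * e2 + gW2                  # decay and accumulate traces
    e1 = gamma * lam * e1 + gW1
    W2 += lr * delta * e2                        # multi-step credit assignment
    W1 += lr * delta * e1
    return W1, W2, e1, e2
```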

17.
The total organic carbon (TOC) content is a parameter used directly to evaluate the hydrocarbon generation capacity of a reservoir, and accurately calculating TOC from well logging curves is a problem that still needs to be solved; machine learning models usually yield the most accurate results. Existing machine learning models applied to well logging interpretation suffer from poor feature extraction and a limited ability to learn complex functions. Logging interpretation is, however, a small-sample problem, so traditional deep learning with strong feature extraction ability cannot be used directly. This paper therefore proposes a deep learning model suited to small-sample logging features, an integrated deep learning model (IDLM) combining unsupervised and semi-supervised learning, and applies it to the TOC prediction problem; this study is also the first systematic application of a deep learning model to well logging interpretation. First, the model uses a stacked extreme learning machine sparse autoencoder (SELM-SAE), an unsupervised method, to perform coarse feature extraction on a large number of unlabeled samples, establishing a feature extraction stage consisting of multiple hidden layers. Then the model uses a deep Boltzmann machine (DBM), a semi-supervised method, to learn from a large number of unlabeled samples and a small number of labeled samples (its inputs are the features the SELM-SAE extracts from the logging curve values), and the SELM-SAE and DBM are integrated into a single deep learning model (DLM). Finally, multiple DLMs are combined into the IDLM through an improved weighted bagging algorithm. A total of 2381 samples with unlabeled logging responses from 4 wells in 2 shale gas areas, together with 326 samples with determined TOC values, are used to train the model. Compared with 11 other machine learning models, the IDLM achieves the highest precision. Moreover, simulation shows that, for the TOC prediction problem, once the number of labeled training samples exceeds 20, even an IDLM with 10 hidden layers has a very low probability of overfitting and exhibits the potential to exceed the accuracy of the other models. Relative to the existing mainstream shallow models, the IDLM provides the most advanced performance and is more effective. The method implements a small-sample deep learning algorithm for TOC prediction, shows for the first time that deep learning can feasibly be used for logging interpretation and other small-sample problems, and provides novel insights that can aid oil and gas exploration and development.

18.
Cascade-correlation (Cascor) is a popular supervised learning architecture that dynamically grows layers of hidden neurons with fixed nonlinear activations (e.g., sigmoids), so that the network topology (size, depth) can be determined efficiently. Like a cascade-correlation learning network (CCLN), a projection pursuit learning network (PPLN) also grows its hidden neurons dynamically. Unlike a CCLN, where cascaded connections from the existing hidden units to the new candidate hidden unit are required to establish high-order nonlinearity in approximating the residual error, a PPLN approximates the high-order nonlinearity by using trainable parametric or semi-parametric nonlinear smooth activations based on a minimum mean squared error criterion. An analysis shows that the maximum-correlation training criterion used in a CCLN tends to produce hidden units that saturate, making the CCLN more suitable for classification tasks than for regression tasks, as evidenced in the simulation results. It is also observed that this critical weakness can potentially carry over to classification tasks, such as the two-spiral benchmark used in the original CCLN paper.
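The maximum-correlation criterion discussed above is the standard Cascor candidate score S = sum over outputs of |sum over patterns of (V_p - V_mean)(E_{p,o} - E_mean_o)|, where V is the candidate unit's output and E the residual network errors; a direct numpy sketch follows.

```python
import numpy as np

def cascor_candidate_score(V, E):
    """V: (P,) candidate activations over P patterns;
    E: (P, O) residual errors at the O network outputs.
    Cascor trains the candidate to maximise this score before freezing it
    and installing it as a new hidden unit."""
    Vc = V - V.mean()            # centered candidate activations
    Ec = E - E.mean(axis=0)      # centered residual errors per output
    return np.abs(Vc @ Ec).sum() # summed covariance magnitude over outputs
```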

19.
In order to find an appropriate architecture for a large-scale real-world application automatically and efficiently, a natural method is to divide the original problem into a set of subproblems. In this paper, we propose a simple neural-network task decomposition method based on output parallelism. Using this method, a problem can be divided flexibly into several subproblems as chosen, each composed of the whole input vector and a fraction of the output vector. Each module (for one subproblem) is responsible for producing a fraction of the output vector of the original problem, so the hidden structure for the original problem's output units is decoupled. These modules can be grown and trained in parallel on parallel processing elements. Incorporating a constructive learning algorithm, our method requires neither excessive computation nor prior knowledge about the decomposition. The feasibility of output parallelism is analyzed and proved, and several benchmarks are implemented to test the validity of the method. The results show that the method can reduce computational time, increase learning speed, and improve generalization accuracy for both classification and regression problems.

20.
A certain assumption that appears in the proof of correctness of the standard Boltzmann machine learning procedure is investigated. The assumption, called the clamping assumption, concerns the behavior of a Boltzmann machine when some of its units are clamped to a fixed state. It is argued that the clamping assumption is essentially an assertion of the time reversibility of a certain Markov chain underlying the behavior of the Boltzmann machine. As such, the clamping assumption is generally false, though it is certainly true of the Boltzmann machines themselves. The author also considers how the concept of the Boltzmann machine may be generalized while retaining the validity of the clamping assumption.
