Related Articles
20 related articles found
1.
In this work, a variational Bayesian framework for efficient training of echo state networks (ESNs) with automatic regularization and delay&sum (D&S) readout adaptation is proposed. The algorithm builds on classical batch learning of ESNs. By treating the network echo states as fixed basis functions parameterized with delay parameters, we propose a variational Bayesian ESN training scheme. The variational approach allows for a seamless combination of sparse Bayesian learning ideas and a variational Bayesian space-alternating generalized expectation-maximization (VB-SAGE) algorithm for estimating parameters of superimposed signals. The former realizes automatic regularization of ESNs and also determines which echo states and input signals are relevant for "explaining" the desired signal; the latter provides a basis for joint estimation of the D&S readout parameters. The proposed training algorithm extends naturally to ESNs with fixed filter neurons and generalizes the recently proposed expectation-maximization-based D&S readout adaptation method. The proposed algorithm was tested on synthetic data prediction tasks as well as on dynamic handwritten character recognition.
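As a minimal sketch of the batch-ESN setting the paper starts from (plain NumPy; a fixed ridge penalty `lam` stands in for the automatic variational regularization, and no D&S readout is modeled — all sizes and constants below are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy task: one-step-ahead prediction of a sine wave.
T = 500
u = np.sin(0.1 * np.arange(T + 1))
inputs, targets = u[:-1], u[1:]

# Reservoir: fixed random recurrent weights, rescaled to spectral radius 0.9.
n_res = 100
W = rng.standard_normal((n_res, n_res))
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))
W_in = rng.uniform(-0.5, 0.5, n_res)

# Collect echo states with the usual tanh update.
x = np.zeros(n_res)
states = np.empty((T, n_res))
for t in range(T):
    x = np.tanh(W @ x + W_in * inputs[t])
    states[t] = x

# Batch readout: ridge regression on the collected states, discarding a washout.
lam = 1e-6
washout = 50
S, y = states[washout:], targets[washout:]
w_out = np.linalg.solve(S.T @ S + lam * np.eye(n_res), S.T @ y)

pred = S @ w_out
mse = np.mean((pred - y) ** 2)
```

The variational treatment in the paper would replace the single scalar `lam` with per-state relevance hyperparameters inferred from the data.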

2.
Husmeier D. Neural Computation, 2000, 12(11): 2685-2717
Training probability-density estimating neural networks with the expectation-maximization (EM) algorithm aims to maximize the likelihood of the training set and therefore leads to overfitting for sparse data. In this article, a regularization method for mixture models with generalized linear kernel centers is proposed, which adopts the Bayesian evidence approach and optimizes the hyperparameters of the prior by type II maximum likelihood. This includes a marginalization over the parameters, which is done by Laplace approximation and requires the derivation of the Hessian of the log-likelihood function. The incorporation of this approach into the standard training scheme leads to a modified form of the EM algorithm, which includes a regularization term and adapts the hyperparameters on-line after each EM cycle. The article presents applications of this scheme to classification problems, the prediction of stochastic time series, and latent space models.
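The starting point that the paper regularizes is standard EM for a mixture density; a minimal, unregularized sketch for a two-component 1-D Gaussian mixture (data and initial values are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)

# Two-component 1-D Gaussian mixture data.
x = np.concatenate([rng.normal(-2.0, 0.5, 300), rng.normal(2.0, 0.5, 300)])

# Initial guesses for weights, means, variances.
pi = np.array([0.5, 0.5])
mu = np.array([-1.0, 1.0])
var = np.array([1.0, 1.0])

for _ in range(50):
    # E-step: responsibilities of each component for each point.
    dens = pi / np.sqrt(2 * np.pi * var) * np.exp(-(x[:, None] - mu) ** 2 / (2 * var))
    r = dens / dens.sum(axis=1, keepdims=True)
    # M-step: maximize the expected complete-data log-likelihood.
    n_k = r.sum(axis=0)
    pi = n_k / len(x)
    mu = (r * x[:, None]).sum(axis=0) / n_k
    var = (r * (x[:, None] - mu) ** 2).sum(axis=0) / n_k
```

Because this maximizes the training likelihood directly, it overfits sparse data; the paper's modification adds a regularization term to the M-step and re-adapts the prior hyperparameters after each EM cycle.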

3.
Variational Bayes learning, or mean field approximation, is widely used in statistical models built from mixtures of exponential-family distributions, for example, normal mixtures, binomial mixtures, and hidden Markov models. To derive a variational Bayes learning algorithm, the hyperparameters of the prior distribution must be determined; however, no design method for these hyperparameters has yet been established. In the present paper, we propose two design methods for the hyperparameters, each suited to a different purpose. In the first, the hyperparameter is chosen to minimize the generalization error; in the second, it is chosen so that candidates for hidden structure in the training data are extracted. It is experimentally shown that the optimal hyperparameters for the two purposes differ from each other.

4.
This work both develops a novel Bayesian probabilistic model for a general class of robust multiple-measurement-vector sparse signal recovery problems with impulsive noise and derives an improved variational Bayesian method to recover the original joint row-sparse signals. In the model, two three-level hierarchical Bayesian estimation procedures characterize the impulsive noise and the joint row-sparse source signals by means of Gaussian scale mixtures and a multivariate generalized t distribution. The hidden variables of the signal and measurement models are estimated within a variational Bayesian framework, in which several kinds of probability distributions express their features. The proposed algorithm is a full Bayesian inference approach based on variational Bayesian estimation. It is robust to impulsive noise, since the posterior distribution can be approximated effectively by estimating the unknown parameters. Extensive simulation results show that the proposed algorithm significantly outperforms competing robust sparse signal recovery approaches under different kinds of impulsive noise.

5.
This paper presents a variational Bayes expectation-maximization algorithm for time series based on Attias' variational Bayesian theory. The proposed algorithm is applied to the blind source separation (BSS) problem to estimate both the source signals and the mixing matrix for the optimal model structure. The distribution of the mixing matrix is assumed to be matrix Gaussian, to capture the correlation between its elements, and the inverse covariance of the sensor noise is assumed to be Wishart distributed, to capture the correlation between sensor noises. A mixture-of-Gaussians model approximates the distribution of each independent source. Update rules for the posterior hyperparameters and the posterior of the model structure are derived, and the optimal model structure is selected as the one with the largest posterior. The source signals and the mixing matrix are estimated by applying LMS and MAP estimators, respectively, to the posterior distributions of the hidden variables and the model parameters for the optimal structure. The proposed algorithm is tested on synthetic data. The results show that (1) the log posterior of the model structure increases with the accuracy of the posterior mixing matrix, and (2) the accuracies of the prior mixing matrix, the estimated mixing matrix, and the estimated source signals increase with the log posterior of the model structure. The algorithm is also applied to magnetoencephalography (MEG) data to localize the sources of equivalent current dipoles.

6.
Recursive Bayesian Recurrent Neural Networks for Time-Series Modeling
This paper develops a probabilistic approach to recursive second-order training of recurrent neural networks (RNNs) for improved time-series modeling. A general recursive Bayesian Levenberg–Marquardt algorithm is derived to sequentially update the weights and the covariance (Hessian) matrix. The main strengths of the approach are a principled handling of the regularization hyperparameters that leads to better generalization, and stable numerical performance. The framework involves the adaptation of a noise hyperparameter and local weight prior hyperparameters, which represent the noise in the data and the uncertainties in the model parameters. Experimental investigations using artificial and real-world data sets show that RNNs equipped with the proposed approach outperform standard real-time recurrent learning and extended Kalman training algorithms for recurrent networks, as well as other contemporary nonlinear neural models, on time-series modeling.

7.
In statistical image segmentation, the distribution of pixel values is usually assumed to be Gaussian, and the optimal result is taken to be the one with maximum a posteriori (MAP) probability. Despite its prevalence and computational efficiency, however, the Gaussian assumption is not always strictly satisfied and hence may lead to less accurate results. Although variational Bayes inference (VBI), in which the statistical model parameters are also treated as random variables, has been widely used, it can hardly handle the spatial information embedded in pixels. In this paper, we incorporate spatial smoothness constraints on pixel labels, described by the Markov random field (MRF) model, into the VBI process, and thus propose a novel statistical model called VBI-MRF for image segmentation. We evaluated our algorithm against the variational expectation-maximization (VEM) algorithm and algorithms based on the hidden Markov random field (HMRF) model and the MAP-MRF model, on both noise-corrupted synthetic images and mosaics of natural texture. Our pilot results suggest that the proposed algorithm segments images more accurately than the other three methods and is capable of producing robust image segmentation.

8.
Gradient-based optimization of hyperparameters
Bengio Y. Neural Computation, 2000, 12(8): 1889-1900
Many machine learning algorithms can be formulated as the minimization of a training criterion that involves a hyperparameter. This hyperparameter is usually chosen by trial and error with a model selection criterion. In this article we present a methodology to optimize several hyperparameters, based on the computation of the gradient of a model selection criterion with respect to the hyperparameters. In the case of a quadratic training criterion, the gradient of the selection criterion with respect to the hyperparameters is efficiently computed by backpropagating through a Cholesky decomposition. In the more general case, we show that the implicit function theorem can be used to derive a formula for the hyperparameter gradient involving second derivatives of the training criterion.
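The core idea — the hyperparameter gradient of a model-selection criterion obtained through the implicit function theorem — can be illustrated on ridge regression, where the trained weights w(λ) are available in closed form (the data split and step sizes below are illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic regression data split into training and validation sets.
X, Xv = rng.standard_normal((80, 5)), rng.standard_normal((40, 5))
w_true = rng.standard_normal(5)
y = X @ w_true + 0.1 * rng.standard_normal(80)
yv = Xv @ w_true + 0.1 * rng.standard_normal(40)

def val_loss_and_grad(lam):
    # Training: quadratic criterion with hyperparameter lam (ridge penalty).
    A = X.T @ X + lam * np.eye(5)
    w = np.linalg.solve(A, X.T @ y)
    # Implicit function theorem applied to the ridge optimality condition
    # A w = X'y gives dw/dlam = -A^{-1} w.
    dw = -np.linalg.solve(A, w)
    # Model selection criterion: squared error on the validation set.
    resid = Xv @ w - yv
    return resid @ resid, 2.0 * resid @ Xv @ dw

loss, grad = val_loss_and_grad(1.0)

# Finite-difference check of the analytic hyperparameter gradient.
eps = 1e-5
fd = (val_loss_and_grad(1.0 + eps)[0] - val_loss_and_grad(1.0 - eps)[0]) / (2 * eps)
```

With several hyperparameters, `grad` becomes a vector and an outer gradient-descent loop over λ replaces grid search.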

9.
蒋云良, 赵康, 曹军杰, 范婧, 刘勇. 《控制与决策》, 2021, 36(8): 1825-1833
In recent years, as deep learning models, and deep reinforcement learning models in particular, have kept growing, their training cost, i.e. the hyperparameter search space, has grown as well. Most traditional hyperparameter search algorithms, however, run training sequentially and may take weeks or even months to find a good hyperparameter configuration. To address the long search times and the difficulty of finding good configurations in deep reinforcement learning, a new hyperparameter search algorithm is proposed: population-evolution-based asynchronous parallel hyperparameter search (PEHS). The algorithm combines ideas from evolutionary algorithms and, under a fixed resource budget, searches the population of models and their hyperparameters asynchronously and in parallel, thereby improving performance. The search algorithm is designed and implemented to run on the Ray parallel distributed framework. Experiments show that population-evolution-based asynchronous parallel hyperparameter search on the parallel framework outperforms traditional hyperparameter search algorithms, with stable performance.
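A toy, single-process simulation of a population-based exploit/explore search of the kind PEHS describes (the synthetic `step_score` objective, population size, and schedule constants are invented for illustration; the actual algorithm runs asynchronously on Ray):

```python
import random

random.seed(3)

# Toy objective: a "model" improves each step by an amount that depends on
# its learning-rate hyperparameter; the optimum is lr = 0.1.
def step_score(lr, score):
    return score + 0.1 * (1.0 - min(abs(lr - 0.1) / 0.1, 1.0))

# Population of workers, each with its own hyperparameter and running score.
pop = [{"lr": 10 ** random.uniform(-3, 0), "score": 0.0} for _ in range(8)]

for step in range(40):
    for m in pop:
        m["score"] = step_score(m["lr"], m["score"])
    if step % 5 == 4:
        # Exploit/explore: the bottom two workers copy a top performer's
        # weights (here, the score) and perturb its hyperparameter.
        pop.sort(key=lambda m: m["score"], reverse=True)
        for loser, winner in zip(pop[-2:], pop[:2]):
            loser["lr"] = winner["lr"] * random.choice([0.8, 1.25])
            loser["score"] = winner["score"]

best = max(pop, key=lambda m: m["score"])
```

Because exploitation happens while training continues, poor configurations are abandoned early instead of being trained to completion, which is where the wall-clock savings over sequential search come from.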

10.
In this article, a new denoising algorithm is proposed based on the directionlet transform and maximum a posteriori (MAP) estimation. The detail directionlet coefficients of the logarithmically transformed noise-free image are modeled by zero-mean Gaussian mixture probability density functions (PDFs), and the speckle noise in the directionlet domain is modeled as additive noise with a Gaussian distribution. We then develop a Bayesian MAP estimator using these assumed prior distributions. Because the estimator that solves the MAP equation is a function of the parameters of the assumed mixture PDF models, the expectation-maximization (EM) algorithm is used to estimate those parameters, including the weight factors and variances. Finally, the noise-free SAR image is restored from the coefficients yielded by the MAP estimator. Experimental results show that the directionlet-based MAP method successfully suppresses speckle in both simulated images and real synthetic aperture radar images.
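The flavor of MAP estimation under a Gaussian signal prior and additive Gaussian noise can be seen in a one-component simplification of the paper's mixture prior, where the estimator reduces to linear shrinkage of each transform-domain coefficient (variances below are illustrative):

```python
import numpy as np

rng = np.random.default_rng(4)

# Clean transform-domain coefficients with a zero-mean Gaussian prior,
# observed under additive Gaussian noise (a one-component simplification
# of the Gaussian-mixture prior used in the paper).
sig_s, sig_n = 2.0, 1.0
clean = rng.normal(0.0, sig_s, 10000)
noisy = clean + rng.normal(0.0, sig_n, 10000)

# MAP estimator (here equal to the posterior mean): shrink toward zero by
# the signal-to-(signal+noise) variance ratio.
shrink = sig_s**2 / (sig_s**2 + sig_n**2)
denoised = shrink * noisy

mse_noisy = np.mean((noisy - clean) ** 2)        # about sig_n^2 = 1.0
mse_denoised = np.mean((denoised - clean) ** 2)  # about shrink * sig_n^2 = 0.8
```

With a mixture prior, the shrinkage factor becomes coefficient-dependent through the component responsibilities, which is why the mixture parameters must be estimated by EM first.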

11.
Adaptive sparseness for supervised learning
The goal of supervised learning is to infer a functional mapping based on a set of training examples. To achieve good generalization, it is necessary to control the "complexity" of the learned function. In Bayesian approaches, this is done by adopting a prior for the parameters of the function being learned. We propose a Bayesian approach to supervised learning, which leads to sparse solutions; that is, in which irrelevant parameters are automatically set exactly to zero. Other ways to obtain sparse classifiers (such as Laplacian priors, support vector machines) involve (hyper)parameters which control the degree of sparseness of the resulting classifiers; these parameters have to be somehow adjusted/estimated from the training data. In contrast, our approach does not involve any (hyper)parameters to be adjusted or estimated. This is achieved by a hierarchical-Bayes interpretation of the Laplacian prior, which is then modified by the adoption of a Jeffreys' noninformative hyperprior. Implementation is carried out by an expectation-maximization (EM) algorithm. Experiments with several benchmark data sets show that the proposed approach yields state-of-the-art performance. In particular, our method outperforms SVMs and performs competitively with the best alternative techniques, although it involves no tuning or adjustment of sparseness-controlling hyperparameters.
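Under the hierarchical-Bayes view with a Jeffreys hyperprior, the EM iteration can be sketched (for linear regression rather than classification) as iteratively reweighted ridge with per-weight penalties proportional to 1/w_i², which drives irrelevant weights to zero; the data, noise constant `sigma2`, and threshold below are illustrative, not the paper's settings:

```python
import numpy as np

rng = np.random.default_rng(5)

# Sparse linear model: only 2 of 10 coefficients are nonzero.
n, d = 100, 10
X = rng.standard_normal((n, d))
w_true = np.zeros(d)
w_true[0], w_true[3] = 2.0, -1.5
y = X @ w_true + 0.1 * rng.standard_normal(n)

# EM sketch: each E-step yields a per-weight penalty sigma2 / w_i^2, so the
# M-step is a reweighted ridge solve; small weights get ever larger penalties
# and collapse to zero, with no sparseness hyperparameter to tune.
sigma2 = 0.05
w = np.linalg.lstsq(X, y, rcond=None)[0]
for _ in range(100):
    lam = sigma2 / np.maximum(w**2, 1e-12)
    w = np.linalg.solve(X.T @ X + np.diag(lam), X.T @ y)

w[np.abs(w) < 1e-4] = 0.0  # snap numerically tiny weights to exact zero
```

The self-reinforcing penalty is the mechanism behind "automatic" sparseness: no cross-validated regularization constant appears anywhere in the loop.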

12.
This paper presents a new model based on statistical and variational methods for non-rigid image registration. It can be viewed as an improvement of the intensity-based model whose dissimilarity term is based on minimizing the so-called sum of squared differences (SSD). In the proposed model, the residue of the two images is assumed to be described by a mixture of Gaussian distributions, and features of variational regularization methods and the expectation-maximization (EM) algorithm are incorporated into the new model. The novelty is the introduction of two weighting functions and several control parameters in the dissimilarity term. The weighting functions automatically and effectively identify low- and high-contrast objects of the residue, and the control parameters improve the robustness of the model to the choice of regularization parameters. Through the introduced parameters and weighting functions, the algorithm can locally adjust the behavior of the deformation in regions of different contrast. Numerical experiments on 2D synthetic and 3D MR brain images demonstrate the efficiency and accuracy of the proposed approach compared with other methods.

13.
Qinghua  Jie  Yue   《Neurocomputing》2007,70(16-18):3063
In this paper, we propose to use a variational Bayesian (VB) method to learn the clean speech signal directly from the noisy observation. It models the probability distribution of the clean signal using a Gaussian mixture model (GMM) and minimizes the misfit between the true probability distributions of the hidden variables and model parameters and their approximate distributions. Experimental results demonstrate that the proposed algorithm performs better than several other methods.

14.
Variational Bayesian extreme learning machine
Extreme learning machine (ELM) randomly generates the parameters of the hidden nodes and then analytically determines the output weights, giving fast learning speed. The ill-posedness of the hidden-node parameter matrix directly causes unstable performance, and automatic selection of the hidden nodes is critical to preserving the efficiency of ELM. Focusing on these two problems, this paper proposes the variational Bayesian extreme learning machine (VBELM). First, a Bayesian probabilistic model is introduced into ELM, where the Bayesian prior distribution avoids the ill-posedness of the hidden-node matrix. Then, variational approximate inference is employed in the Bayesian model to compute the posterior distribution and the independent variational hyperparameters, which can be used to select the hidden nodes automatically. Theoretical analysis and experimental results show that VBELM achieves more stable performance with more compact architectures, provides probabilistic predictions in contrast to traditional point predictions, and supplies a hyperparameter criterion for hidden-node selection.
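A bare-bones ELM with a fixed ridge penalty on the output weights shows the closed-form readout that VBELM replaces with variational posterior inference over `beta` and per-node hyperparameters (all sizes and constants below are illustrative):

```python
import numpy as np

rng = np.random.default_rng(6)

# Toy regression: learn y = sin(x) on [-3, 3].
x = np.linspace(-3, 3, 200)[:, None]
y = np.sin(x).ravel()

# ELM: a random, untrained hidden layer...
n_hidden = 50
W, b = rng.standard_normal((1, n_hidden)), rng.standard_normal(n_hidden)
H = np.tanh(x @ W + b)

# ...with the output weights determined analytically. The fixed ridge term
# lam is a stand-in for the Bayesian prior in VBELM, which would instead be
# inferred (per node) and used to prune irrelevant hidden nodes.
lam = 1e-6
beta = np.linalg.solve(H.T @ H + lam * np.eye(n_hidden), H.T @ y)

pred = H @ beta
mse = np.mean((pred - y) ** 2)
```

Note that when `H.T @ H` is nearly singular — the ill-posedness the abstract refers to — the solve is unstable for `lam → 0`, which is exactly what a prior over `beta` repairs.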

15.
曲寒冰, 陈曦, 王松涛, 于明. 《自动化学报》, 2015, 41(8): 1482-1494
This paper establishes a Bayesian model framework for linear matching between two point sets and uses variational Bayesian approximation to estimate the affine parameters mapping the model point set to the scene point set. The model describes the relations among the mapping parameters, the hidden variables, and the model and scene point sets with a directed graph, based on which an iterative algorithm for estimating the posterior probabilities of the parameters and variables is given. In addition, the model uses a Gaussian model with an anisotropic covariance matrix to estimate and reason about outliers in the scene point set. Experimental results show that the model achieves good robustness and matching accuracy.

16.
We propose a new method for general Gaussian kernel hyperparameter optimization for support vector machines classification. The hyperparameters are constrained to lie on a differentiable manifold. The proposed optimization technique is based on a gradient-like descent algorithm adapted to the geometrical structure of the manifold of symmetric positive-definite matrices. We compare the performance of our approach with the classical support vector machine for classification and with other methods of the state of the art on toy data and on real world data sets.
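A sketch of one gradient step adapted to the SPD manifold, using the exponential map so the iterate stays symmetric positive-definite by construction; the Frobenius-distance objective below is an illustrative stand-in for the paper's SVM model-selection criterion:

```python
import numpy as np

def sym_fun(S, fun):
    # Apply a scalar function to a symmetric matrix via eigendecomposition.
    w, V = np.linalg.eigh(S)
    return (V * fun(w)) @ V.T

# Illustrative objective f(M) = ||M - A||_F^2 over SPD matrices M, with
# Euclidean gradient 2 (M - A); A is an arbitrary SPD target.
A = np.array([[2.0, 0.5], [0.5, 1.0]])
M = np.eye(2)

eta = 0.1
for _ in range(100):
    G = 2.0 * (M - A)
    # Riemannian step via the exponential map on the SPD manifold:
    #   M <- M^{1/2} expm(-eta * M^{-1/2} G M^{-1/2}) M^{1/2}
    R = sym_fun(M, np.sqrt)      # M^{1/2}
    Ri = np.linalg.inv(R)        # M^{-1/2}
    M = R @ sym_fun(-eta * Ri @ G @ Ri, np.exp) @ R

err = np.linalg.norm(M - A)
```

Unlike a plain Euclidean update M - eta * G, which can leave the SPD cone, this step is well defined for any step size, which is the practical payoff of respecting the manifold geometry.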

17.
On classification with incomplete data
We address the incomplete-data problem, in which the feature vectors to be classified are missing data (features). A (supervised) logistic regression algorithm for the classification of incomplete data is developed. Single or multiple imputation of the missing data is avoided by performing analytic integration with an estimated conditional density function (conditioned on the observed data). Conditional density functions are estimated using a Gaussian mixture model (GMM), with parameter estimation performed using both expectation-maximization (EM) and variational Bayesian EM (VB-EM). The proposed supervised algorithm is then extended to the semisupervised case by incorporating graph-based regularization. The semisupervised algorithm utilizes all available data, both incomplete and complete, as well as labeled and unlabeled. Experimental results of the proposed classification algorithms are presented.
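The analytic integration rests on the Gaussian conditional density of the missing features given the observed ones; with a one-component GMM the conditional mean has the familiar closed form (the data-generating matrix, feature indices, and test vector below are illustrative):

```python
import numpy as np

rng = np.random.default_rng(7)

# Fit a single Gaussian (a one-component GMM) to complete training data.
n = 2000
A = rng.standard_normal((3, 3))
X = rng.standard_normal((n, 3)) @ A.T
mu, Sigma = X.mean(axis=0), np.cov(X.T)

# For a test vector with feature 2 missing, the conditional density of the
# missing part given the observed part is Gaussian with mean
#   mu_m + Sigma_mo Sigma_oo^{-1} (x_o - mu_o).
# Integrating the classifier against this density replaces imputation;
# here we compute just that conditional mean.
obs, mis = [0, 1], [2]
x_o = np.array([1.0, -0.5])
cond_mean = mu[mis] + Sigma[np.ix_(mis, obs)] @ np.linalg.solve(
    Sigma[np.ix_(obs, obs)], x_o - mu[obs]
)
```

With a full GMM, the same formula is applied per component and the results are averaged under the component responsibilities given the observed features.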

18.
Bayesian Gaussian process classification with the EM-EP algorithm
Gaussian process classifiers (GPCs) are Bayesian probabilistic kernel classifiers. In GPCs, the probability of belonging to a certain class at an input location is monotonically related to the value of some latent function at that location. Starting from a Gaussian process prior over this latent function, data are used to infer both the posterior over the latent function and the values of the hyperparameters that determine various aspects of the function. Recently, the expectation propagation (EP) approach has been proposed to infer the posterior over the latent function. Building on this work, we present an approximate EM algorithm, the EM-EP algorithm, to learn both the latent function and the hyperparameters. This algorithm is found to converge in practice and provides an efficient Bayesian framework for learning the hyperparameters of the kernel. A multiclass extension of the EM-EP algorithm for GPCs is also derived. In the experimental results, the EM-EP algorithms are as good as or better than other methods for GPCs or support vector machines (SVMs) with cross-validation.

19.
Angular-margin loss functions often make training unstable because their hyperparameters must be tuned by hand, and differences in the number of class labels hurt the portability of the algorithm. To address these problems, an adaptive angular-margin loss function with a lower-bound test is proposed and applied to face recognition. Starting from the assumption that face features are distributed on a hypersphere, the method analyzes the influence of different hyperparameters on the training result, sets the second derivative of the predicted-probability formula to zero, and dynamically computes a truncated mean of the angle distribution of the current mini-batch. To improve portability, a lower bound for the adaptively tuned hyperparameter is derived from the minimum expected posterior probability of the class centers. Evaluation on the LFW and MegaFace million-scale face datasets shows that the proposed method effectively improves face recognition accuracy and model convergence rate, and experiments on an Asian face dataset show that the method has good robustness and portability.

20.
Blind source separation (BSS) has attracted much attention in the signal processing community due to its "blind" property and wide range of applications. However, some open problems remain, such as underdetermined BSS and noisy BSS. In this paper, we propose a Bayesian approach that improves the separation performance for instantaneous mixtures of non-stationary sources by taking into account the internal organization of the sources. A Gaussian mixture model (GMM) is used to model the distribution of the source signals, and a continuous-density hidden Markov model (CDHMM) is derived to track the non-stationarity inside the source signals: the sources can switch between several states, which significantly improves separation performance. An expectation-maximization (EM) algorithm is derived to estimate the mixing coefficients, the CDHMM parameters, and the noise covariance, and the source signals are recovered via a maximum a posteriori (MAP) approach. To ensure the convergence of the proposed algorithm, proper conjugate prior densities are assigned to the estimated coefficients to incorporate prior information, and an initialization scheme for the estimates is discussed. Systematic simulations illustrate the performance of the proposed algorithm. The results show that it separates more robustly, in terms of similarity score, than classical BSS algorithms in noisy determined mixtures; moreover, since the mixing matrix and the sources are estimated jointly, the proposed EM algorithm also works well in the underdetermined case, and it converges quickly with proper initialization.
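For a known mixing matrix and Gaussian source priors, the MAP source estimate used in the recovery step has a closed form; this sketch covers only that step (the full algorithm also re-estimates A, the noise covariance, and the GMM/CDHMM state sequence, and the sizes and precisions below are illustrative):

```python
import numpy as np

rng = np.random.default_rng(8)

# Noisy instantaneous mixture y = A s + n, with 3 sensors and 2 sources.
A = rng.standard_normal((3, 2))
prec = np.array([1.0, 4.0])  # prior precisions of the two Gaussian sources
s = rng.standard_normal((2, 1000)) / np.sqrt(prec)[:, None]
sigma = 0.1
y = A @ s + sigma * rng.standard_normal((3, 1000))

# MAP recovery of the sources given A, the noise level, and the priors:
#   s_hat = (A' A / sigma^2 + diag(prec))^{-1} A' y / sigma^2
P = A.T @ A / sigma**2 + np.diag(prec)
s_hat = np.linalg.solve(P, A.T @ y / sigma**2)

mse = np.mean((s_hat - s) ** 2)
```

In the full GMM/CDHMM model, `prec` becomes state-dependent, so the same linear MAP solve is applied per hidden state and weighted by the state posteriors from the HMM forward-backward pass.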
