20 similar documents found (search time: 0 ms)
1.
Guido Consonni, Computational Statistics & Data Analysis, 2007, 52(2): 790-798
The ill-posed nature of missing variable models offers a challenging testing ground for new computational techniques. This is the case for mean-field variational Bayesian inference, whose behavior is illustrated here in the setting of the Bayesian probit model. It is shown that the mean-field variational method always underestimates the posterior variance and that, for small sample sizes, the mean-field variational approximation to the posterior location can be poor.
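The variance underestimation noted in this abstract can be seen in a minimal sketch: for a correlated bivariate Gaussian standing in for the posterior (an illustrative assumption, not the probit model studied in the paper), the optimal fully factorized approximation has marginal variance 1/Λ_ii, where Λ is the precision matrix, which is always at most the true marginal variance.

```python
import numpy as np

def mean_field_gaussian_vars(Sigma):
    """Optimal mean-field variances for a zero-mean Gaussian target.

    For a fully factorized q(x) = prod_i q_i(x_i), the optimal q_i is
    Gaussian with variance 1 / Lambda_ii, where Lambda = inv(Sigma).
    """
    Lambda = np.linalg.inv(Sigma)
    return 1.0 / np.diag(Lambda)

rho = 0.9
Sigma = np.array([[1.0, rho], [rho, 1.0]])
mf_var = mean_field_gaussian_vars(Sigma)  # each equals 1 - rho**2 = 0.19
true_var = np.diag(Sigma)                 # true marginal variances: 1.0
# The factorized approximation shrinks the marginals by the factor (1 - rho^2),
# mirroring the systematic underestimation of posterior spread reported above.
```

The stronger the correlation, the worse the underestimation, which is why mean-field credible intervals tend to be too narrow.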
2.
Considering latent heterogeneity is of special importance in nonlinear models in order to gauge correctly the effect of explanatory variables on the dependent variable. A stratified model-based clustering approach is adapted for modeling latent heterogeneity in binary panel probit models. Within a Bayesian framework, an estimation algorithm dealing with the inherent label switching problem is provided. Determination of the number of clusters is based on the marginal likelihood and a cross-validation approach. A simulation study is conducted to assess the ability of both approaches to determine the correct number of clusters, indicating high accuracy for the marginal likelihood criterion, with the cross-validation approach performing similarly well in most circumstances. Different concepts of marginal effects, incorporating latent heterogeneity to different degrees, arise within the considered model setup and are directly at hand within Bayesian estimation via MCMC methodology. An empirical illustration indicates that modeling latent heterogeneity via latent clusters provides the preferred model specification over both a pooled and a random coefficient specification.
3.
4.
John T. Ormerod, Computational Statistics & Data Analysis, 2011, 55(1): 45-56
Variational methods for approximate Bayesian inference provide fast, flexible, deterministic alternatives to Monte Carlo methods. Unfortunately, unlike Monte Carlo methods, variational approximations cannot, in general, be made to be arbitrarily accurate. This paper develops grid-based variational approximations which endeavor to approximate marginal posterior densities in a spirit similar to the Integrated Nested Laplace Approximation (INLA) of Rue et al. (2009) but which may be applied in situations where INLA cannot be used. The method can greatly increase the accuracy of a base variational approximation, although not in general to arbitrary accuracy. The methodology developed is at least reasonably accurate on all of the examples considered in the paper.
5.
In recent years, variational Bayesian learning has been used as an approximation of Bayesian learning. In spite of its computational tractability and good generalization in many applications, its statistical properties have yet to be clarified. In this paper, we focus on variational Bayesian learning of Bayesian networks, which are widely used in information processing and uncertain artificial intelligence. We derive upper bounds on the asymptotic variational free energy, or stochastic complexity, of bipartite Bayesian networks with discrete hidden variables. Our result theoretically supports the effectiveness of variational Bayesian learning as an approximation of Bayesian learning.
6.
We prove that the evaluation function of variational Bayesian (VB) clustering algorithms can be described as the log likelihood of the given data minus the Kullback–Leibler (KL) divergence between the prior and the posterior of the model parameters. In this novel formalism of VB, the evaluation functions can be explicitly interpreted as information criteria for model selection, and the KL divergence imposes a heavy penalty on posteriors far from the prior. We derive the update process of variational Bayesian clustering with a finite mixture of Student's t-distributions, taking into account the penalty term for the degrees of freedom.
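The decomposition described in this abstract (expected log likelihood minus a prior-to-posterior KL penalty) can be checked numerically in a conjugate toy model where everything is available in closed form. The sketch below uses a Beta-Bernoulli model (an illustrative assumption; the paper works with t-mixtures): when q is the exact posterior, the VB evaluation function recovers the log marginal likelihood exactly.

```python
from scipy.special import betaln, psi

# Beta(a0, b0) prior on a Bernoulli success probability; data: k successes in n trials.
a0, b0 = 2.0, 2.0
n, k = 20, 7
a, b = a0 + k, b0 + n - k          # exact posterior is Beta(a, b)

# E_q[log theta] = psi(a) - psi(a+b); similarly for log(1 - theta).
exp_loglik = k * (psi(a) - psi(a + b)) + (n - k) * (psi(b) - psi(a + b))

# Closed-form KL( Beta(a,b) || Beta(a0,b0) ): the penalty on posteriors far from the prior.
kl = (betaln(a0, b0) - betaln(a, b)
      + (a - a0) * psi(a) + (b - b0) * psi(b)
      - (a - a0 + b - b0) * psi(a + b))

elbo = exp_loglik - kl                          # the VB evaluation function
log_marginal = betaln(a, b) - betaln(a0, b0)    # exact log p(D) for the ordered sequence
```

With a non-conjugate approximating family, `elbo` would fall strictly below `log_marginal`, which is what makes the evaluation function usable as a model selection criterion.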
7.
The Bayesian implementation of finite mixtures of distributions has been an area of considerable interest in the literature. Computational advances in approximation techniques such as Markov chain Monte Carlo (MCMC) methods have been a keystone of the Bayesian analysis of mixture models. This paper deals with the Bayesian analysis of finite mixtures of two particular types of multidimensional distributions: the multinomial and the negative-multinomial. A unified framework addressing the main topics of a Bayesian analysis is developed for the case with a known number of component distributions. In particular, theoretical results and algorithms to solve the label-switching problem are provided. An illustrative example shows that the proposed techniques are easily applied in practice.
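To make the label-switching problem concrete: mixture components are exchangeable, so MCMC draws can silently permute component labels between iterations. One common remedy (an identifiability ordering constraint; this is a generic strategy, not necessarily the algorithm proposed in the paper, and the draws below are hypothetical) relabels each draw so a chosen parameter is sorted, permuting all component-specific parameters consistently.

```python
import numpy as np

# Hypothetical MCMC output for a 2-component mixture: per-iteration component
# means and weights, with a label switch at iteration 1.
means = np.array([[0.70, 0.20],
                  [0.21, 0.69],
                  [0.68, 0.19]])
weights = np.array([[0.4, 0.6],
                    [0.6, 0.4],
                    [0.4, 0.6]])

# Impose mu_1 < mu_2 draw by draw, carrying the weights along with the same permutation.
order = np.argsort(means, axis=1)
means_r = np.take_along_axis(means, order, axis=1)
weights_r = np.take_along_axis(weights, order, axis=1)
# Posterior summaries (e.g. column means) are now computed on aligned labels.
```

Without such an alignment step, averaging the raw columns would mix draws from different components and produce meaningless posterior summaries.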
8.
A sigmoid Bayesian network is a Bayesian network in which a conditional probability is a sigmoid function of the weights of the relevant arcs. Its application domain includes that of the Boltzmann machine as well as traditional decision problems. In this paper we show that the node reduction method, an inference algorithm for general Bayesian networks, can also be used on sigmoid Bayesian networks, and we propose a hybrid inference method combining node reduction and Gibbs sampling. The time efficiency of sampling after node reduction is demonstrated through experiments. These results bring sigmoid Bayesian networks closer to large-scale applications.
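The defining conditional of a sigmoid Bayesian network can be sketched in a few lines. The network, weights, and biases below are invented for illustration (the paper's hybrid algorithm additionally applies node reduction before Gibbs sampling, which is not shown here); the sketch only demonstrates the sigmoid conditional and ancestral sampling from it.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Tiny sigmoid Bayesian network: A -> C <- B (hypothetical weights and biases).
# P(node = 1 | parents) = sigmoid(bias + sum_j w_j * parent_j)
weights = {"A": {}, "B": {}, "C": {"A": 2.0, "B": -1.0}}
bias = {"A": 0.0, "B": 0.5, "C": -0.5}

def conditional(node, state):
    """Probability that `node` takes value 1 given its parents' values in `state`."""
    z = bias[node] + sum(w * state[p] for p, w in weights[node].items())
    return sigmoid(z)

def forward_sample(rng):
    """Draw one joint configuration by sampling nodes in topological order."""
    state = {}
    for node in ("A", "B", "C"):
        state[node] = 1 if rng.random() < conditional(node, state) else 0
    return state

rng = random.Random(0)
samples = [forward_sample(rng) for _ in range(5000)]
p_c = sum(s["C"] for s in samples) / len(samples)  # Monte Carlo estimate of P(C = 1)
```

A Gibbs sampler would instead resample one node at a time from its conditional given the Markov blanket, which is where reducing nodes first pays off in sampling time.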
9.
Variational methods, which have become popular in the neural computing/machine learning literature, are applied to the Bayesian analysis of mixtures of Gaussian distributions. It is also shown how the deviance information criterion (DIC) can be extended to these types of model by exploiting variational approximations. The use of variational methods for model selection and the calculation of a DIC are illustrated with real and simulated data. The variational approach allows the simultaneous estimation of the component parameters and the model complexity. It is found that initial selection of a large number of components results in superfluous components being eliminated as the method converges to a solution, which corresponds to an automatic choice of model complexity. The appropriateness of this choice is reflected in the DIC values.
10.
In statistical modeling, parameter estimation is an essential and challenging task. Estimation of the parameters of the Dirichlet mixture model (DMM) is analytically intractable, due to the integral expressions of the gamma function and its corresponding derivatives. We introduce a Bayesian strategy to estimate the posterior distribution of the parameters of the DMM. By assuming a gamma distribution as the prior of each parameter, we approximate both the prior and the posterior distribution of the parameters with a product of mutually independent gamma distributions. The extended factorized approximation method is applied to introduce a single lower bound to the variational objective function, and an analytically tractable estimation solution is derived. Moreover, only one function is maximized during iterations, so the convergence of the proposed algorithm is theoretically guaranteed. On synthesized data, the proposed method shows advantages over the EM-based method and a previously proposed Bayesian estimation method. Its good performance is also demonstrated in two important multimedia signal processing applications.
11.
Nizar Bouguila, Pattern Recognition, 2011, 44(6): 1183-1200
Recently, hybrid generative-discriminative approaches have emerged as an efficient knowledge representation and data classification engine. However, little attention has been devoted to the modeling and classification of non-Gaussian, and especially proportional, vectors. Our main goal in this paper is to discover the true structure of this kind of data by building probabilistic kernels from generative mixture models based on the Liouville family, from which we develop the Beta-Liouville distribution, which includes the well-known Dirichlet as a special case. The Beta-Liouville has a more general covariance structure than the Dirichlet, which makes it more practical and useful. Our learning technique is based on a principled, purely Bayesian approach, and the resulting models are used to generate support vector machine (SVM) probabilistic kernels based on information divergence. In particular, we show the existence of closed-form expressions for the Kullback-Leibler and Rényi divergences between two Beta-Liouville distributions, and hence between two Dirichlet distributions as a special case. Through extensive simulations and a number of experiments involving synthetic data, visual scenes and texture image classification, we demonstrate the effectiveness of the proposed approaches.
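For the Dirichlet special case mentioned in this abstract, the closed-form KL divergence is standard and compact enough to sketch (the Beta-Liouville generalization from the paper is not reproduced here; the parameter values are illustrative).

```python
import numpy as np
from scipy.special import gammaln, psi

def kl_dirichlet(alpha, beta):
    """Closed-form KL divergence KL( Dir(alpha) || Dir(beta) )."""
    alpha = np.asarray(alpha, dtype=float)
    beta = np.asarray(beta, dtype=float)
    a0, b0 = alpha.sum(), beta.sum()
    return (gammaln(a0) - gammaln(alpha).sum()
            - gammaln(b0) + gammaln(beta).sum()
            + ((alpha - beta) * (psi(alpha) - psi(a0))).sum())

a = np.array([2.0, 3.0, 4.0])   # hypothetical concentration parameters
b = np.array([1.0, 1.0, 1.0])
kl_ab = kl_dirichlet(a, b)      # positive for distinct distributions
kl_aa = kl_dirichlet(a, a)      # zero
```

Plugging such divergences into a kernel (e.g. exp(-KL)) is what turns the fitted generative mixtures into SVM kernels in the hybrid scheme described above.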
12.
Helge Langseth, Thomas D. Nielsen, Pattern Recognition, 2009, 42(11): 2724-2736
One of the simplest, yet most consistently well-performing, classes of classifiers is the naïve Bayes model (a special class of Bayesian network models). However, these models rely on the (naïve) assumption that all the attributes used to describe an instance are conditionally independent given the class of that instance. To relax this independence assumption, we have in previous work proposed a family of models called latent classification models (LCMs). LCMs are defined for continuous domains and generalize the naïve Bayes model by using latent variables to model class-conditional dependencies between the attributes. In addition to providing good classification accuracy, the LCM has several appealing properties, including a relatively small parameter space making it less susceptible to over-fitting. In this paper we take a first step towards generalizing LCMs to hybrid domains by proposing an LCM for domains with binary attributes. We present algorithms for learning the proposed model, and we describe a variational approximation-based inference procedure. Finally, we empirically compare the accuracy of the proposed model to that of other classifiers for a number of different domains, including the problem of recognizing symbols in black and white images.
13.
The problem of statistical inference from a Bayesian standpoint is studied for the multitype Galton-Watson branching process within a non-parametric framework. The only data assumed to be available are each generation's population size vectors. The Gibbs sampler is used to estimate the posterior distributions of the main parameters of the model and the predictive distributions for as-yet unobserved generations. The algorithm provided is independent of whether or not the process becomes extinct. The method is illustrated with simulated examples.
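The data structure assumed in this abstract (generation-by-generation population size vectors) is easy to simulate. The sketch below assumes Poisson offspring counts purely for concreteness, with a made-up mean offspring matrix; the paper itself is non-parametric and only requires the observed vectors.

```python
import numpy as np

def simulate_mtgw(M, z0, n_gen, rng):
    """Generation-size vectors Z_0, ..., Z_{n_gen} of a multitype
    Galton-Watson process with Poisson offspring (simplifying assumption).

    M[i][j] is the mean number of type-j offspring per type-i parent, so
    the sum of Z_i independent Poisson(M[i][j]) counts is Poisson((Z @ M)[j]).
    """
    Z = [np.asarray(z0)]
    for _ in range(n_gen):
        Z.append(rng.poisson(Z[-1] @ M))
    return Z

M = np.array([[0.6, 0.5],
              [0.4, 0.7]])                 # hypothetical mean offspring matrix
Z = simulate_mtgw(M, [10, 10], 8, np.random.default_rng(0))
# Z is the list of observed generation vectors a Gibbs sampler would condition on.
```

Data of exactly this shape, whether the trajectory dies out or explodes, is what the posterior and predictive computations condition on.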
14.
Probabilistic models such as probabilistic principal component analysis (PPCA) have recently attracted much attention in the process monitoring area. An important issue with the PPCA method is how to determine the dimensionality of the latent variable space. In the present paper, one of the most popular Bayesian chemometric methods, Bayesian PCA (BPCA), is introduced for process monitoring, based on the recently developed variational inference algorithm. In this monitoring framework, the effectiveness of each extracted latent variable is reflected by a hyperparameter, upon which the dimensionality of the latent variable space can be automatically determined. Meanwhile, for practical purposes, the developed BPCA-based monitoring method is robust to missing data and also gives satisfactory performance with limited data samples. Another contribution of this paper is a new fault reconstruction method under the BPCA model structure. Two case studies are provided to evaluate the performance of the proposed method.
15.
Xin Zhao, Carl John Scarrott, Les Oxley, Marco Reale, Mathematics and Computers in Simulation, 2011, 81(7): 1430-1440
Extreme value methods are widely used in financial applications such as risk analysis, forecasting and pricing models. One of the challenges in applying them to finance is accounting for the temporal dependence between observations, for example the stylised fact that financial time series exhibit volatility clustering. Various approaches have been proposed to capture this dependence. Commonly a two-stage approach is taken, where the volatility dependence is removed using a volatility model such as a GARCH (or one of its many incarnations), followed by application of standard extreme value models to the assumed independent residual innovations. This study examines an alternative one-stage approach, which makes parameter estimation and accounting for the associated uncertainties more straightforward than the two-stage approach. The location and scale parameters of the extreme value distribution are defined to follow a conditional autoregressive heteroscedasticity process; essentially, the model implements GARCH volatility via the extreme value model parameters. Bayesian inference is used and implemented via Markov chain Monte Carlo to permit all sources of uncertainty to be accounted for. The model is applied to both simulated and empirical data to demonstrate performance in extrapolating the extreme quantiles and quantifying the associated uncertainty.
16.
In this paper, we propose to use a variational Bayesian (VB) method to learn the clean speech signal directly from the noisy observation. It models the probability distribution of the clean signal using a Gaussian mixture model (GMM) and minimizes the misfit between the true probability distributions of the hidden variables and model parameters and their approximate distributions. Experimental results demonstrate that the proposed algorithm outperforms several competing methods.
17.
18.
Markov chain Monte Carlo (MCMC) techniques revolutionized statistical practice in the 1990s by providing an essential toolkit for making the rigor and flexibility of Bayesian analysis computationally practical. At the same time, the increasing prevalence of massive datasets and the expansion of the field of data mining have created the need for statistically sound methods that scale to these large problems. Except for the most trivial examples, current MCMC methods require a complete scan of the dataset for each iteration, eliminating their candidacy as feasible data mining techniques. In this article we present a method for making Bayesian analysis of massive datasets computationally feasible. The algorithm simulates from a posterior distribution that conditions on a smaller, more manageable portion of the dataset. The remainder of the dataset may be incorporated by reweighting the initial draws using importance sampling; computation of the importance weights requires a single scan of the remaining observations. While importance sampling increases efficiency in data access, it comes at the expense of estimation efficiency. A simple modification, based on the rejuvenation step used in particle filters for dynamic systems models, sidesteps the loss of efficiency with only a slight increase in the number of data accesses. To show proof of concept, we demonstrate the method on two examples. The first is a mixture of transition models that has been used to model web traffic and robotics; for this example we show that estimation efficiency is not affected while offering a 99% reduction in data accesses. The second example applies the method to Bayesian logistic regression and yields a 98% reduction in data accesses.
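The subset-then-reweight idea in this abstract can be sketched in a conjugate toy setting (a Bernoulli model with a Beta prior; the model, chunk size, and draw counts are illustrative choices, and the rejuvenation step is omitted): draw from the posterior conditioned on a manageable chunk, then fold in the rest of the data with importance weights computed in a single pass.

```python
import math
import random

random.seed(1)
# Synthetic "massive" Bernoulli dataset with success probability 0.3.
data = [1 if random.random() < 0.3 else 0 for _ in range(10000)]
chunk, rest = data[:500], data[500:]

# Stage 1: sample from the posterior conditioned on the chunk only
# (Beta(1, 1) prior -> Beta(1 + successes, 1 + failures) posterior).
a = 1 + sum(chunk)
b = 1 + len(chunk) - sum(chunk)
draws = [random.betavariate(a, b) for _ in range(2000)]

# Stage 2: one scan of the remaining data yields the importance weights,
# w(theta) proportional to the likelihood of the unseen observations.
k, m = sum(rest), len(rest)
log_w = [k * math.log(t) + (m - k) * math.log(1 - t) for t in draws]
mx = max(log_w)
w = [math.exp(lw - mx) for lw in log_w]   # stabilized in log space

post_mean = sum(wi * t for wi, t in zip(w, draws)) / sum(w)
full_mean = (1 + sum(data)) / (2 + len(data))   # exact full-data posterior mean
```

The weighted draws approximate the full-data posterior while the remaining 9,500 observations are touched exactly once, which is the data-access saving the article quantifies.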
19.
Learning Hidden Variables in Hybrid Bayesian Networks (cited 6 times: 0 self-citations, 6 by others)
At present, hidden-variable learning with known structure mainly targets Bayesian networks with discrete variables and Gaussian networks with continuous variables. This paper presents a method for learning hidden variables in hybrid Bayesian networks containing both continuous and discrete variables. The method does not require discretizing the continuous variables: it locates hidden variables using domain knowledge or the dimension of the cliques in the network's moral graph, determines their values using a dependency structure (star-shaped or prior structure) together with Gibbs sampling, and finds the optimal cardinality of a hidden variable by combining an extended MDL criterion with statistical methods. Experimental results show that the method effectively learns hidden variables in hybrid Bayesian networks with known structure.
20.
Qian Ren, Andrew O. Finley, James S. Hodges, Computational Statistics & Data Analysis, 2011, 55(12): 3197-3217
With scientific data available at geocoded locations, investigators are increasingly turning to spatial process models for carrying out statistical inference. However, fitting spatial models often involves expensive matrix decompositions, whose computational complexity increases in cubic order with the number of spatial locations. This situation is aggravated in Bayesian settings, where such computations are required at every iteration of the Markov chain Monte Carlo (MCMC) algorithm. In this paper, we describe the use of variational Bayesian (VB) methods as an alternative to MCMC for approximating the posterior distributions of complex spatial models. Variational methods, which have been used extensively in Bayesian machine learning for several years, provide a lower bound on the marginal likelihood that can be computed efficiently. We provide results for the variational updates in several models, with particular emphasis on their use in multivariate spatial analysis. We demonstrate estimation and model comparison from VB methods using simulated data as well as environmental data sets, and compare them with inference from MCMC.