20 similar documents found (search time: 0 ms)
1.
Guido Consonni, Computational Statistics & Data Analysis, 2007, 52(2): 790-798
The ill-posed nature of missing variable models offers a challenging testing ground for new computational techniques. This is the case for mean-field variational Bayesian inference, whose behavior is illustrated here in the setting of the Bayesian probit model. It is shown that the mean-field variational method always underestimates the posterior variance and that, for small sample sizes, the mean-field variational approximation to the posterior location can be poor.
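The variance underestimation noted in this abstract can be seen in a minimal sketch: for a correlated bivariate Gaussian standing in for the posterior (an illustrative assumption, not the probit model studied in the paper), the optimal fully factorized approximation has marginal variance 1/Λ_ii, where Λ is the precision matrix, which is always at most the true marginal variance.

```python
import numpy as np

def mean_field_gaussian_vars(Sigma):
    """Optimal mean-field variances for a zero-mean Gaussian target.

    For a fully factorized q(x) = prod_i q_i(x_i), the optimal q_i is
    Gaussian with variance 1 / Lambda_ii, where Lambda = inv(Sigma).
    """
    Lambda = np.linalg.inv(Sigma)
    return 1.0 / np.diag(Lambda)

rho = 0.9
Sigma = np.array([[1.0, rho], [rho, 1.0]])
mf_var = mean_field_gaussian_vars(Sigma)  # each equals 1 - rho**2 = 0.19
true_var = np.diag(Sigma)                 # true marginal variances: 1.0
# The factorized approximation shrinks the marginals by the factor (1 - rho^2),
# mirroring the systematic underestimation of posterior spread reported above.
```

The stronger the correlation, the worse the underestimation, which is why mean-field credible intervals tend to be too narrow.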
2.
Considering latent heterogeneity is of special importance in nonlinear models in order to gauge correctly the effect of explanatory variables on the dependent variable. A stratified model-based clustering approach is adapted for modeling latent heterogeneity in binary panel probit models. Within a Bayesian framework, an estimation algorithm dealing with the inherent label switching problem is provided. Determination of the number of clusters is based on the marginal likelihood and a cross-validation approach. A simulation study is conducted to assess the ability of both approaches to determine the correct number of clusters, indicating high accuracy for the marginal likelihood criterion, with the cross-validation approach performing similarly well in most circumstances. Different concepts of marginal effects, incorporating latent heterogeneity to different degrees, arise within the considered model setup and are directly at hand within Bayesian estimation via MCMC methodology. An empirical illustration indicates that modeling latent heterogeneity via latent clusters provides the preferred model specification over both a pooled and a random coefficient specification.
3.
4.
John T. Ormerod, Computational Statistics & Data Analysis, 2011, 55(1): 45-56
Variational methods for approximate Bayesian inference provide fast, flexible, deterministic alternatives to Monte Carlo methods. Unfortunately, unlike Monte Carlo methods, variational approximations cannot, in general, be made to be arbitrarily accurate. This paper develops grid-based variational approximations which endeavor to approximate marginal posterior densities in a spirit similar to the Integrated Nested Laplace Approximation (INLA) of Rue et al. (2009) but which may be applied in situations where INLA cannot be used. The method can greatly increase the accuracy of a base variational approximation, although not in general to arbitrary accuracy. The methodology developed is at least reasonably accurate on all of the examples considered in the paper.
5.
In recent years, variational Bayesian learning has been used as an approximation of Bayesian learning. In spite of its computational tractability and good generalization in many applications, its statistical properties have yet to be clarified. In this paper, we focus on variational Bayesian learning of Bayesian networks, which are widely used in information processing and uncertain artificial intelligence. We derive upper bounds on the asymptotic variational free energy, or stochastic complexity, of bipartite Bayesian networks with discrete hidden variables. Our result theoretically supports the effectiveness of variational Bayesian learning as an approximation of Bayesian learning.
6.
We prove that the evaluation function of variational Bayesian (VB) clustering algorithms can be described as the log likelihood of the given data minus the Kullback–Leibler (KL) divergence between the prior and the posterior of the model parameters. In this novel formalism of VB, the evaluation functions can be explicitly interpreted as information criteria for model selection, and the KL divergence imposes a heavy penalty on posteriors far from the prior. We derive the update process of variational Bayesian clustering with a finite mixture of Student's t-distributions, taking into account the penalty term for the degrees of freedom.
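The decomposition described in this abstract (expected log likelihood minus a prior-to-posterior KL penalty) can be checked numerically in a conjugate toy model where everything is available in closed form. The sketch below uses a Beta-Bernoulli model (an illustrative assumption; the paper works with t-mixtures): when q is the exact posterior, the VB evaluation function recovers the log marginal likelihood exactly.

```python
from scipy.special import betaln, psi

# Beta(a0, b0) prior on a Bernoulli success probability; data: k successes in n trials.
a0, b0 = 2.0, 2.0
n, k = 20, 7
a, b = a0 + k, b0 + n - k          # exact posterior is Beta(a, b)

# E_q[log theta] = psi(a) - psi(a+b); similarly for log(1 - theta).
exp_loglik = k * (psi(a) - psi(a + b)) + (n - k) * (psi(b) - psi(a + b))

# Closed-form KL( Beta(a,b) || Beta(a0,b0) ): the penalty on posteriors far from the prior.
kl = (betaln(a0, b0) - betaln(a, b)
      + (a - a0) * psi(a) + (b - b0) * psi(b)
      - (a - a0 + b - b0) * psi(a + b))

elbo = exp_loglik - kl                          # the VB evaluation function
log_marginal = betaln(a, b) - betaln(a0, b0)    # exact log p(D) for the ordered sequence
```

With a non-conjugate approximating family, `elbo` would fall strictly below `log_marginal`, which is what makes the evaluation function usable as a model selection criterion.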
7.
The Bayesian implementation of finite mixtures of distributions has been an area of considerable interest in the literature. Computational advances in approximation techniques such as Markov chain Monte Carlo (MCMC) methods have been a keystone of the Bayesian analysis of mixture models. This paper deals with the Bayesian analysis of finite mixtures of two particular types of multidimensional distributions: the multinomial and the negative-multinomial. A unified framework addressing the main topics of a Bayesian analysis is developed for the case with a known number of component distributions. In particular, theoretical results and algorithms to solve the label-switching problem are provided. An illustrative example shows that the proposed techniques are easily applied in practice.
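To make the label-switching problem concrete: mixture components are exchangeable, so MCMC draws can silently permute component labels between iterations. One common remedy (an identifiability ordering constraint; this is a generic strategy, not necessarily the algorithm proposed in the paper, and the draws below are hypothetical) relabels each draw so a chosen parameter is sorted, permuting all component-specific parameters consistently.

```python
import numpy as np

# Hypothetical MCMC output for a 2-component mixture: per-iteration component
# means and weights, with a label switch at iteration 1.
means = np.array([[0.70, 0.20],
                  [0.21, 0.69],
                  [0.68, 0.19]])
weights = np.array([[0.4, 0.6],
                    [0.6, 0.4],
                    [0.4, 0.6]])

# Impose mu_1 < mu_2 draw by draw, carrying the weights along with the same permutation.
order = np.argsort(means, axis=1)
means_r = np.take_along_axis(means, order, axis=1)
weights_r = np.take_along_axis(weights, order, axis=1)
# Posterior summaries (e.g. column means) are now computed on aligned labels.
```

Without such an alignment step, averaging the raw columns would mix draws from different components and produce meaningless posterior summaries.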
8.
A sigmoid Bayesian network is a Bayesian network in which a conditional probability is a sigmoid function of the weights of the relevant arcs. Its application domain includes that of the Boltzmann machine as well as traditional decision problems. In this paper we show that the node reduction method, an inference algorithm for general Bayesian networks, can also be used on sigmoid Bayesian networks, and we propose a hybrid inference method combining node reduction and Gibbs sampling. The time efficiency of sampling after node reduction is demonstrated through experiments. These results bring sigmoid Bayesian networks closer to large-scale applications.
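The defining conditional of a sigmoid Bayesian network can be sketched in a few lines. The network, weights, and biases below are invented for illustration (the paper's hybrid algorithm additionally applies node reduction before Gibbs sampling, which is not shown here); the sketch only demonstrates the sigmoid conditional and ancestral sampling from it.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Tiny sigmoid Bayesian network: A -> C <- B (hypothetical weights and biases).
# P(node = 1 | parents) = sigmoid(bias + sum_j w_j * parent_j)
weights = {"A": {}, "B": {}, "C": {"A": 2.0, "B": -1.0}}
bias = {"A": 0.0, "B": 0.5, "C": -0.5}

def conditional(node, state):
    """Probability that `node` takes value 1 given its parents' values in `state`."""
    z = bias[node] + sum(w * state[p] for p, w in weights[node].items())
    return sigmoid(z)

def forward_sample(rng):
    """Draw one joint configuration by sampling nodes in topological order."""
    state = {}
    for node in ("A", "B", "C"):
        state[node] = 1 if rng.random() < conditional(node, state) else 0
    return state

rng = random.Random(0)
samples = [forward_sample(rng) for _ in range(5000)]
p_c = sum(s["C"] for s in samples) / len(samples)  # Monte Carlo estimate of P(C = 1)
```

A Gibbs sampler would instead resample one node at a time from its conditional given the Markov blanket, which is where reducing nodes first pays off in sampling time.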
9.
Variational methods, which have become popular in the neural computing/machine learning literature, are applied to the Bayesian analysis of mixtures of Gaussian distributions. It is also shown how the deviance information criterion (DIC) can be extended to these types of model by exploiting variational approximations. The use of variational methods for model selection and the calculation of a DIC are illustrated with real and simulated data. The variational approach allows the simultaneous estimation of the component parameters and the model complexity. It is found that initial selection of a large number of components results in superfluous components being eliminated as the method converges to a solution, which corresponds to an automatic choice of model complexity. The appropriateness of this choice is reflected in the DIC values.
10.
In statistical modeling, parameter estimation is an essential and challenging task. Estimation of the parameters of the Dirichlet mixture model (DMM) is analytically intractable, due to the integral expressions of the gamma function and its corresponding derivatives. We introduce a Bayesian strategy to estimate the posterior distribution of the parameters of the DMM. By assuming a gamma distribution as the prior of each parameter, we approximate both the prior and the posterior distribution of the parameters with a product of mutually independent gamma distributions. The extended factorized approximation method is applied to introduce a single lower bound to the variational objective function, and an analytically tractable estimation solution is derived. Moreover, only one function is maximized during iterations, so the convergence of the proposed algorithm is theoretically guaranteed. On synthesized data, the proposed method shows advantages over the EM-based method and a previously proposed Bayesian estimation method. Its good performance is also demonstrated in two important multimedia signal processing applications.
11.
Nizar Bouguila, Pattern Recognition, 2011, 44(6): 1183-1200
Recently, hybrid generative-discriminative approaches have emerged as an efficient knowledge representation and data classification engine. However, little attention has been devoted to the modeling and classification of non-Gaussian, and especially proportional, vectors. Our main goal in this paper is to discover the true structure of this kind of data by building probabilistic kernels from generative mixture models based on the Liouville family, from which we develop the Beta-Liouville distribution, which includes the well-known Dirichlet as a special case. The Beta-Liouville has a more general covariance structure than the Dirichlet, which makes it more practical and useful. Our learning technique is based on a principled, purely Bayesian approach, and the resulting models are used to generate support vector machine (SVM) probabilistic kernels based on information divergence. In particular, we show the existence of closed-form expressions for the Kullback-Leibler and Rényi divergences between two Beta-Liouville distributions, and hence between two Dirichlet distributions as a special case. Through extensive simulations and a number of experiments involving synthetic data, visual scenes and texture image classification, we demonstrate the effectiveness of the proposed approaches.
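For the Dirichlet special case mentioned in this abstract, the closed-form KL divergence is standard and compact enough to sketch (the Beta-Liouville generalization from the paper is not reproduced here; the parameter values are illustrative).

```python
import numpy as np
from scipy.special import gammaln, psi

def kl_dirichlet(alpha, beta):
    """Closed-form KL divergence KL( Dir(alpha) || Dir(beta) )."""
    alpha = np.asarray(alpha, dtype=float)
    beta = np.asarray(beta, dtype=float)
    a0, b0 = alpha.sum(), beta.sum()
    return (gammaln(a0) - gammaln(alpha).sum()
            - gammaln(b0) + gammaln(beta).sum()
            + ((alpha - beta) * (psi(alpha) - psi(a0))).sum())

a = np.array([2.0, 3.0, 4.0])   # hypothetical concentration parameters
b = np.array([1.0, 1.0, 1.0])
kl_ab = kl_dirichlet(a, b)      # positive for distinct distributions
kl_aa = kl_dirichlet(a, a)      # zero
```

Plugging such divergences into a kernel (e.g. exp(-KL)) is what turns the fitted generative mixtures into SVM kernels in the hybrid scheme described above.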
12.
Helge Langseth, Thomas D. Nielsen, Pattern Recognition, 2009, 42(11): 2724-2736
One of the simplest, yet most consistently well-performing, classes of classifiers is the naïve Bayes model (a special class of Bayesian network models). However, these models rely on the (naïve) assumption that all the attributes used to describe an instance are conditionally independent given the class of that instance. To relax this independence assumption, we have in previous work proposed a family of models called latent classification models (LCMs). LCMs are defined for continuous domains and generalize the naïve Bayes model by using latent variables to model class-conditional dependencies between the attributes. In addition to providing good classification accuracy, the LCM has several appealing properties, including a relatively small parameter space making it less susceptible to over-fitting. In this paper we take a first step towards generalizing LCMs to hybrid domains by proposing an LCM for domains with binary attributes. We present algorithms for learning the proposed model, and we describe a variational approximation-based inference procedure. Finally, we empirically compare the accuracy of the proposed model to that of other classifiers for a number of different domains, including the problem of recognizing symbols in black and white images.
13.
The problem of statistical inference from a Bayesian standpoint is studied for the multitype Galton-Watson branching process within a non-parametric framework. The only data assumed to be available are each generation's population size vectors. The Gibbs sampler is used to estimate the posterior distributions of the main parameters of the model and the predictive distributions for as-yet unobserved generations. The algorithm provided is independent of whether or not the process becomes extinct. The method is illustrated with simulated examples.
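The data structure assumed in this abstract (generation-by-generation population size vectors) is easy to simulate. The sketch below assumes Poisson offspring counts purely for concreteness, with a made-up mean offspring matrix; the paper itself is non-parametric and only requires the observed vectors.

```python
import numpy as np

def simulate_mtgw(M, z0, n_gen, rng):
    """Generation-size vectors Z_0, ..., Z_{n_gen} of a multitype
    Galton-Watson process with Poisson offspring (simplifying assumption).

    M[i][j] is the mean number of type-j offspring per type-i parent, so
    the sum of Z_i independent Poisson(M[i][j]) counts is Poisson((Z @ M)[j]).
    """
    Z = [np.asarray(z0)]
    for _ in range(n_gen):
        Z.append(rng.poisson(Z[-1] @ M))
    return Z

M = np.array([[0.6, 0.5],
              [0.4, 0.7]])                 # hypothetical mean offspring matrix
Z = simulate_mtgw(M, [10, 10], 8, np.random.default_rng(0))
# Z is the list of observed generation vectors a Gibbs sampler would condition on.
```

Data of exactly this shape, whether the trajectory dies out or explodes, is what the posterior and predictive computations condition on.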
14.
Probabilistic models such as probabilistic principal component analysis (PPCA) have recently attracted much attention in the process monitoring area. An important issue with the PPCA method is how to determine the dimensionality of the latent variable space. In the present paper, one of the most popular Bayesian chemometric methods, Bayesian PCA (BPCA), is introduced for process monitoring, based on the recently developed variational inference algorithm. In this monitoring framework, the effectiveness of each extracted latent variable is reflected by a hyperparameter, upon which the dimensionality of the latent variable space can be automatically determined. Meanwhile, for practical purposes, the developed BPCA-based monitoring method is robust to missing data and also gives satisfactory performance with limited data samples. Another contribution of this paper is a new fault reconstruction method under the BPCA model structure. Two case studies are provided to evaluate the performance of the proposed method.
15.
Xin Zhao, Carl John Scarrott, Les Oxley, Marco Reale, Mathematics and Computers in Simulation, 2011, 81(7): 1430-1440
Extreme value methods are widely used in financial applications such as risk analysis, forecasting and pricing models. One of the challenges in applying them to finance is accounting for the temporal dependence between observations, for example the stylised fact that financial time series exhibit volatility clustering. Various approaches have been proposed to capture this dependence. Commonly a two-stage approach is taken, where the volatility dependence is removed using a volatility model such as a GARCH (or one of its many incarnations), followed by application of standard extreme value models to the assumed independent residual innovations. This study examines an alternative one-stage approach, which makes parameter estimation and accounting for the associated uncertainties more straightforward than the two-stage approach. The location and scale parameters of the extreme value distribution are defined to follow a conditional autoregressive heteroscedasticity process; essentially, the model implements GARCH volatility via the extreme value model parameters. Bayesian inference is used and implemented via Markov chain Monte Carlo to permit all sources of uncertainty to be accounted for. The model is applied to both simulated and empirical data to demonstrate performance in extrapolating the extreme quantiles and quantifying the associated uncertainty.
16.
In this paper, we propose to use a variational Bayesian (VB) method to learn the clean speech signal directly from the noisy observation. It models the probability distribution of the clean signal using a Gaussian mixture model (GMM) and minimizes the misfit between the true probability distributions of the hidden variables and model parameters and their approximate distributions. Experimental results demonstrate that the proposed algorithm outperforms several competing methods.
17.
18.
Markov chain Monte Carlo (MCMC) techniques revolutionized statistical practice in the 1990s by providing an essential toolkit for making the rigor and flexibility of Bayesian analysis computationally practical. At the same time, the increasing prevalence of massive datasets and the expansion of the field of data mining have created the need for statistically sound methods that scale to these large problems. Except for the most trivial examples, current MCMC methods require a complete scan of the dataset for each iteration, eliminating their candidacy as feasible data mining techniques. In this article we present a method for making Bayesian analysis of massive datasets computationally feasible. The algorithm simulates from a posterior distribution that conditions on a smaller, more manageable portion of the dataset. The remainder of the dataset may be incorporated by reweighting the initial draws using importance sampling; computation of the importance weights requires a single scan of the remaining observations. While importance sampling increases efficiency in data access, it comes at the expense of estimation efficiency. A simple modification, based on the rejuvenation step used in particle filters for dynamic systems models, sidesteps the loss of efficiency with only a slight increase in the number of data accesses. To show proof of concept, we demonstrate the method on two examples. The first is a mixture of transition models that has been used to model web traffic and robotics; for this example we show that estimation efficiency is not affected while offering a 99% reduction in data accesses. The second example applies the method to Bayesian logistic regression and yields a 98% reduction in data accesses.
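The subset-then-reweight idea in this abstract can be sketched in a conjugate toy setting (a Bernoulli model with a Beta prior; the model, chunk size, and draw counts are illustrative choices, and the rejuvenation step is omitted): draw from the posterior conditioned on a manageable chunk, then fold in the rest of the data with importance weights computed in a single pass.

```python
import math
import random

random.seed(1)
# Synthetic "massive" Bernoulli dataset with success probability 0.3.
data = [1 if random.random() < 0.3 else 0 for _ in range(10000)]
chunk, rest = data[:500], data[500:]

# Stage 1: sample from the posterior conditioned on the chunk only
# (Beta(1, 1) prior -> Beta(1 + successes, 1 + failures) posterior).
a = 1 + sum(chunk)
b = 1 + len(chunk) - sum(chunk)
draws = [random.betavariate(a, b) for _ in range(2000)]

# Stage 2: one scan of the remaining data yields the importance weights,
# w(theta) proportional to the likelihood of the unseen observations.
k, m = sum(rest), len(rest)
log_w = [k * math.log(t) + (m - k) * math.log(1 - t) for t in draws]
mx = max(log_w)
w = [math.exp(lw - mx) for lw in log_w]   # stabilized in log space

post_mean = sum(wi * t for wi, t in zip(w, draws)) / sum(w)
full_mean = (1 + sum(data)) / (2 + len(data))   # exact full-data posterior mean
```

The weighted draws approximate the full-data posterior while the remaining 9,500 observations are touched exactly once, which is the data-access saving the article quantifies.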
19.
Learning Hidden Variables in Hybrid Bayesian Networks (cited 6 times: 0 self-citations, 6 by others)
At present, hidden-variable learning with known structure mainly targets Bayesian networks with discrete variables and Gaussian networks with continuous variables. This paper presents a method for learning hidden variables in hybrid Bayesian networks containing both continuous and discrete variables. The method does not require discretizing the continuous variables: it locates hidden variables using domain knowledge or the dimension of the cliques in the network's moral graph, determines their values using a dependency structure (star-shaped or prior structure) together with Gibbs sampling, and finds the optimal cardinality of a hidden variable by combining an extended MDL criterion with statistical methods. Experimental results show that the method effectively learns hidden variables in hybrid Bayesian networks with known structure.
20.
Qian Ren, Andrew O. Finley, James S. Hodges, Computational Statistics & Data Analysis, 2011, 55(12): 3197-3217
With scientific data available at geocoded locations, investigators are increasingly turning to spatial process models for carrying out statistical inference. However, fitting spatial models often involves expensive matrix decompositions, whose computational complexity increases in cubic order with the number of spatial locations. This situation is aggravated in Bayesian settings, where such computations are required at every iteration of the Markov chain Monte Carlo (MCMC) algorithm. In this paper, we describe the use of variational Bayesian (VB) methods as an alternative to MCMC for approximating the posterior distributions of complex spatial models. Variational methods, which have been used extensively in Bayesian machine learning for several years, provide a lower bound on the marginal likelihood that can be computed efficiently. We provide results for the variational updates in several models, with particular emphasis on their use in multivariate spatial analysis. We demonstrate estimation and model comparison from VB methods using simulated data as well as environmental data sets, and compare them with inference from MCMC.