Similar Articles
20 similar articles found
1.
Accurate information on patterns of introduction and spread of non-native species is essential for making predictions and management decisions. In many cases, estimating unknown rates of introduction and spread from observed data requires evaluating intractable variable-dimensional integrals. In general, inference on the large class of models containing latent variables of large or variable dimension precludes the use of exact sampling techniques. Approximate Bayesian computation (ABC) methods provide an alternative to exact sampling but rely on inefficient conditional simulation of the latent variables. To accomplish this task efficiently, a new transdimensional Monte Carlo sampler is developed for approximate Bayesian model inference and used to estimate rates of introduction and spread for the non-native earthworm species Dendrobaena octaedra (Savigny) along roads in the boreal forest of northern Alberta. Using low and high estimates of introduction and spread rates, the extent of earthworm invasions in northeastern Alberta is simulated to project the proportion of suitable habitat invaded in the year following data collection.
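The core ABC idea behind this kind of analysis can be sketched with a simple rejection sampler (this is not the paper's transdimensional sampler, and all numbers — the prior range, the observed summary, the tolerance — are hypothetical): draw a candidate rate from the prior, forward-simulate data, and keep the draw only if a summary statistic of the simulation is close to the observed one.

```python
import math
import random
import statistics

random.seed(1)

def poisson(lam):
    # Knuth's algorithm for a Poisson draw, using only the stdlib.
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while p > threshold:
        p *= random.random()
        k += 1
    return k - 1

# Hypothetical observed summary: mean number of introductions per road segment.
observed_mean = 3.2
n_segments = 50

accepted = []
for _ in range(20000):
    rate = random.uniform(0.0, 10.0)                  # candidate rate from a flat prior
    sim = [poisson(rate) for _ in range(n_segments)]  # forward-simulate the data
    if abs(statistics.mean(sim) - observed_mean) < 0.2:
        accepted.append(rate)                         # summary close enough: keep the draw

print(len(accepted), round(statistics.mean(accepted), 2))
```

The accepted draws approximate the posterior of the rate; shrinking the tolerance trades acceptance rate for accuracy, which is exactly the inefficiency that motivates more sophisticated ABC samplers.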

2.
In many applications, it is of interest to simultaneously cluster row and column variables in a data set, identifying local subgroups within a data matrix that share some common characteristic. When a small set of variables is believed to be associated with a set of responses, block clustering or biclustering is a more appropriate technique than one-dimensional clustering. A flexible framework for Bayesian model-based block clustering, which can determine multiple block clusters in a data matrix through a novel and efficient evolutionary Monte Carlo-based methodology, is proposed. The performance of this methodology is illustrated through a number of simulation studies and an application to data from genome-wide association studies.
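The evolutionary Monte Carlo machinery is beyond a short snippet, but the kind of objective such a search scores is easy to sketch: a Bernoulli block log-likelihood that rewards row/column partitions whose blocks are homogeneous. The matrix and labels below are a toy construction, not data from the paper.

```python
import math
from collections import defaultdict

def block_loglik(X, row_labels, col_labels):
    """Bernoulli log-likelihood of a 0/1 matrix under a block clustering,
    plugging in the MLE of each block's success probability."""
    cells = defaultdict(list)
    for i, row in enumerate(X):
        for j, v in enumerate(row):
            cells[(row_labels[i], col_labels[j])].append(v)
    ll = 0.0
    for vals in cells.values():
        p = sum(vals) / len(vals)
        if 0.0 < p < 1.0:  # a perfectly pure block contributes 0
            ll += sum(v * math.log(p) + (1 - v) * math.log(1 - p) for v in vals)
    return ll

# Toy 6x6 matrix with a planted 2x2 block structure and one noisy entry.
X = [[1 if (i < 3) == (j < 3) else 0 for j in range(6)] for i in range(6)]
X[0][0] = 0  # noise

planted = [0, 0, 0, 1, 1, 1]   # the true row/column grouping
shuffled = [0, 1, 0, 1, 0, 1]  # a mismatched grouping for comparison

print(block_loglik(X, planted, planted) > block_loglik(X, shuffled, shuffled))
```

A stochastic search (evolutionary Monte Carlo, in the paper's case) explores the space of label vectors to maximize a score of this kind while also inferring the number of blocks.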

3.
Markov chain Monte Carlo (MCMC) algorithms have greatly facilitated the popularity of Bayesian variable selection and model averaging in problems with high-dimensional covariates where enumeration of the model space is infeasible. A variety of such algorithms have been proposed in the literature for sampling models from the posterior distribution in Bayesian variable selection. Ghosh and Clyde proposed a method to exploit the properties of orthogonal design matrices. Their data augmentation algorithm scales up the computation tremendously compared to traditional Gibbs samplers, and leads to the availability of Rao-Blackwellized estimates of quantities of interest for the original non-orthogonal problem. The algorithm has excellent performance when the correlations among the columns of the design matrix are small, but empirical results suggest that moderate to strong multicollinearity leads to slow mixing. This motivates the need to develop a class of novel sandwich algorithms for Bayesian variable selection that improves upon the algorithm of Ghosh and Clyde. It is proved that the Haar algorithm with the largest group that acts on the space of models is the optimal algorithm within the parameter expansion data augmentation (PXDA) class of sandwich algorithms. The result provides theoretical insight, but using the largest group is computationally prohibitive, so two new computationally viable sandwich algorithms are developed, which are inspired by the Haar algorithm but do not necessarily belong to the class of PXDA algorithms. It is illustrated via simulation studies and real data analysis that several of the sandwich algorithms can offer substantial gains in the presence of multicollinearity.

4.
Learning structure from data is one of the most important fundamental tasks of Bayesian network research. In particular, learning the optimal structure of a Bayesian network is a non-deterministic polynomial...

5.
Bayesian neural networks are useful tools for estimating the functional structure of nonlinear systems. However, they suffer from several complications, such as controlling model complexity, long training times, efficient parameter estimation, random-walk behavior, and getting stuck in local optima in high-dimensional parameter settings. In this paper, a novel hybrid Bayesian learning procedure is proposed to alleviate these problems. The approach is based on full Bayesian learning and integrates Markov chain Monte Carlo procedures with genetic algorithms and fuzzy membership functions. In the application sections, to examine the performance of the proposed approach, nonlinear time series and regression analysis are handled separately, and the approach is compared with traditional training techniques in terms of estimation and prediction ability.

6.
The outer layers of the Earth’s atmosphere are known as the ionosphere, a plasma of free electrons and positively charged atomic ions. The electron density of the ionosphere varies considerably with time of day, season, geographical location and the sun’s activity. Maps of electron density are required because local changes in this density can produce inaccuracies in the Navy Navigation Satellite System (NNSS) and Global Positioning System (GPS). Satellite-to-ground receiver measurements produce tomographic information about the density in the form of path-integrated snapshots of the total electron content, which must be inverted to generate electron density maps. A Bayesian approach is proposed for solving the inversion problem using spatial priors in a parsimonious model for the variation of electron density with height. The Bayesian approach to modelling and inference provides estimates of electron density along with a measure of uncertainty for these estimates, leading to credible intervals for all quantities of interest. The standard parameterisation does not lend itself well to standard Metropolis-Hastings algorithms. A much more efficient form of Markov chain Monte Carlo sampler is developed using a transformation of variables based on a principal components analysis of initial output.
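The idea of reparameterizing a Metropolis-Hastings sampler using a principal components analysis of initial output can be sketched generically (this toy uses a correlated 2-D Gaussian target, not the ionospheric model; all step sizes and run lengths are illustrative): run a pilot chain with an isotropic proposal, eigendecompose its empirical covariance, then propose along the principal axes.

```python
import math
import random

random.seed(2)
RHO = 0.9  # strong correlation: the situation that defeats an isotropic proposal

def log_target(x, y):
    # Correlated bivariate normal log-density (up to an additive constant).
    return -(x * x - 2 * RHO * x * y + y * y) / (2 * (1 - RHO * RHO))

def rw_metropolis(n, step):
    x = y = 0.0
    lp = log_target(x, y)
    out, acc = [], 0
    for _ in range(n):
        nx, ny = step(x, y)
        nlp = log_target(nx, ny)
        if math.log(random.random()) < nlp - lp:
            x, y, lp = nx, ny, nlp
            acc += 1
        out.append((x, y))
    return out, acc / n

# 1. Pilot run with an isotropic random-walk proposal.
pilot, _ = rw_metropolis(5000, lambda x, y: (x + random.gauss(0, 0.5),
                                             y + random.gauss(0, 0.5)))

# 2. Empirical covariance [[a, b], [b, c]] of the pilot output.
n = len(pilot)
mx = sum(p[0] for p in pilot) / n
my = sum(p[1] for p in pilot) / n
a = sum((p[0] - mx) ** 2 for p in pilot) / n
c = sum((p[1] - my) ** 2 for p in pilot) / n
b = sum((p[0] - mx) * (p[1] - my) for p in pilot) / n

# 3. Eigendecomposition of the symmetric 2x2 covariance by hand.
disc = math.sqrt((a - c) ** 2 + 4 * b * b)
l1, l2 = (a + c + disc) / 2, (a + c - disc) / 2
norm = math.hypot(b, l1 - a)
v1 = (b / norm, (l1 - a) / norm)
v2 = (-v1[1], v1[0])

# 4. Propose along the principal axes, scaled by sqrt(eigenvalue).
SCALE = 1.7  # roughly 2.38 / sqrt(dim), a standard random-walk tuning heuristic
def pca_step(x, y):
    z1 = random.gauss(0, SCALE * math.sqrt(l1))
    z2 = random.gauss(0, SCALE * math.sqrt(l2))
    return (x + z1 * v1[0] + z2 * v2[0], y + z1 * v1[1] + z2 * v2[1])

samples, acc_rate = rw_metropolis(20000, pca_step)
print(round(acc_rate, 2))
```

The transformed proposal takes large steps along the ridge of the target and small steps across it, which is the efficiency gain the abstract describes.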

7.
Markov chain Monte Carlo (MCMC) techniques revolutionized statistical practice in the 1990s by providing an essential toolkit for making the rigor and flexibility of Bayesian analysis computationally practical. At the same time the increasing prevalence of massive datasets and the expansion of the field of data mining has created the need for statistically sound methods that scale to these large problems. Except for the most trivial examples, current MCMC methods require a complete scan of the dataset for each iteration, eliminating their candidacy as feasible data mining techniques. In this article we present a method for making Bayesian analysis of massive datasets computationally feasible. The algorithm simulates from a posterior distribution that conditions on a smaller, more manageable portion of the dataset. The remainder of the dataset may be incorporated by reweighting the initial draws using importance sampling. Computation of the importance weights requires a single scan of the remaining observations. While importance sampling increases efficiency in data access, it comes at the expense of estimation efficiency. A simple modification, based on the rejuvenation step used in particle filters for dynamic systems models, sidesteps the loss of efficiency with only a slight increase in the number of data accesses. To show proof-of-concept, we demonstrate the method on two examples. The first is a mixture of transition models that has been used to model web traffic and robotics. For this example we show that estimation efficiency is not affected while offering a 99% reduction in data accesses. The second example applies the method to Bayesian logistic regression and yields a 98% reduction in data accesses.
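The subset-then-reweight idea can be sketched on a conjugate toy (a Bernoulli success probability, so the subset posterior can be sampled directly; the dataset, subset size, and prior are all hypothetical): sample from the posterior given a small subset, then one scan of the remaining observations supplies the log importance weights.

```python
import math
import random

random.seed(3)

# Hypothetical "massive" dataset: 10,000 Bernoulli(0.3) observations.
data = [1 if random.random() < 0.3 else 0 for _ in range(10000)]
subset, rest = data[:500], data[500:]

# 1. Sample the posterior of the success probability p from the subset only
#    (Beta(1, 1) prior, so the subset posterior is Beta(1 + successes, 1 + failures)).
s = sum(subset)
draws = [random.betavariate(1 + s, 1 + len(subset) - s) for _ in range(5000)]

# 2. A single scan of the remaining data yields its sufficient statistics ...
k, m = sum(rest), len(rest)

# 3. ... and the log importance weight of each draw is the log-likelihood
#    of the remaining data under that draw.
logw = [k * math.log(p) + (m - k) * math.log(1 - p) for p in draws]
mx = max(logw)
w = [math.exp(lw - mx) for lw in logw]  # stabilize before exponentiating

estimate = sum(wi * p for wi, p in zip(w, draws)) / sum(w)
exact = (1 + sum(data)) / (2 + len(data))  # full-data posterior mean, by conjugacy
print(round(estimate, 3), round(exact, 3))
```

The weighted average recovers the full-data posterior mean while the draws themselves touched only the subset; the skewness of the weights is the estimation-efficiency cost that the rejuvenation step in the paper addresses.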

8.
Neural networks provide a tool for describing non-linearity in volatility processes of financial data and help to answer the question “how much” non-linearity is present in the data. Non-linearity is studied under three different specifications of the conditional distribution: Gaussian, Student-t and mixture of Gaussians. To rank the volatility models, a Bayesian framework is adopted to perform a Bayesian model selection within the different classes of models. In the empirical analysis, the return series of the Dow Jones Industrial Average index, FTSE 100 and NIKKEI 225 indices over a period of 16 years are studied. The results show different behavior across the three markets. In general, if a statistical model accounts for non-normality and explains most of the fat tails in the conditional distribution, then there is less need for complex non-linear specifications.

9.
While latent variable models have been successfully applied in many fields and underpin various modeling techniques, their ability to incorporate categorical responses is hindered due to the lack of accurate and efficient estimation methods. Approximation procedures, such as penalized quasi-likelihood, are computationally efficient, but the resulting estimators can be seriously biased for binary responses. Gauss-Hermite quadrature and Markov Chain Monte Carlo (MCMC) integration based methods can yield more accurate estimation, but they are computationally much more intensive. Estimation methods that can achieve both computational efficiency and estimation accuracy are still under development. This paper proposes an efficient direct sampling based Monte Carlo EM algorithm (DSMCEM) for latent variable models with binary responses. Mixed effects and item factor analysis models with binary responses are used to illustrate this algorithm. Results from two simulation studies and a real data example suggest that, as compared with MCMC based EM, DSMCEM can significantly improve computational efficiency as well as produce equally accurate parameter estimates. Other aspects and extensions of the algorithm are discussed.

10.
This work presents the current state-of-the-art in techniques for tracking a number of objects moving in a coordinated and interacting fashion. Groups are structured objects characterized by particular motion patterns. The group can be comprised of a small number of interacting objects (e.g. pedestrians, sport players, convoy of cars) or of hundreds or thousands of components such as crowds of people. Group object tracking is closely linked with extended object tracking, but at the same time has particular features which differentiate it from extended objects. Extended objects, such as in maritime surveillance, are characterized by their kinematic states and their size or volume. Both group and extended objects give rise to a varying number of measurements and require trajectory maintenance. An emphasis is given here to sequential Monte Carlo (SMC) methods and their variants. Methods for small groups and for large groups are presented, including Markov Chain Monte Carlo (MCMC) methods, the random matrices approach and Random Finite Set Statistics methods. Efficient real-time implementations are discussed which are able to deal with the high dimensionality and provide high accuracy. Future trends and avenues are traced.

11.
In this paper we present the results of a simulation study to explore the ability of Bayesian parametric and nonparametric models to provide an adequate fit to count data of the type that would routinely be analyzed parametrically either through fixed-effects or random-effects Poisson models. The context of the study is a randomized controlled trial with two groups (treatment and control). Our nonparametric approach uses several modeling formulations based on Dirichlet process priors. We find that the nonparametric models are able to flexibly adapt to the data, to offer rich posterior inference, and to provide, in a variety of settings, more accurate predictive inference than parametric models.
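The Dirichlet process prior underlying such nonparametric formulations has a simple stick-breaking representation that can be simulated directly (a generic sketch, not the paper's models; the concentration values are arbitrary): weights are obtained by repeatedly breaking a Beta(1, alpha) fraction off the remaining stick, and a small alpha concentrates mass on few clusters.

```python
import random

random.seed(4)

def stick_breaking(alpha, n_sticks=1000):
    """Truncated stick-breaking representation of Dirichlet process weights."""
    weights, remaining = [], 1.0
    for _ in range(n_sticks):
        piece = random.betavariate(1, alpha)  # fraction broken off the remaining stick
        weights.append(remaining * piece)
        remaining *= 1 - piece
    return weights

w_small = stick_breaking(0.5)  # small concentration: a few dominant clusters
w_large = stick_breaking(5.0)  # large concentration: many moderate clusters

print(round(sum(w_small), 4),
      sum(1 for x in w_small if x > 0.01),
      sum(1 for x in w_large if x > 0.01))
```

This is why Dirichlet process mixtures "flexibly adapt to the data": the effective number of clusters is inferred rather than fixed in advance.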

12.
In the context of nonparametric Bayesian estimation a Markov chain Monte Carlo algorithm is devised and implemented to sample from the posterior distribution of the drift function of a continuously or discretely observed one-dimensional diffusion. The drift is modeled by a scaled linear combination of basis functions with a Gaussian prior on the coefficients. The scaling parameter is equipped with a partially conjugate prior. The number of basis functions in the drift is equipped with a prior distribution as well. For continuous data, a reversible jump Markov chain algorithm enables the exploration of the posterior over models of varying dimension. Subsequently, it is explained how data-augmentation can be used to extend the algorithm to deal with diffusions observed discretely in time. Some examples illustrate that the method can give satisfactory results. In these examples a comparison is made with another existing method as well.

13.
With scientific data available at geocoded locations, investigators are increasingly turning to spatial process models for carrying out statistical inference. However, fitting spatial models often involves expensive matrix decompositions, whose computational complexity increases in cubic order with the number of spatial locations. This situation is aggravated in Bayesian settings where such computations are required once at every iteration of the Markov chain Monte Carlo (MCMC) algorithms. In this paper, we describe the use of Variational Bayesian (VB) methods as an alternative to MCMC to approximate the posterior distributions of complex spatial models. Variational methods, which have been used extensively in Bayesian machine learning for several years, provide a lower bound on the marginal likelihood, which can be computed efficiently. We provide results for the variational updates in several models especially emphasizing their use in multivariate spatial analysis. We demonstrate estimation and model comparisons from VB methods by using simulated data as well as environmental data sets and compare them with inference from MCMC.
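VB's defining property — the evidence lower bound (ELBO) sits below the log marginal likelihood, with equality when the variational family contains the exact posterior — can be verified in closed form on a one-observation conjugate normal model (a generic illustration, not the paper's spatial models).

```python
import math

# Model: x ~ N(mu, 1), prior mu ~ N(0, 1); variational family q = N(m, s2).
x = 1.3  # a single hypothetical observation

def elbo(m, s2):
    # E_q[log p(x|mu)] + E_q[log p(mu)] + entropy of q, all in closed form.
    e_loglik = -0.5 * math.log(2 * math.pi) - 0.5 * ((x - m) ** 2 + s2)
    e_logprior = -0.5 * math.log(2 * math.pi) - 0.5 * (m ** 2 + s2)
    entropy = 0.5 * math.log(2 * math.pi * math.e * s2)
    return e_loglik + e_logprior + entropy

# Exact log marginal likelihood: integrating out mu gives x ~ N(0, 2).
log_evidence = -0.5 * math.log(2 * math.pi * 2) - x ** 2 / 4

# The exact posterior is N(x/2, 1/2); there the bound is tight.
print(round(elbo(x / 2, 0.5) - log_evidence, 9))   # essentially zero
print(elbo(0.0, 1.0) < log_evidence)               # any other q: strict lower bound
```

Maximizing the ELBO over (m, s2) therefore simultaneously approximates the posterior and the marginal likelihood, which is what makes VB usable for the model comparisons the abstract mentions.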

14.
Causal knowledge based on causal analysis can advance the quality of decision-making and thereby facilitate a process of transforming strategic objectives into effective actions. Several credible studies have emphasized the usefulness of causal analysis techniques. Partial least squares (PLS) path modeling is one of several popular causal analysis techniques. However, one difficulty often faced when we commence research is that the causal direction is unknown due to the lack of background knowledge. To solve this difficulty, this paper proposes a method that links the Bayesian network and PLS path modeling for causal analysis. An empirical study is presented to illustrate the application of the proposed method. Based on the findings of this study, conclusions and implications for management are discussed.

15.
Monte Carlo (MC) methods are widely used for Bayesian inference and optimization in statistics, signal processing and machine learning. A well-known class of MC methods are Markov Chain Monte Carlo (MCMC) algorithms. In order to foster better exploration of the state space, especially in high-dimensional applications, several schemes employing multiple parallel MCMC chains have been recently introduced. In this work, we describe a novel parallel interacting MCMC scheme, called orthogonal MCMC (O-MCMC), where a set of “vertical” parallel MCMC chains share information using some “horizontal” MCMC techniques working on the entire population of current states. More specifically, the vertical chains are led by random-walk proposals, whereas the horizontal MCMC techniques employ independent proposals, thus allowing an efficient combination of global exploration and local approximation. The interaction is contained in these horizontal iterations. Within the analysis of different implementations of O-MCMC, novel schemes to reduce the overall computational cost of parallel Multiple Try Metropolis (MTM) chains are also presented. Furthermore, a modified version of O-MCMC for optimization is provided by considering parallel Simulated Annealing (SA) algorithms. Numerical results show the advantages of the proposed sampling scheme in terms of efficiency in the estimation, as well as robustness with respect to the initial values and the choice of the parameters.
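The full O-MCMC machinery is involved, but a toy in its spirit is easy to sketch (this is emphatically not the authors' scheme — just several local random-walk chains combined with an occasional wide independent proposal, with all targets and settings hypothetical): the independent moves let chains escape local modes that small random-walk steps rarely cross.

```python
import math
import random

random.seed(5)

def log_target(x):
    # Bimodal mixture of normals at -3 and +3, computed stably via log-sum-exp.
    a = -(x + 3) ** 2 / 2
    b = -(x - 3) ** 2 / 2
    m = max(a, b)
    return m + math.log(0.5 * math.exp(a - m) + 0.5 * math.exp(b - m))

N_CHAINS, SIGMA_IND = 5, 6.0
chains = [random.uniform(-1, 1) for _ in range(N_CHAINS)]
samples = []

def log_q_ind(z):
    return -z * z / (2 * SIGMA_IND ** 2)  # log N(0, 36) up to a constant

for it in range(4000):
    for i in range(N_CHAINS):
        if it % 10 == 0:
            # Occasional global move: a wide independent proposal.
            prop = random.gauss(0.0, SIGMA_IND)
            log_a = (log_target(prop) + log_q_ind(chains[i])
                     - log_target(chains[i]) - log_q_ind(prop))
        else:
            # Local move: symmetric random walk, so the proposal ratio cancels.
            prop = chains[i] + random.gauss(0.0, 0.8)
            log_a = log_target(prop) - log_target(chains[i])
        if math.log(random.random()) < log_a:
            chains[i] = prop
        samples.append(chains[i])

frac_left = sum(1 for s in samples if s < 0) / len(samples)
print(round(frac_left, 2))  # both modes visited: close to 0.5 for a symmetric target
```

A single random-walk chain with step 0.8 would typically stay trapped in one mode; mixing local and global proposal types is the design choice the abstract's vertical/horizontal split formalizes.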

16.
Association Models for Web Mining
We describe how statistical association models and, specifically, graphical models, can be usefully employed to model web mining data. We describe some methodological problems related to the implementation of discrete graphical models for web mining data. In particular, we discuss model selection procedures.

17.
Learning spatial models from sensor data raises the challenging data association problem of relating model parameters to individual measurements. This paper proposes an EM-based algorithm, which solves the model learning and the data association problem in parallel. The algorithm is developed in the context of the structure-from-motion problem, which is the problem of estimating a 3D scene model from a collection of image data. To accommodate the spatial constraints in this domain, we compute virtual measurements as sufficient statistics to be used in the M-step. We develop an efficient Markov chain Monte Carlo sampling method called chain flipping, to calculate these statistics in the E-step. Experimental results show that we can solve hard data association problems when learning models of 3D scenes, and that we can do so efficiently. We conjecture that this approach can be applied to a broad range of model learning problems from sensor data, such as the robot mapping problem.

18.
Excessive pollutant discharge from multi-pollution resources can lead to a rise in downriver contaminant concentration in river segments. A multi-pollution source water quality model (MPSWQM) was integrated with Bayesian statistics to develop a robust method for supporting load (I) reduction and effective water quality management in the Harbin City Reach of the Songhua River system in northeastern China. The monthly water quality data observed during the period 2005–2010 was analyzed and compared, using ammonia as the study variable. The decay rate (k) was considered a key factor in the MPSWQM, and the distribution curve of k was estimated for the whole year. The distribution curves indicated small differences between the marginal distribution of k of each period, and that water quality management strategies can be designed seasonally. From the curves, decision makers could pick key posterior values of k in each month to attain the water quality goal at any specified time. Such flexibility is an effective way to improve the robustness of water quality management. To understand the potential collinearity of k and I, a sensitivity test of k for I2i (loadings in segment 2 of the study river) was done under certain water quality goals. It indicated that the posterior distributions of I2i show seasonal variation and are sensitive to the marginal posteriors of k. Thus, the seasonal posteriors of k were selected according to the marginal distributions and used to estimate I2i in the next round of water quality management. All kinds of pollutant sources, including polluted branches, point and non-point sources, can be identified for multiple scenarios. The analysis enables decision makers to assess the influence of each loading and how best to manage water quality targets in each period. Decision makers can also visualize potential load reductions under different water quality goals. The results show that the proposed method is robust for management of multi-pollutant loadings under different water quality goals, helping ensure that the water quality of river segments meets targeted goals.
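The role the decay rate k plays in such an analysis can be sketched with a first-order decay model (a generic illustration, not the MPSWQM itself; the travel time, goal, and stand-in "posterior" samples are all hypothetical): since downstream concentration behaves like C_down = C_up · exp(-k·t), the maximum allowable upstream concentration under a goal scales like exp(k·t), so its posterior inherits the seasonal posterior of k.

```python
import math
import random
import statistics

random.seed(6)

TRAVEL_TIME = 2.0  # days of travel through the river segment (hypothetical)
GOAL = 1.0         # downstream ammonia goal in mg/L (hypothetical)

def k_posterior(mean, sd, n=5000):
    # Stand-in for seasonal posterior samples of the decay rate k (1/day).
    return [max(1e-6, random.gauss(mean, sd)) for _ in range(n)]

def allowable_upstream(k_samples):
    # First-order decay: C_down = C_up * exp(-k * t), so the largest upstream
    # concentration still meeting the goal is C_up_max = GOAL * exp(k * t).
    return [GOAL * math.exp(k * TRAVEL_TIME) for k in k_samples]

winter = allowable_upstream(k_posterior(0.05, 0.01))  # slow-decay season
summer = allowable_upstream(k_posterior(0.25, 0.05))  # fast-decay season

for name, c in (("winter", winter), ("summer", summer)):
    q = statistics.quantiles(c, n=20)  # 5%, 10%, ..., 95% cut points
    print(name, round(q[0], 3), round(statistics.median(c), 3))
```

A conservative manager would act on a low quantile of the allowable upstream concentration rather than its median, which is the kind of seasonal, uncertainty-aware decision support the abstract describes.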

19.
Based on a semiparametric Bayesian framework, a joint-quantile regression method is developed for analyzing clustered data, where random effects are included to accommodate the intra-cluster dependence. Instead of posing any parametric distributional assumptions on the random errors, the proposed method approximates the central density by linearly interpolating the conditional quantile functions of the response at multiple quantiles and estimates the tail densities by adopting extreme value theory. Through joint-quantile modeling, the proposed algorithm can yield the joint posterior distribution of quantile coefficients at multiple quantiles and meanwhile avoid the quantile crossing issue. The finite sample performance of the proposed method is assessed through a simulation study and an analysis of apnea duration data.
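The density-from-quantiles idea — linearly interpolating the quantile function gives a piecewise-constant density estimate f ≈ Δτ/ΔQ on each inter-quantile interval — can be checked against a known distribution (a generic sketch using the standard normal as a stand-in for an estimated conditional quantile function, not the paper's clustered-data model).

```python
from statistics import NormalDist

nd = NormalDist()  # stand-in for an estimated conditional quantile function
taus = [0.05, 0.1, 0.25, 0.5, 0.75, 0.9, 0.95]
Q = [nd.inv_cdf(t) for t in taus]  # quantiles at the chosen levels

def density(y):
    # Piecewise-constant density from interpolated quantiles: f = dtau / dQ.
    for j in range(len(Q) - 1):
        if Q[j] <= y < Q[j + 1]:
            return (taus[j + 1] - taus[j]) / (Q[j + 1] - Q[j])
    return None  # outside the interpolated range; the paper uses tail models here

print(round(density(0.0), 3), round(nd.pdf(0.0), 3))  # interpolated vs exact density
```

Adding more quantile levels tightens the approximation in the center, while the extreme-value tail models the abstract mentions take over where the interpolation returns nothing.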

20.
卿湘运 (Qing Xiangyun), 王行愚 (Wang Xingyu). 《计算机学报》 (Chinese Journal of Computers), 2007, 30(8): 1333-1343
The goal of subspace clustering is to group a given set of data over different subsets of features. This unsupervised learning method attempts to discover patterns in the data that are "similar under different representations", and it has attracted considerable attention and research in related fields. First, the "mean and variance shift" model proposed by Hoff is extended into a new feature-subset-based nonparametric clustering model, whose advantage is that variational Bayesian methods can be applied to learn the model parameters. Combining a Dirichlet process mixture model with a nonparametric model for selecting feature subsets, the model can automatically determine the number of clusters and perform subspace clustering. A Markov chain Monte Carlo algorithm for posterior inference of the parameters is then given. For reasons of computational speed, a variational Bayesian method is proposed for learning the model parameters. Experimental results on simulated data and an application to face clustering both show that the model can simultaneously select relevant features and the data points that share similar patterns on those features. Experiments applying the sampling-free variational Bayesian method to the UCI "Multiple Features" database show that it infers the model parameters quickly.
