1.
Association Models for Web Mining (total citations: 3; self-citations: 0; citations by others: 3)
We describe how statistical association models, and graphical models in particular, can be used to model web mining data. We then describe some methodological problems in implementing discrete graphical models for such data, focusing on model selection procedures.
2.
Ingelin Steinsland, Computational Statistics & Data Analysis, 2007, 51(6): 2969-2981
Markov chain Monte Carlo algorithms are computationally expensive for large models. In particular, the so-called one-block Metropolis-Hastings (M-H) algorithm demands large computational resources, which makes parallel computing appealing. A parallel one-block M-H algorithm for latent Gaussian Markov random field (GMRF) models is introduced. Important parts of this algorithm are parallel exact sampling and evaluation of GMRFs. Parallelisation is achieved with parallel algorithms from linear algebra for sparse symmetric positive definite matrices. The parallel GMRF sampler is tested for GMRFs on lattices and irregular graphs, and gives both good speed-up and good scalability. The parallel one-block M-H algorithm is used for inference in a geostatistical GMRF model with a latent spatial field of 31,500 variables.
3.
Hierarchical Bayesian modelling is considered for the number of age-dependent deaths in different geographic regions. The model uses a conditional binomial distribution for the number of age-dependent deaths, a new family of zero-mean Gaussian Markov random field models for incorporating spatial correlations between neighbouring regions, and an intrinsic Gaussian model for including correlations between age-dependent mortality rates. Age-dependent mortality rates are estimated for each region, and approximate credibility intervals based on summaries of samples from the posterior distribution are obtained from Markov chain Monte Carlo simulation. The resulting maps of mortality rates are less variable and smoother than those that would be obtained from naive estimates, and various inferences may be drawn from the results. The prior spatial model includes some of the common conditional autoregressive spatial models used in epidemiology, and so model uncertainty in this family can be accounted for. The methodology is illustrated with an actuarial data set of age-dependent deaths in 150 geographic regions of Hungary. Sensitivity to the prior distributions is discussed, as well as relative risks for certain covariates (males in towns, females in towns, males in villages, females in villages).
4.
Based on a semiparametric Bayesian framework, a joint-quantile regression method is developed for analyzing clustered data, where random effects are included to accommodate the intra-cluster dependence. Instead of imposing any parametric distributional assumptions on the random errors, the proposed method approximates the central density by linearly interpolating the conditional quantile functions of the response at multiple quantiles and estimates the tail densities by adopting extreme value theory. Through joint-quantile modeling, the proposed algorithm yields the joint posterior distribution of quantile coefficients at multiple quantiles while avoiding the quantile crossing issue. The finite-sample performance of the proposed method is assessed through a simulation study and the analysis of an apnea duration data set.
5.
One of the main problems in operational risk management is the lack of loss data, which affects the parameter estimates of the marginal distributions of the losses. The principal reason is that financial institutions only started to collect operational loss data a few years ago, due to the relatively recent definition of this type of risk. Given this drawback, Bayesian methods and simulation tools could be a natural solution to the problem. Bayesian methods allow us to integrate the scarce and sometimes inaccurate quantitative data collected by the bank with prior information provided by experts. The original proposal is a Bayesian approach for modelling operational risk and for calculating the capital required to cover the estimated risks. Besides this methodological innovation, a computational scheme based on Markov chain Monte Carlo (MCMC) simulations is required. In particular, applying the MCMC method to estimate the parameters of the marginals shows advantages in terms of a reduced capital charge across different choices of the marginal loss distributions.
6.
7.
Various models for time series of counts that can account for discreteness, overdispersion and serial correlation are compared. Besides observation- and parameter-driven models based upon corresponding conditional Poisson distributions, a dynamic ordered probit model is also considered as a flexible specification to capture the salient features of time series of counts. For all models, appropriate efficient estimation procedures are presented. For the parameter-driven specification this requires Monte Carlo procedures such as simulated maximum likelihood or Markov chain Monte Carlo. The methods, including corresponding diagnostic tests, are illustrated using data on daily admissions for asthma to a single hospital. Estimation results turn out to be remarkably similar across the different models.
8.
Parameter distribution estimation has long been an active research topic in the uncertainty quantification of environmental models. Traditional approaches such as MCMC (Markov chain Monte Carlo) are prohibitively expensive to apply to large, complex dynamic models because of their high computational cost. To reduce the number of model evaluations required, we propose an adaptive surrogate modeling-based sampling strategy for parameter distribution estimation, named ASMO-PODE (Adaptive Surrogate Modeling-based Optimization – Parameter Optimization and Distribution Estimation). ASMO-PODE can provide an estimate of the parameter distribution using as little as one percent of the model evaluations required by a regular MCMC approach. The effectiveness and efficiency of the ASMO-PODE approach have been evaluated with two test problems and one land surface model, the Common Land Model. The results demonstrate that the ASMO-PODE method is an economical way to perform parameter optimization and distribution estimation.
9.
In this paper we present the results of a simulation study to explore the ability of Bayesian parametric and nonparametric models to provide an adequate fit to count data of the type that would routinely be analyzed parametrically through either fixed-effects or random-effects Poisson models. The context of the study is a randomized controlled trial with two groups (treatment and control). Our nonparametric approach uses several modeling formulations based on Dirichlet process priors. We find that the nonparametric models are able to flexibly adapt to the data, to offer rich posterior inference, and to provide, in a variety of settings, more accurate predictive inference than parametric models.
10.
A simple test for threshold nonlinearity in either the mean or volatility equation, or both, of a heteroskedastic time series model is proposed. The procedure extends current Bayesian Markov chain Monte Carlo methods and threshold modelling by employing a general double threshold GARCH model that allows for an explosive, non-stationary regime. Posterior credible intervals on model parameters are used to detect and specify threshold nonlinearity in the mean and/or volatility equations. Simulation experiments demonstrate that the method performs well in identifying model specifications ranging in complexity from the conventional GARCH up to the full double-threshold nonlinear GARCH model with an explosive regime, and is robust to over-specification in model orders.
11.
The assumption of proportional hazards (PH) fundamental to the Cox PH model may not hold in practice. In this paper, we propose a generalization of the Cox PH model in terms of the cumulative hazard function taking a form similar to the Cox PH model, with the extension that the baseline cumulative hazard function is raised to a power function. Our model allows for interaction between covariates and the baseline hazard, and it also includes, for the two-sample problem, the case of two Weibull distributions and two extreme value distributions differing in both scale and shape parameters. The partial likelihood approach cannot be applied here to estimate the model parameters. Instead, we use the full likelihood approach via a cubic B-spline approximation for the baseline hazard. A semi-automatic procedure for knot selection based on Akaike’s information criterion is developed. We illustrate the applicability of our approach using real-life data.