Similar Articles
20 similar articles retrieved
1.
Medical laboratory data are often censored, due to limitations of the measuring technology. For pharmacokinetic measurements and dilution-based assays, for example, there is a lower quantification limit, which depends on the type of assay used. The concentration of HIV particles in plasma is subject to both lower and upper quantification limits. Linear and nonlinear mixed effects models, which are often used in these types of medical applications, need to be able to deal with such data issues. In this paper we discuss a hybrid Monte Carlo and numerical integration EM algorithm for computing the maximum likelihood estimates for linear and nonlinear mixed models with censored data. Our implementation uses an efficient block-sampling scheme, automated monitoring of convergence, and dimension reduction based on the QR decomposition. For clusters with up to two censored observations, numerical integration is used instead of Monte Carlo simulation. These improvements lead to a several-fold reduction in computation time. We illustrate the algorithm using data from an HIV/AIDS trial. The Monte Carlo EM is evaluated and compared with existing methods via a simulation study.
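The numerical-integration branch for a single left-censored normal observation rests on truncated-normal moments. A minimal sketch, assuming a left-censored measurement below a lower limit of quantification (the function name is illustrative, not from the paper's implementation):

```python
import numpy as np
from scipy.stats import norm, truncnorm

def censored_mean(mu, sigma, loq):
    """E[y | y < loq] for y ~ N(mu, sigma^2), via truncated-normal moments:
    mu - sigma * phi(a) / Phi(a) with a = (loq - mu) / sigma."""
    a = (loq - mu) / sigma
    return mu - sigma * norm.pdf(a) / norm.cdf(a)

# Monte Carlo cross-check: average of normal draws truncated above at the LOQ
mc = truncnorm.rvs(-np.inf, (2.0 - 1.0) / 0.5, loc=1.0, scale=0.5,
                   size=200_000, random_state=0).mean()
```

This is the kind of conditional expectation an EM E-step needs for each censored value; with at most two censored observations per cluster, such low-dimensional integrals are cheap enough that simulation can be avoided entirely.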

2.
3.
Simulation smoothing involves drawing state variables (or innovations) in discrete time state-space models from their conditional distribution given parameters and observations. Gaussian simulation smoothing is of particular interest, not only for the direct analysis of Gaussian linear models, but also for the indirect analysis of more general models. Several methods for Gaussian simulation smoothing exist, most of which are based on the Kalman filter. Since states in Gaussian linear state-space models are Gaussian Markov random fields, it is also possible to apply the Cholesky Factor Algorithm (CFA) to draw states. This algorithm takes advantage of the band diagonal structure of the Hessian matrix of the log density to make efficient draws. We show how to exploit the special structure of state-space models to draw latent states even more efficiently. We analyse the computational efficiency of Kalman-filter-based methods, the CFA, and our new method using counts of operations and computational experiments. We show that for many important cases, our method is most efficient. Gains are particularly large for cases where the dimension of observed variables is large or where one makes repeated draws of states for the same parameter values. We apply our method to a multivariate Poisson model with time-varying intensities, which we use to analyse financial market transaction count data.
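The core CFA step, drawing a Gaussian vector specified by its precision matrix via a Cholesky factor, can be sketched as follows. A dense factorization stands in for the banded solver the algorithm actually exploits, and the AR(1) precision matrix is an illustrative example rather than the paper's model:

```python
import numpy as np

def draw_gmrf(Q, b, rng):
    """Draw x ~ N(Q^{-1} b, Q^{-1}) using the Cholesky factor of the precision Q.

    Dense linear algebra for clarity; the CFA exploits Q's band structure so
    the factor is itself banded and each step costs O(n) for state-space models.
    """
    L = np.linalg.cholesky(Q)            # Q = L L^T
    mu = np.linalg.solve(Q, b)           # mean of the Gaussian
    z = rng.standard_normal(Q.shape[0])
    v = np.linalg.solve(L.T, z)          # Cov(v) = L^{-T} L^{-1} = Q^{-1}
    return mu + v

# Tridiagonal precision of an AR(1) state sequence x_t = 0.8 x_{t-1} + e_t, x_1 ~ N(0, 1)
phi, n = 0.8, 4
Q = (1 + phi ** 2) * np.eye(n)
Q[-1, -1] = 1.0
Q[range(n - 1), range(1, n)] = Q[range(1, n), range(n - 1)] = -phi
```

Repeated draws for the same parameter values can reuse the factor `L`, which is exactly the regime where the paper reports the largest gains.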

4.
Nonlinear mixed-effects (NLME) models are widely used for longitudinal data analyses. Time-dependent covariates are often introduced to partially explain inter-individual variation. These covariates often have missing data, and the missingness may be nonignorable. Likelihood inference for NLME models with nonignorable missing data in time-varying covariates can be computationally very intensive and may even present difficulties such as nonconvergence. We propose a computationally efficient method for approximate likelihood inference. The method is illustrated using a real data example.

5.
The analysis of incomplete longitudinal data requires joint modeling of the longitudinal outcomes (observed and unobserved) and the response indicators. When non-response does not depend on the unobserved outcomes, within a likelihood framework, the missingness is said to be ignorable, obviating the need to formally model the process that drives it. For the non-ignorable or non-random case, estimation is less straightforward, because one must work with the observed data likelihood, which involves integration over the missing values, thereby giving rise to computational complexity, especially for high-dimensional missingness. The stochastic EM (SEM) algorithm is a variation of the expectation-maximization (EM) algorithm and is particularly useful in cases where the E (expectation) step is intractable. Under the stochastic EM algorithm, the E-step is replaced by an S-step, in which the missing data are simulated from an appropriate conditional distribution. The method is appealing due to its computational simplicity. The SEM algorithm is used to fit non-random models for continuous longitudinal data with monotone or non-monotone missingness, using simulated, as well as case study, data. Resulting SEM estimates are compared with their direct likelihood counterparts wherever possible.
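The S-step/M-step cycle can be illustrated on a toy problem: bivariate normal data with the second coordinate missing for a random half of the cases (a far simpler setting than the paper's longitudinal models, and with ignorable missingness purely for demonstration):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4000
# Simulated bivariate normal data; y2 is missing for a random half of the cases
mean_true = np.array([0.0, 1.0])
cov_true = np.array([[1.0, 0.6], [0.6, 1.0]])
y = rng.multivariate_normal(mean_true, cov_true, size=n)
miss = rng.random(n) < 0.5

mu, cov = np.zeros(2), np.eye(2)
trace = []
for it in range(300):
    # S-step: simulate each missing y2 from N(mu2 + b (y1 - mu1), s2) under current params
    b = cov[0, 1] / cov[0, 0]
    s2 = cov[1, 1] - b * cov[0, 1]
    y2 = y[:, 1].copy()
    y2[miss] = (mu[1] + b * (y[miss, 0] - mu[0])
                + np.sqrt(s2) * rng.standard_normal(miss.sum()))
    full = np.column_stack([y[:, 0], y2])
    # M-step: complete-data MLE (sample mean and covariance)
    mu = full.mean(axis=0)
    cov = np.cov(full.T, bias=True)
    if it >= 100:
        trace.append(mu.copy())

sem_estimate = np.mean(trace, axis=0)   # average iterates to smooth out simulation noise
```

Unlike deterministic EM, the SEM iterates do not converge to a point but fluctuate around the MLE, which is why the sketch averages the post-burn-in trace.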

6.
Quantile regression problems in practice may require flexible semiparametric forms of the predictor for modeling the dependence of responses on covariates. Furthermore, it is often necessary to add random effects accounting for overdispersion caused by unobserved heterogeneity or for correlation in longitudinal data. We present a unified approach for Bayesian quantile inference on continuous response via Markov chain Monte Carlo (MCMC) simulation and approximate inference using integrated nested Laplace approximations (INLA) in additive mixed models. Different types of covariates are all treated within the same general framework by assigning appropriate Gaussian Markov random field (GMRF) priors with different forms and degrees of smoothness. We applied the approach to extensive simulation studies and a Munich rental dataset, showing that the methods are also computationally efficient in problems with many covariates and large datasets.
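Bayesian quantile inference of this kind typically rests on the check loss (equivalently, an asymmetric Laplace working likelihood). A minimal sketch of the loss and its connection to sample quantiles, independent of the paper's specific models:

```python
import numpy as np

def check_loss(u, tau):
    """Quantile check loss rho_tau(u) = u * (tau - 1{u < 0}).
    Minimizing the empirical check loss yields the tau-th quantile."""
    return u * (tau - (u < 0))

# Grid minimization of the empirical check loss recovers the sample 90% quantile
rng = np.random.default_rng(0)
x = rng.standard_normal(5000)
grid = np.linspace(-3, 3, 1201)
loss = np.array([check_loss(x - g, 0.9).sum() for g in grid])
q90_hat = grid[np.argmin(loss)]
```

The asymmetric Laplace density is proportional to exp(-check_loss(u, tau)), which is what lets quantile regression be phrased as a likelihood problem amenable to MCMC or INLA.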

7.
MonteQueue is a new public-domain software package which rapidly solves large and small multiclass product-form queueing networks with multiple- and single-server stations over a wide range of traffic conditions. MonteQueue obtains estimates of performance measures by applying importance sampling to sum and integral representations of the network's normalization constants. This paper discusses implementation issues and surveys the theoretical properties of the four importance sampling techniques included in MonteQueue. It also presents new numerical data which compare the performance of the four techniques.
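The underlying idea, estimating a normalization constant by importance sampling, can be sketched on a one-dimensional toy integral. This is illustrative only and unrelated to MonteQueue's actual representations or proposal distributions:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
# Unnormalized integrand f(x) = exp(-x^2 / 2); its true integral is sqrt(2*pi)
f = lambda x: np.exp(-0.5 * x ** 2)
# Proposal q = N(0, 1.5^2), deliberately wider than the integrand so weights stay bounded
x = 1.5 * rng.standard_normal(200_000)
w = f(x) / norm.pdf(x, scale=1.5)       # importance weights f(x) / q(x)
Z_hat = w.mean()                        # unbiased estimate of the integral
```

The quality of such estimates hinges on how well the proposal matches the integrand, which is why a package like MonteQueue ships several importance sampling techniques tuned to different traffic regimes.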

8.
The statistical analysis of mixed effects models for binary and count data is investigated. In the statistical computing environment R, there are a few packages that estimate models of this kind. The package lme4 is a de facto standard for mixed effects models. The package glmmML allows non-normal distributions in the specification of random intercepts. It also allows for the estimation of a fixed effects model, assuming that all cluster intercepts are distinct fixed parameters; moreover, a bootstrapping technique is implemented to replace asymptotic analysis. The random intercepts model is fitted using a maximum likelihood estimator with adaptive Gauss-Hermite and Laplace quadrature approximations of the likelihood function. The fixed effects model is fitted through a profiling approach, which is necessary when the number of clusters is large. In a simulation study, the two approaches are compared. The fixed effects model has severe bias when the mixed effects variance is positive and the number of clusters is large.
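The Gauss-Hermite approximation such estimators rely on can be sketched for a single cluster of a random-intercept logistic model. This uses plain (non-adaptive) quadrature for illustration, in Python rather than R, and the function name is hypothetical:

```python
import numpy as np
from scipy.special import expit

def cluster_loglik(y, eta, sigma, n_quad=20):
    """Gauss-Hermite approximation of the random-intercept logistic likelihood
    for one cluster: log of integral over b of
    prod_j Bern(y_j | expit(eta_j + b)) * N(b; 0, sigma^2).
    """
    t, w = np.polynomial.hermite.hermgauss(n_quad)
    b = np.sqrt(2.0) * sigma * t                     # change of variables b = sqrt(2)*sigma*t
    p = expit(eta[:, None] + b[None, :])             # shape (n_obs, n_quad)
    lik = np.prod(np.where(y[:, None] == 1, p, 1.0 - p), axis=0)
    return np.log(np.sum(w * lik)) - 0.5 * np.log(np.pi)
```

Adaptive quadrature, as used in practice, recenters and rescales the nodes around the mode of each cluster's integrand, which matters when clusters are informative; the fixed-node version above conveys the structure.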

9.
Item response theory is one of the modern test theories with applications in educational and psychological testing. Recent developments made it possible to characterize some desired properties in terms of a collection of manifest ones, so that hypothesis tests on these traits can, in principle, be performed. But the existing test methodology is based on asymptotic approximation, which is impractical in most applications since the required sample sizes are often unrealistically huge. To overcome this problem, a class of tests is proposed for making exact statistical inference about four manifest properties: covariances given the sum are non-positive (CSN), manifest monotonicity (MM), conditional association (CA), and vanishing conditional dependence (VCD). One major advantage is that these exact tests do not require large sample sizes. As a result, tests for CSN and MM can be routinely performed in empirical studies. For testing CA and VCD, the exact methods are still impractical in most applications, due to the unusually large number of parameters to be tested. However, exact methods are still derived for them as an exploration toward practicality. Some numerical examples with applications of the exact tests for CSN and MM are provided.

10.
A model-based clustering method is proposed for clustering individuals on the basis of measurements taken over time. Data variability is taken into account through non-linear hierarchical models leading to a mixture of hierarchical models. We study both frequentist and Bayesian estimation procedures. From a classical viewpoint, we discuss maximum likelihood estimation of this family of models through the EM algorithm. From a Bayesian standpoint, we develop appropriate Markov chain Monte Carlo (MCMC) sampling schemes for the exploration of the target posterior distribution of parameters. The methods are illustrated with the identification of hormone trajectories that are likely to lead to adverse pregnancy outcomes in a group of pregnant women.

11.
While latent variable models have been successfully applied in many fields and underpin various modeling techniques, their ability to incorporate categorical responses is hindered due to the lack of accurate and efficient estimation methods. Approximation procedures, such as penalized quasi-likelihood, are computationally efficient, but the resulting estimators can be seriously biased for binary responses. Gauss-Hermite quadrature and Markov chain Monte Carlo (MCMC) integration based methods can yield more accurate estimation, but they are computationally much more intensive. Estimation methods that can achieve both computational efficiency and estimation accuracy are still under development. This paper proposes an efficient direct sampling based Monte Carlo EM algorithm (DSMCEM) for latent variable models with binary responses. Mixed effects and item factor analysis models with binary responses are used to illustrate this algorithm. Results from two simulation studies and a real data example suggest that, as compared with MCMC based EM, DSMCEM can significantly improve computational efficiency as well as produce equally accurate parameter estimates. Other aspects and extensions of the algorithm are discussed.

12.
Bayesian inference has commonly been performed on nonlinear mixed effects models. However, there is a lack of research into performing Bayesian optimal design for nonlinear mixed effects models, especially those that require searches to be performed over several design variables. This is likely due to the fact that it is much more computationally intensive to perform optimal experimental design for nonlinear mixed effects models than it is to perform inference in the Bayesian framework. Fully Bayesian experimental designs for nonlinear mixed effects models are presented, which involve the use of simulation-based optimal design methods to search over both continuous and discrete design spaces. The design problem is to determine the optimal number of subjects and samples per subject, as well as the (near) optimal urine sampling times for a population pharmacokinetic study in horses, so that the population pharmacokinetic parameters can be precisely estimated, subject to cost constraints. The optimal sampling strategies, in terms of the number of subjects and the number of samples per subject, were found to be substantially different between the examples considered in this work, which highlights the fact that the designs are rather problem-dependent and can be addressed using the methods presented.

13.
This article presents a robust identification approach for nonlinear errors-in-variables (EIV) systems contaminated with outliers. In this work, the measurement noise is modelled using the t-distribution, instead of the traditional Gaussian distribution, to mitigate the effect of the outliers. The heavier tails of the t-distribution, through the adjustable degrees of freedom, are used to account for noise and outliers concomitantly. Further, to avoid the intricacies related to direct nonlinear identification, we propose to approximate the nonlinear EIV dynamics using multiple local ARX models and aggregating them using an exponential weighting strategy. The parameters of the local models and weighting parameters are estimated using the expectation maximization (EM) algorithm, under the framework of maximum likelihood estimation (MLE). The studies with simulated numerical examples and an experiment on a multi-tank system demonstrate the superiority of the proposed method.
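The outlier-downweighting effect of t-distributed noise shows up directly in the EM weights. A minimal location-estimation sketch, in a much simpler setting than the article's ARX models (fixed degrees of freedom, illustrative function name):

```python
import numpy as np

def t_location(x, nu=4.0, n_iter=50):
    """EM for the location of a t-distributed sample.

    E-step weight (nu + 1) / (nu + r^2) shrinks toward zero for large
    standardized residuals r, so outliers barely influence the M-step.
    """
    mu, s2 = np.median(x), np.var(x)
    for _ in range(n_iter):
        r2 = (x - mu) ** 2 / s2
        w = (nu + 1.0) / (nu + r2)                # E-step: latent precision weights
        mu = np.sum(w * x) / np.sum(w)            # M-step: weighted location
        s2 = np.sum(w * (x - mu) ** 2) / len(x)   # M-step: weighted scale
    return mu

rng = np.random.default_rng(0)
x = np.concatenate([rng.standard_normal(200), np.full(20, 50.0)])  # 10% gross outliers
```

Under a Gaussian noise model every weight is 1 and the estimate collapses to the sample mean, which the outliers drag far from zero; the t-based weights keep the estimate near the bulk of the data.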

14.
Multi-level nonlinear mixed effects (ML-NLME) models have received a great deal of attention in recent years because of the flexibility they offer in handling the repeated-measures data arising from various disciplines. In this study, we propose both maximum likelihood and restricted maximum likelihood estimations of ML-NLME models with two-level random effects, using first order conditional expansion (FOCE) and the expectation–maximization (EM) algorithm. The FOCE–EM algorithm was compared with the most popular Lindstrom and Bates (LB) method in terms of computational and statistical properties. Basal area growth series data measured from Chinese fir (Cunninghamia lanceolata) experimental stands and simulated data were used for evaluation. The FOCE–EM and LB algorithms gave the same parameter estimates and fit statistics for models on which both converged. However, FOCE–EM converged for all the models, while LB did not, especially for models in which two-level random effects are simultaneously considered in several base parameters to account for between-group variation. We recommend the use of FOCE–EM in ML-NLME models, particularly when convergence is a concern in model selection.

15.
Generalized linear mixed models (GLMMs) have wide applications in practice. Similar to other data analyses, the identification of influential observations that may be potential outliers is an important step beyond estimation in GLMMs. Since the pioneering work of Cook in 1977, deletion measures have been applied to many statistical models for identifying influential observations. However, as this well-known approach is based on the observed-data likelihood, it is very difficult to apply it to developing diagnostic measures for GLMMs due to the complexity of the observed-data likelihood that involves multidimensional integrals. The objective of this article is to develop diagnostic measures for identifying influential observations. Deletion measures are developed on the basis of the conditional expectation of the complete-data log-likelihood at the E-step of a stochastic approximation Markov chain Monte Carlo algorithm. Making use of by-products of the estimation to compute building blocks of the proposed diagnostic measures and activating appropriate approximations, the proposed methods require little additional computation. The performance of the methods is illustrated by an artificial example, a real example, and some simulation studies.

16.
Markov chain Monte Carlo (MCMC) algorithms have greatly facilitated the popularity of Bayesian variable selection and model averaging in problems with high-dimensional covariates where enumeration of the model space is infeasible. A variety of such algorithms have been proposed in the literature for sampling models from the posterior distribution in Bayesian variable selection. Ghosh and Clyde proposed a method to exploit the properties of orthogonal design matrices. Their data augmentation algorithm scales up the computation tremendously compared to traditional Gibbs samplers, and leads to the availability of Rao-Blackwellized estimates of quantities of interest for the original non-orthogonal problem. The algorithm has excellent performance when the correlations among the columns of the design matrix are small, but empirical results suggest that moderate to strong multicollinearity leads to slow mixing. This motivates the need to develop a class of novel sandwich algorithms for Bayesian variable selection that improves upon the algorithm of Ghosh and Clyde. It is proved that the Haar algorithm with the largest group that acts on the space of models is the optimum algorithm, within the parameter expansion data augmentation (PXDA) class of sandwich algorithms. The result provides theoretical insight but using the largest group is computationally prohibitive so two new computationally viable sandwich algorithms are developed, which are inspired by the Haar algorithm, but do not necessarily belong to the class of PXDA algorithms. It is illustrated via simulation studies and real data analysis that several of the sandwich algorithms can offer substantial gains in the presence of multicollinearity.

17.
An Introduction to MCMC for Machine Learning
The purpose of this introductory paper is threefold. First, it introduces the Monte Carlo method with emphasis on probabilistic machine learning. Second, it reviews the main building blocks of modern Markov chain Monte Carlo simulation, thereby providing an introduction to the remaining papers of this special issue. Lastly, it discusses interesting new research horizons.
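One of the most basic building blocks such a review covers is the random-walk Metropolis sampler. A minimal sketch targeting a standard normal (the target and tuning here are illustrative, not from the paper):

```python
import numpy as np

def metropolis(logp, x0, n_samples, scale, rng):
    """Random-walk Metropolis: propose x' = x + scale * z with z ~ N(0, 1),
    accept with probability min(1, p(x') / p(x))."""
    x, lp = x0, logp(x0)
    out = np.empty(n_samples)
    for i in range(n_samples):
        x_prop = x + scale * rng.standard_normal()
        lp_prop = logp(x_prop)
        if np.log(rng.random()) < lp_prop - lp:   # accept/reject step
            x, lp = x_prop, lp_prop
        out[i] = x                                 # on rejection, repeat current state
    return out

rng = np.random.default_rng(0)
draws = metropolis(lambda x: -0.5 * x ** 2, 0.0, 20_000, 2.4, rng)[2_000:]  # drop burn-in
```

Because only the ratio p(x')/p(x) is needed, the target density can be known up to a normalizing constant, which is the property that makes MCMC so widely applicable in Bayesian inference.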

18.
A simple test for threshold nonlinearity in either the mean or volatility equation, or both, of a heteroskedastic time series model is proposed. The procedure extends current Bayesian Markov chain Monte Carlo methods and threshold modelling by employing a general double threshold GARCH model that allows for an explosive, non-stationary regime. Posterior credible intervals on model parameters are used to detect and specify threshold nonlinearity in the mean and/or volatility equations. Simulation experiments demonstrate that the method works favorably in identifying model specifications varying in complexity from the conventional GARCH up to the full double-threshold nonlinear GARCH model with an explosive regime, and is robust to over-specification in model orders.
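A double-threshold GARCH of the general kind described can be simulated as follows. Parameter values and the zero threshold are illustrative; the paper's exact specification (including the explosive regime) may differ:

```python
import numpy as np

def simulate_dt_garch(n, lo, hi, rng):
    """Simulate a double-threshold GARCH(1,1): both the AR mean parameter and
    the volatility parameters switch on the sign of y_{t-1}.
    Each regime's parameters are a tuple (phi, omega, alpha, beta)."""
    y, h = np.zeros(n), np.ones(n)
    for t in range(1, n):
        phi, omega, alpha, beta = lo if y[t - 1] <= 0 else hi
        h[t] = omega + alpha * y[t - 1] ** 2 + beta * h[t - 1]     # regime volatility
        y[t] = phi * y[t - 1] + np.sqrt(h[t]) * rng.standard_normal()  # regime mean
    return y, h

rng = np.random.default_rng(0)
y, h = simulate_dt_garch(2_000, (0.1, 0.05, 0.10, 0.85), (-0.1, 0.10, 0.15, 0.70), rng)
```

In the Bayesian testing procedure, posterior credible intervals for the differences between the regime parameters indicate whether the threshold nonlinearity is present in the mean, the volatility, or both.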

19.
In this paper we use Markov chain Monte Carlo (MCMC) methods in order to estimate and compare GARCH models from a Bayesian perspective. We allow for possibly heavy tailed and asymmetric distributions in the error term. We use a general method proposed in the literature to introduce skewness into a continuous unimodal and symmetric distribution. For each model we compute an approximation to the marginal likelihood, based on the MCMC output. From these approximations we compute Bayes factors and posterior model probabilities.
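A widely used construction of this kind is the two-piece skewing method of Fernandez and Steel; a sketch applied to a Student-t error density, assuming that method (the abstract does not name the specific construction):

```python
import numpy as np
from scipy.stats import t as t_dist

def skewed_pdf(x, f, gamma):
    """Two-piece skewing of a symmetric density f:
    f_gamma(x) = 2 / (gamma + 1/gamma) * [ f(x/gamma) if x >= 0 else f(gamma*x) ].
    gamma > 1 skews right, gamma < 1 skews left, gamma = 1 recovers f."""
    c = 2.0 / (gamma + 1.0 / gamma)
    return c * np.where(x >= 0, f(x / gamma), f(gamma * x))

# Right-skewed Student-t error density with 5 degrees of freedom
x = np.linspace(-15, 15, 20_001)
pdf = skewed_pdf(x, lambda u: t_dist.pdf(u, df=5), gamma=1.5)
```

The construction inflates the scale on one side of the mode and deflates it on the other, so the result stays a proper density while gaining a skewness parameter that can be estimated alongside the GARCH parameters.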

20.
Learning spatial models from sensor data raises the challenging data association problem of relating model parameters to individual measurements. This paper proposes an EM-based algorithm, which solves the model learning and the data association problem in parallel. The algorithm is developed in the context of the structure-from-motion problem, which is the problem of estimating a 3D scene model from a collection of image data. To accommodate the spatial constraints in this domain, we compute virtual measurements as sufficient statistics to be used in the M-step. We develop an efficient Markov chain Monte Carlo sampling method called chain flipping, to calculate these statistics in the E-step. Experimental results show that we can solve hard data association problems when learning models of 3D scenes, and that we can do so efficiently. We conjecture that this approach can be applied to a broad range of model learning problems from sensor data, such as the robot mapping problem.


Copyright © Beijing Qinyun Technology Development Co., Ltd.  京ICP备09084417号