Similar articles
 Found 20 similar articles (search time: 11 ms)
1.
Generalized linear mixed models (GLMMs) have wide applications in practice. Similar to other data analyses, the identification of influential observations that may be potential outliers is an important step beyond estimation in GLMMs. Since the pioneering work of Cook in 1977, deletion measures have been applied to many statistical models for identifying influential observations. However, as this well-known approach is based on the observed-data likelihood, it is very difficult to apply it to developing diagnostic measures for GLMMs due to the complexity of the observed-data likelihood that involves multidimensional integrals. The objective of this article is to develop diagnostic measures for identifying influential observations. Deletion measures are developed on the basis of the conditional expectation of the complete-data log-likelihood at the E-step of a stochastic approximation Markov chain Monte Carlo algorithm. By making use of by-products of the estimation to compute building blocks of the proposed diagnostic measures and applying appropriate approximations, the proposed methods require little additional computation. The performance of the methods is illustrated by an artificial example, a real example, and some simulation studies.
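
For orientation, the classical deletion measure that this abstract builds on can be sketched for the simplest case. The block below computes Cook's distance for an ordinary simple linear regression using the standard closed form; it is not the complete-data, E-step-based measure proposed for GLMMs, and the function name and toy data are illustrative assumptions.

```python
def cooks_distances(x, y):
    """Cook's distance for each case in a simple linear regression y ~ a + b*x.

    Uses the closed form D_i = e_i^2 / (p * s^2) * h_i / (1 - h_i)^2 with p = 2,
    so no model refit is needed for each deleted case.
    """
    n = len(x)
    p = 2  # intercept and slope
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    s2 = sum(e * e for e in residuals) / (n - p)  # residual variance estimate
    distances = []
    for xi, e in zip(x, residuals):
        leverage = 1.0 / n + (xi - x_bar) ** 2 / s_xx
        distances.append(e * e / (p * s2) * leverage / (1.0 - leverage) ** 2)
    return distances


# Hypothetical toy data: the last case has high leverage and breaks the trend.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 10.0]
y = [1.0, 2.0, 3.0, 4.0, 5.0, 20.0]
d = cooks_distances(x, y)
most_influential = d.index(max(d))
```

In this toy data the high-leverage last case receives by far the largest distance, which is the kind of flag a deletion diagnostic is meant to raise.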

2.
In this paper, we derive a small sample Akaike information criterion, based on the maximized loglikelihood, and a small sample information criterion based on the maximized restricted loglikelihood in the linear mixed effects model when the covariance matrix of the random effects is known. Small sample corrected information criteria are proposed for a special case of linear mixed effects models, the balanced random-coefficient model, without assuming the random coefficients covariance matrix to be known. A simulation study comparing the derived criteria and several others for model selection in the linear mixed effects models is presented. We illustrate the behavior of the studied information criteria on real data from a study of subjects coinfected with HIV and Hepatitis C virus. Robustness of the criteria, in terms of the error distributed as a mixture of normal distributions, is also studied. Special attention is given to the behavior of the conditional AIC (cAIC) of Vaida and Blanchard (2005). Among the studied criteria, GIC performs best, while cAIC exhibits poor performance. Because of this inferior performance, as demonstrated in this work, we do not recommend cAIC for model selection in linear mixed effects models.

3.
Flexible modelling of random effects in linear mixed models has attracted some attention recently. In this paper, we propose the use of finite Gaussian mixtures as in Verbeke and Lesaffre [A linear mixed model with heterogeneity in the random-effects population, J. Amer. Statist. Assoc. 91, 217-221]. We adopt a fully Bayesian hierarchical framework that allows simultaneous estimation of the number of mixture components together with other model parameters. The technique employed is the Reversible Jump MCMC algorithm (Richardson and Green [On Bayesian Analysis of Mixtures with an Unknown Number of Components (with discussion). J. Roy. Statist. Soc. Ser. B 59, 731-792]). This approach has the advantage of producing a direct comparison of different mixture models through posterior probabilities from a single run of the MCMC algorithm. Moreover, the Bayesian setting allows us to integrate over different mixture models to obtain a more robust density estimate of the random effects. We focus on linear mixed models with a random intercept and a random slope. Numerical results on simulated data sets and a real data set are provided to demonstrate the usefulness of the proposed method.

4.
The penalized calibration technique in survey sampling combines usual calibration and soft calibration by introducing a penalty term. Certain relevant estimates in survey sampling can be considered as penalized calibration estimates obtained as particular cases from an optimization problem with a common basic structure. In this framework, a case deletion diagnostic is proposed for a class of penalized calibration estimators including both design-based and model-based estimators. The diagnostic compares finite population parameter estimates and can be calculated from quantities related to the full data set. The resulting diagnostic is a function of the residual and leverage, as other diagnostics in regression models, and of the calibration weight, a singular feature in survey sampling. Moreover, a particular case, which includes the basic unit level model for small area estimation, is considered. Both a real and an artificial example are included to illustrate the diagnostic proposed. The results obtained clearly show that the proposed diagnostic depends on the calibration and soft-calibration variables, on the penalization term, as well as on the parameter to estimate.

5.
A variance shift outlier model (VSOM), previously used for detecting outliers in the linear model, is extended to the variance components model. This VSOM accommodates outliers as observations with inflated variance, with the status of the ith observation as an outlier indicated by the size of the associated shift in the variance. Likelihood ratio and score test statistics are assessed as objective measures for determining whether the ith observation has inflated variance and is therefore an outlier. It is shown that standard asymptotic distributions do not apply to these tests for a VSOM, and a modified distribution is proposed. A parametric bootstrap procedure is proposed to account for multiple testing. The VSOM framework is extended to account for outliers in random effects and is shown to have an advantage over case-deletion approaches. A simulation study is presented to verify the performance of the proposed tests. Challenges associated with computation and extensions of the VSOM to the general linear mixed model with correlated errors are discussed.

6.
Transformation of the response of a linear model is a popular method in practice when attempting to satisfy the assumptions of the model. Environmental research routinely uses log-transformations due to the nature of the observed data. The choice of the transformation is often made based upon previous experience or on the comparison of models with different transformed responses. Often a transformation parameter is estimated when fitting a model to a set of data. However, in practice interpretability becomes an issue, as it is only desired to know if a particular transformation is appropriate. Thus, inference tools for a hypothesized value of the transformation, such as the log-transformation in environmental exposure models, have their merit. An examination of hypothesis tests of the transformation parameter in the general linear mixed model will be beneficial due to its practical applications, particularly for areas of environmental research. The effect of outliers on inference about the transformation parameter is also studied.

7.
Non-Gaussian spatial data are common in many sciences such as environmental sciences, biology and epidemiology. Spatial generalized linear mixed models (SGLMMs) are flexible models for modeling these types of data. Maximum likelihood estimation in SGLMMs is usually made cumbersome due to the high-dimensional intractable integrals involved in the likelihood function and therefore the most commonly used approach for estimating SGLMMs is based on the Bayesian approach. This paper proposes a computationally efficient strategy to fit SGLMMs based on the data cloning (DC) method suggested by Lele et al. (2007). This method uses Markov chain Monte Carlo simulations from an artificially constructed distribution to calculate the maximum likelihood estimates and their standard errors. In this paper, the DC method is adapted and generalized to estimate SGLMMs and some of its asymptotic properties are explored. Performance of the method is illustrated by a set of simulated binary and Poisson count data and also data about car accidents in Mashhad, Iran. The focus is inference in SGLMMs for small and medium data sets.

8.
In this study, a model identification instrument to determine the variance component structure for generalized linear mixed models (GLMMs) is developed based on the conditional Akaike information (CAI). In particular, an asymptotically unbiased estimator of the CAI (denoted as CAICC) is derived as the model selection criterion which takes the estimation uncertainty in the variance component parameters into consideration. The relationship between bias correction and generalized degrees of freedom for GLMMs is also explored. Simulation results show that the estimator performs well. The proposed criterion demonstrates a high proportion of correct model identification for GLMMs. Two sets of real data (epilepsy seizure count data and polio incidence data) are used to illustrate the proposed model identification method.

9.
The choice of generalized linear mixed models is difficult, because it involves the selection of both fixed and random effects. Classical criteria like Akaike’s information criterion (AIC) are often not suitable for the latter task, and others which are useful in linear mixed models are difficult to extend to the generalized case, especially for overdispersed data. A predictive leave-one-out cross-validation approach is suggested that can be applied for choosing both fixed and random effects, even in models with overdispersion, and is based on proper scoring rules. An attractive feature of this approach is the fact that the model has to be fitted just once to the data set, which makes computations fast and convenient. As the calculation of the leave-one-out predictive distribution is not possible analytically, it is shown how an iteratively weighted least squares algorithm combined with some analytic approximations can be used for this task. A simulation study and two applications of the methodology to binary and count data are provided, as well as comparisons with two other methods.

10.
The joint segmentation of multiple series is considered. A mixed linear model is used to account for both covariates and correlations between signals. An estimation algorithm based on EM which involves a new dynamic programming strategy for the segmentation step is proposed. The computational efficiency of this procedure is shown and its performance is assessed through simulation experiments. Applications are presented in the field of climatic data analysis.

11.
Generalized linear mixed models (GLMMs) are useful for modelling longitudinal and clustered data, but parameter estimation is very challenging because the likelihood may involve high-dimensional integrals that are analytically intractable. Gauss-Hermite quadrature (GHQ) approximation can be applied but is only suitable for low-dimensional random effects. Based on the Quasi-Monte Carlo (QMC) approximation, a heuristic approach is proposed to calculate the maximum likelihood estimates of parameters in the GLMM. The QMC points scattered uniformly on the high-dimensional integration domain are generated to replace the GHQ nodes. Compared to the GHQ approximation, the proposed method has many advantages such as its affordable computation, good approximation and fast convergence. Comparisons to the penalized quasi-likelihood estimation and Gibbs sampling are made using a real dataset and a simulation study. The real dataset is the salamander mating dataset whose modelling involves six 20-dimensional intractable integrals in the likelihood.
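
The idea of replacing quadrature nodes with uniformly scattered QMC points can be illustrated in one dimension. The sketch below integrates a logistic success probability over a standard normal random intercept, using a base-2 van der Corput sequence mapped through the normal quantile function. It is a minimal illustration only, not the heuristic of the abstract: the random-intercept logistic model, the choice of sequence, and all function names are assumptions, and the abstract's setting is high-dimensional.

```python
import math
from statistics import NormalDist


def van_der_corput(index, base=2):
    """Return the index-th point (index >= 1) of the van der Corput sequence in (0, 1)."""
    value, denom = 0.0, 1.0
    while index > 0:
        index, digit = divmod(index, base)
        denom *= base
        value += digit / denom
    return value


def expit(x):
    """Inverse logit link."""
    return 1.0 / (1.0 + math.exp(-x))


def marginal_success_probability(beta, sigma, n_points=1000):
    """Approximate the integral of expit(beta + sigma*b) against the N(0, 1) density of b.

    Low-discrepancy uniforms are pushed through the normal quantile function,
    then the integrand is averaged with equal weights.
    """
    std_normal = NormalDist()
    total = 0.0
    for i in range(1, n_points + 1):
        b = std_normal.inv_cdf(van_der_corput(i))
        total += expit(beta + sigma * b)
    return total / n_points


# With beta = 0 the integrand is symmetric about 1/2, so the exact value is 0.5.
estimate = marginal_success_probability(beta=0.0, sigma=1.0)
```

Equal-weight averaging is what makes the approach extend to many dimensions: the number of points is chosen freely, whereas a tensor-product quadrature grid grows exponentially with the dimension of the random effects.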

12.
Multi-level nonlinear mixed effects (ML-NLME) models have received a great deal of attention in recent years because of the flexibility they offer in handling the repeated-measures data arising from various disciplines. In this study, we propose both maximum likelihood and restricted maximum likelihood estimations of ML-NLME models with two-level random effects, using first-order conditional expansion (FOCE) and the expectation–maximization (EM) algorithm. The FOCE–EM algorithm was compared with the most popular Lindstrom and Bates (LB) method in terms of computational and statistical properties. Basal area growth series data measured from Chinese fir (Cunninghamia lanceolata) experimental stands and simulated data were used for evaluation. The FOCE–EM and LB algorithms gave the same parameter estimates and fit statistics for models on which both converged. However, FOCE–EM converged for all the models, while LB did not, especially for the models in which two-level random effects are simultaneously considered in several base parameters to account for between-group variation. We recommend the use of FOCE–EM in ML-NLME models, particularly when convergence is a concern in model selection.

13.
Implementing the Monte Carlo EM (MCEM) algorithm for finding maximum likelihood estimates (MLEs) in the nonlinear mixed effects model (NLMM) has encountered a great deal of difficulty in obtaining samples used for estimating the E step due to the intractability of the target distribution. Sampling methods such as Markov chain techniques and importance sampling have been used to alleviate such difficulty. The advantage of Markov chains is that they are applicable to a wider range of distributions than the approaches based on independent samples. However, in many cases the computational cost of Markov chains is significantly greater than that of independent samplers. The MCEM algorithms based on independent samples allow for straightforward assessment of Monte Carlo error and can be considerably more efficient than those based on Markov chains when an efficient candidate distribution is chosen, which forms the motivation of this paper. The proposed MCEM algorithm in this paper uses samples obtained from an easy-to-simulate and efficient importance distribution so that the computational intensity and complexity are much reduced. Moreover, the proposed MCEM algorithm preserves the flexibility introduced by independent samples in gauging Monte Carlo error and thus allows the Monte Carlo sample size to increase with the number of EM iterations. We also introduce an EM algorithm using Gaussian quadrature approximations (GQEM) for the E step. In low-dimensional cases, the GQEM algorithm is more efficient than the proposed MCEM algorithm and thus can be used as an alternative. The performances of the proposed EM methods are compared to the existing ML estimators using real data examples and simulations.

14.
15.
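Buffer generation for line objects is the foundation and key step of buffer analysis. Combining the strengths of raster and vector algorithms, a hybrid vector-raster algorithm is proposed to solve the buffer generation problem for line objects. The Douglas-Peucker method is used to resample the line objects in order to speed up buffer construction; a scan-line method converts the vector data of the line objects into raster form; the buffer is then generated using the dilation principle. Finally, the raster boundary of the buffer is scanned to extract the valid vector data, intersection operations are performed, and self-intersecting polygons arising during buffer generation are handled.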
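
The resampling step named in this abstract can be sketched as follows. This is a minimal recursive version of the Douglas-Peucker line simplification, assuming 2-D points given as tuples and a distance tolerance in the same units. The function names and the sample polyline are illustrative, and the rasterization, dilation and boundary-extraction stages of the hybrid algorithm are not shown.

```python
import math


def point_segment_distance(p, a, b):
    """Euclidean distance from point p to the segment from a to b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:  # degenerate segment: a and b coincide
        return math.hypot(px - ax, py - ay)
    # Parameter of the orthogonal projection, clamped onto the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def douglas_peucker(points, tolerance):
    """Simplify a polyline, dropping vertices closer than `tolerance` to the kept shape."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    max_dist, index = -1.0, 0
    for i in range(1, len(points) - 1):
        d = point_segment_distance(points[i], first, last)
        if d > max_dist:
            max_dist, index = d, i
    if max_dist <= tolerance:
        return [first, last]  # every interior vertex is within tolerance
    # Keep the farthest vertex and simplify both halves recursively.
    left = douglas_peucker(points[: index + 1], tolerance)
    right = douglas_peucker(points[index:], tolerance)
    return left[:-1] + right  # drop the duplicated split vertex


# Hypothetical polyline: a sharp peak at (2, 2) with two near-redundant vertices.
line = [(0.0, 0.0), (1.0, 0.01), (2.0, 2.0), (3.0, 0.01), (4.0, 0.0)]
simplified = douglas_peucker(line, tolerance=1.0)
```

Fewer vertices mean fewer segments to rasterize, which is presumably why the abstract applies the simplification before the scan-line conversion.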

16.
In this paper, we give a polynomial algorithm to compute the infinite structure of a structured system. A directed graph is associated with such structured systems. The infinite zero orders can be computed on the associated graph via the determination of the minimal length of vertex disjoint input-output paths. This search corresponds to a minimum cost flow determination on an appropriate directed graph. The proposed algorithm is based on the primal-dual algorithm linked to linear programming. This polynomial algorithm is one of the most efficient for this type of problem. Moreover, it allows an iterative determination of the generic infinite structure which is a key tool for solving numerous control problems.

17.
An experimental design is called adaptive if the explanatory variables are chosen successively and at a fixed time the choice may be influenced by the results of the experiments up to that time. Adaptive designs are advantageous in nonlinear problems, when a good or optimal design depends on the true value of the unknown parameters, to achieve an asymptotically optimal design, but also in linear settings. For the latter case we propose a one-step adaptive design which is locally optimal with respect to all Φ_p-criteria, p ≥ 1, and globally superior to nonadaptive designs with respect to the A-criterion.

18.
19.
The optimal table row and column ordering can reveal useful patterns to improve reading and interpretation. Recently, genetic algorithms using standard crossover and mutation operators have been proposed to tackle this problem. In this paper, we carry out an experimental study that adds to this genetic algorithm crossover and mutation operators specially designed to deal with permutations and includes other parameters (initialization, replacement policy, mutation and crossover rates and stopping criteria) not examined in previous works. A proper analysis of the results must take into account all the parameters simultaneously, since wrong conclusions can be drawn by studying each separately from the others. This is why we propose a framework for a multidimensional analysis of the results. This includes multiple hypothesis testing and a regression tree that builds a parsimonious and predictive model of the suitable configurations of the parameters.

20.
Monitoring on-line data to detect a change point as early as possible is an important issue. It is shown that the existing CUSUM test is slow to raise an alarm when the change point does not occur at the early stage of monitoring. In this paper we propose a set of new monitoring procedures to detect changes in the coefficients and error variance of linear regression models. Our proposed modification, which uses a bandwidth parameter to change the beginning time of monitoring, can detect a change point more quickly even if it occurs after a relatively long monitoring time. Simulations suggest that, compared with the CUSUM test, the modified procedures have the same null distribution but higher power and a shorter average run length. In particular, we illustrate the effectiveness of our procedures using IBM stock data and Thailand/U.S. foreign exchange rate data.
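
As background for the CUSUM idea referred to here, the following is a minimal sketch of a textbook two-sided tabular CUSUM for a shift in the mean of a data stream. It is not the regression-monitoring procedure of the abstract and has no bandwidth parameter; the function name, the slack and threshold values, and the toy series are assumptions chosen only for illustration.

```python
def cusum_alarm(values, target_mean, slack, threshold):
    """Return the index of the first alarm of a two-sided tabular CUSUM, or None.

    `slack` is the allowance subtracted at every step and `threshold` is the
    decision limit on the accumulated deviation.
    """
    upper = lower = 0.0
    for t, x in enumerate(values):
        # Accumulate deviations beyond the slack, resetting at zero.
        upper = max(0.0, upper + (x - target_mean - slack))
        lower = max(0.0, lower + (target_mean - x - slack))
        if upper > threshold or lower > threshold:
            return t
    return None


# Hypothetical stream: the mean shifts from 0 to 1 at index 10.
stream = [0.0] * 10 + [1.0] * 10
alarm_index = cusum_alarm(stream, target_mean=0.0, slack=0.25, threshold=2.0)
no_alarm = cusum_alarm([0.0] * 20, target_mean=0.0, slack=0.25, threshold=2.0)
```

In this toy stream the statistic grows by 0.75 per step after the shift and first exceeds the threshold of 2.0 at index 12, so the alarm comes two observations after the change.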
