Similar literature
1.
Regression models are proposed for the joint analysis of Poisson and continuous longitudinal data with nonignorable missing values under a fully parametric framework. Our primary interest is to evaluate the influence of the covariates on both the Poisson and the continuous responses. First, we form the full likelihood with complete data using the multivariate Poisson model and the conditional multivariate normal distribution, and then construct an ECM algorithm to find the maximum likelihood estimates of the model parameters. Then, under the assumption that the missingness mechanisms for the two responses are independent but nonignorable, namely, dependent on both observed and missing data of the two responses, we choose the logit model for the missingness mechanisms and the selection model for the full likelihood. We also build two implementations of the Monte Carlo EM algorithm for estimating the parameters in the model. A Wald test is employed to test the significance of covariates. Finally, we present the results of a Monte Carlo simulation to evaluate the performance of the proposed methodology and an application to the interstitial cystitis data base (ICDB) cohort study. To the best of our knowledge, our model is the first parametric model for the joint analysis of Poisson and continuous longitudinal data with nonignorable missing values.

2.
ODMixed is a computer program for obtaining optimal designs for linear mixed models of longitudinal studies. These designs account for heterogeneous correlated errors and for data with dropout. Designs are compared using relative efficiencies, e.g., between a D-optimal design for homogeneous data and another for heterogeneous data, or between a D-optimal design for complete data and another optimized for data that are missing at random. Two examples are worked out to illustrate how researchers could use this program to benefit from optimal design theory at the planning stage of longitudinal studies.

3.
Imputation of missing links and attributes in longitudinal social surveys (total citations: 1; self-citations: 0; citations by others: 1)
The predictive analysis of longitudinal social surveys is highly sensitive to the effects of missing data in temporal observations. Such high sensitivity to missing values raises the need for accurate data imputation, because without it a large fraction of collected data could not be used properly. Previous studies focused on the treatment of missing data in longitudinal social networks due to non-respondents and dealt with the problem largely by imputing missing links in isolation or analyzing the imputation effects on network statistics. We propose to account for changing network topology and interdependence between actors’ links and attributes to construct a unified approach for imputation of links and attributes in longitudinal social surveys. The new method, based on an exponential random graph model, is evaluated experimentally for five scenarios of missing data models utilizing synthetic and real life datasets with 20%–60% of nodes missing. The obtained results outperformed all alternatives, four of which were link imputation methods and two node attribute imputation methods. We further discuss the applicability and scalability of our approach to real life problems and compare our model with the latest advancements in the field. Our findings suggest that the proposed method can be used as a viable imputation tool in longitudinal studies.

4.
This paper proposes the random subspace binary logit (RSBL) model (or random subspace binary logistic regression analysis), which takes the random subspace approach and uses the classical logit model to generate a group of diverse logit decision agents from various perspectives for predictive problems. These diverse logit models are then combined for a more accurate analysis. The proposed RSBL model takes advantage of both the logit (or logistic regression) and the random subspace approaches. The random subspace approach generates diverse sets of variables to represent the current problem as different masks. Different logit decision agents are constructed from these masks, instead of a single logit model. To verify its performance, we used the proposed RSBL model to forecast corporate failure in China. The results indicate that this model significantly improves on the predictive ability of classical statistical models such as multivariate discriminant analysis, the logit model, and the probit model. Thus, the proposed model should make the logit model more suitable for predictive problems in academic and industrial uses.
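The random subspace idea described in this abstract can be sketched in a few lines. The following is only an illustration under stated assumptions, not the authors' RSBL implementation: the function names, the gradient-ascent fitting routine, the toy data generator, and the choice to combine agents by averaging their predicted probabilities are all hypothetical.

```python
import math
import random


def fit_logit(X, y, lr=0.5, epochs=100):
    """Fit a logistic regression by batch gradient ascent; returns [bias, w1, ...]."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for row, target in zip(X, y):
            err = target - predict_proba(w, row)
            grad[0] += err
            for j, xj in enumerate(row):
                grad[j + 1] += err * xj
        w = [wj + lr * gj / len(X) for wj, gj in zip(w, grad)]
    return w


def predict_proba(w, row):
    """Probability of class 1 under the logit model with weights w."""
    z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], row))
    return 1.0 / (1.0 + math.exp(-z))


def fit_random_subspace_logit(X, y, n_models=15, subspace_size=2, seed=0):
    """Fit one logit agent per random subset ("mask") of the variables."""
    rng = random.Random(seed)
    ensemble = []
    for _ in range(n_models):
        cols = sorted(rng.sample(range(len(X[0])), subspace_size))
        X_sub = [[row[c] for c in cols] for row in X]
        ensemble.append((cols, fit_logit(X_sub, y)))
    return ensemble


def ensemble_predict(ensemble, row):
    """Combine agents by averaging probabilities (an assumed combination rule)."""
    probs = [predict_proba(w, [row[c] for c in cols]) for cols, w in ensemble]
    return 1 if sum(probs) / len(probs) >= 0.5 else 0


def make_toy_data(n, seed):
    """Hypothetical data: four Gaussian features, label = sign of their sum."""
    rng = random.Random(seed)
    X = [[rng.gauss(0.0, 1.0) for _ in range(4)] for _ in range(n)]
    y = [1 if sum(row) > 0 else 0 for row in X]
    return X, y
```

Calling `fit_random_subspace_logit(*make_toy_data(200, seed=1))` returns a list of `(columns, weights)` pairs, one per agent; `ensemble_predict` then classifies a new row from the averaged probabilities.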

5.
Functional PLS logit regression model (total citations: 1; self-citations: 0; citations by others: 1)
Functional logistic regression has been developed to forecast a binary response variable from a functional predictor. To fit this model, it is usual to assume that the functional observations and the parameter function of the model belong to the same finite space generated by a basis of functions. This assumption turns the functional model into a multiple logit model whose design matrix is the product of the matrix of basis coefficients of the sample paths and the matrix of inner products between basis functions. The likelihood estimation of the parameter function of this model is very inaccurate because of the strong dependence structure of the resulting design matrix (multicollinearity). Several approaches have been proposed to overcome this drawback. These employ standard multivariate data analysis methods on the design matrix, as in the functional principal component logistic regression model. As an alternative, a functional partial least squares logit regression model is proposed, which has as covariates a set of partial least squares components of the design matrix of the multiple logit model associated with the functional one.

6.
The relationship between time evolution of stress and flares in Systemic Lupus Erythematosus patients has recently been studied. Daily stress data can be considered as observations of a single variable for a subject, carried out repeatedly at different time points (functional data). In this study, we propose a functional logistic regression model with the aim of predicting the probability of lupus flare (binary response variable) from a functional predictor variable (stress level). This method differs from the classical approach, in which longitudinal data are considered as observations of different correlated variables. The estimation of this functional model may be inaccurate due to multicollinearity, and so a principal component based solution is proposed. In addition, a new interpretation is made of the parameter function of the model, which enables the relationship between the response and the predictor variables to be evaluated. Finally, the results provided by different logit approaches (functional and longitudinal) are compared, using a sample of Lupus patients.

8.
OSWALD (Object-oriented Software for the Analysis of Longitudinal Data) is flexible and powerful software written for S-PLUS for the analysis of longitudinal data with dropout for which there is little other software available in the public domain. The implementation of OSWALD is described through analysis of a psychiatric clinical trial that compares antidepressant effects in an elderly depressed sample and a simulation study. In the simulation study, three different dropout mechanisms: completely random dropout (CRD), random dropout (RD) and informative dropout (ID), are considered and the results from using OSWALD are compared across mechanisms. The parameter estimates for ID-simulated data show less bias with OSWALD under the ID missing data assumption than under the CRD or RD assumptions. Under an ID mechanism, OSWALD does not provide standard error estimates. We supplement OSWALD with a bootstrap procedure to derive the standard errors. This report illustrates the usage of OSWALD for analyzing longitudinal data with dropouts and how to draw appropriate conclusions based on the analytic results under different assumptions regarding the dropout mechanism.
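The abstract does not say how the bootstrap is set up, so the following is only a generic sketch of a bootstrap standard error, written in Python rather than S-PLUS and unrelated to OSWALD's own code. It assumes that whole subjects (not single visits) are resampled, which is a common choice for longitudinal data; `bootstrap_se` and `pooled_mean` are hypothetical names.

```python
import random
import statistics


def bootstrap_se(subjects, estimator, n_boot=2000, seed=0):
    """Bootstrap standard error of `estimator`, resampling whole subjects.

    `subjects` is a list in which each element holds one subject's repeated
    measurements; resampling at the subject level keeps the within-subject
    correlation intact (an assumption of this sketch).
    """
    rng = random.Random(seed)
    n = len(subjects)
    replicates = []
    for _ in range(n_boot):
        resample = [subjects[rng.randrange(n)] for _ in range(n)]
        replicates.append(estimator(resample))
    # The spread of the replicated estimates approximates the standard error.
    return statistics.stdev(replicates)


def pooled_mean(subjects):
    """Example estimator: mean over all measurements of all subjects."""
    values = [v for subject in subjects for v in subject]
    return statistics.fmean(values)
```

Any function that maps a list of subjects to a single number can be passed as `estimator`, for instance a wrapper that refits a dropout model on the resampled data and returns one coefficient.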

9.
Several new estimators of the marginal likelihood for complex non-Gaussian models are developed. These estimators make use of the output of auxiliary mixture sampling for count data and for binary and multinomial data. One of these estimators is based on combining Chib’s estimator with data augmentation as in auxiliary mixture sampling, while the other estimators are importance sampling and bridge sampling based on constructing an unsupervised importance density from the output of auxiliary mixture sampling. These estimators are applied to a logit regression model, to a Poisson regression model, to a binomial model with random intercept, as well as to state space modeling of count data.

10.
A new missing data algorithm ARFIL gives good results in spectral estimation. The log likelihood of a multivariate Gaussian random variable can always be written as a sum of conditional log likelihoods. For a complete set of autoregressive AR(p) data the best predictor in the likelihood requires only p previous observations. If observations are missing, the best AR predictor in the likelihood will in general include all previous observations. Using only those observations that fall within a finite time interval will approximate this likelihood. The resulting non-linear estimation algorithm requires no user provided starting values. In various simulations, the spectral accuracy of robust maximum likelihood methods was much better than the accuracy of other spectral estimates for randomly missing data.

11.
Time series of discrete random variables present unique statistical challenges due to serial correlation and uneven sampling intervals. While regression models for a series of counts are well developed, only a few methods are discussed for the analysis of moderate to long (e.g. from 20 to 152 observations) binary or binomial time series. This article suggests generalized linear mixed models with autocorrelated random effects for a parameter-driven approach to such series. We use a Monte Carlo EM algorithm to jointly obtain maximum likelihood estimates of regression parameters and variance components. The likelihood approach, although computationally intensive, allows estimation of marginal joint probabilities of two or more serial events. These are crucial for checking goodness-of-fit, for assessing whether the model adequately captures the serial correlation, and for predicting future responses. The model is flexible enough to allow for missing observations or unequally spaced time intervals. We illustrate our approach and model assessment tools with an analysis of the series of winners in the traditional boat race between the universities of Oxford and Cambridge, re-evaluating a long-held belief about the effect of the weight of the crew on the odds of winning. We also show how our methods are useful in modeling trends based on the General Social Survey database.

12.
This paper examines properties of test statistics for random effects with incomplete panel data. We can divide incomplete panel data into two groups. One group arises from randomly missing or unbalanced data and the other arises from systematically missing data. We focus on the former case. Some statistical properties when there are missing independent variables in regression analysis are well known. A simple approach to treating missing observations is to discard the missing cases, but such an approach may be highly inefficient. In this paper, instead of discarding the missing cases, we consider the missing data to be the outcome of a random variable. The test statistic for random effects with randomly missing panel data is derived. We examine the statistical properties of the derived test statistic and compare it with the test statistic derived without randomness. We find that our test statistic is conservative in comparison with the test statistic derived without randomness.

13.
Inference in generalized linear mixed models with multivariate random effects is often made cumbersome by the high-dimensional intractable integrals involved in the marginal likelihood. An inferential methodology based on the marginal pairwise likelihood approach is proposed. This method, which belongs to the broad class of composite likelihood methods, involves marginal pairwise probabilities of the responses. These have an analytical expression for the probit version of the model, from which we derive those of the logit version. The results are illustrated with a simulation study and with an analysis of real data on health-related quality of life.

14.
A multilevel model for ordinal data in the generalized linear mixed model (GLMM) framework is developed to account for the inherent dependencies among observations within clusters. Motivated by a data set from the British Social Attitudes Panel Survey (BSAPS), random district effects and respondent effects are incorporated into the linear predictor to accommodate the nested clusterings. The fixed (random) effects are estimated (predicted) by maximizing the penalized quasi likelihood (PQL) function, whereas the variance component parameters are obtained via the restricted maximum likelihood (REML) estimation method. The model is employed to analyze the BSAPS data. Simulation studies are conducted to assess the performance of the estimators.

15.
A general procedure for fitting growth curves is proposed that can be applied to longitudinal data even if observations are missing or irregularly spaced. Maximum likelihood estimates for mean growths are obtained from an EM algorithm. Estimates for standard errors, percentiles, and growth velocities are also produced. The techniques are demonstrated through the use of growth data from a longitudinal study of sickle cell disease.

16.
Mis-specification of the covariance structure in longitudinal data can result in loss of regression estimation efficiency and in misleading influence diagnostics. Therefore, a rule-of-thumb, even one that is rough, for detecting covariance mis-specification would prove valuable to data analysts. In this paper, we examine two indices for detecting the mis-specification of the covariance structure of longitudinal normal, Poisson or binary responses. Our work shows that the suggested indices prove to be worthwhile when there are no missing time observations; they, however, should be used with caution when there are MAR drop-outs.

17.
Current computational power and some recently developed algorithms allow a new automatic spectral analysis method for randomly missing data. Accurate spectra and autocorrelation functions are computed from the estimated parameters of time series models, without user interaction. If only a few data are missing, the accuracy is almost the same as when all observations are available. For larger missing fractions, low-order time series models can still be estimated with good accuracy if the total observation time is long enough. Autoregressive models are best estimated with the maximum likelihood method if data are missing. Maximum likelihood estimates of moving average and of autoregressive moving average models are not very useful with missing data. Those models are found most accurately if they are derived from the estimated parameters of an intermediate autoregressive model. With statistical criteria for the selection of model order and model type, a completely automatic and numerically reliable algorithm is developed that estimates the spectrum and the autocorrelation function in randomly missing data problems. The accuracy was better than what can be obtained with other methods, including the famous expectation–maximization (EM) algorithm.

18.
The analysis of incomplete longitudinal data requires joint modeling of the longitudinal outcomes (observed and unobserved) and the response indicators. When non-response does not depend on the unobserved outcomes, within a likelihood framework, the missingness is said to be ignorable, obviating the need to formally model the process that drives it. For the non-ignorable or non-random case, estimation is less straightforward, because one must work with the observed data likelihood, which involves integration over the missing values, thereby giving rise to computational complexity, especially for high-dimensional missingness. The stochastic EM algorithm is a variation of the expectation-maximization (EM) algorithm and is particularly useful in cases where the E (expectation) step is intractable. Under the stochastic EM algorithm, the E-step is replaced by an S-step, in which the missing data are simulated from an appropriate conditional distribution. The method is appealing due to its computational simplicity. The SEM algorithm is used to fit non-random models for continuous longitudinal data with monotone or non-monotone missingness, using simulated, as well as case study, data. Resulting SEM estimates are compared with their direct likelihood counterparts wherever possible.
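The S-step and M-step alternation described in this abstract can be illustrated on a much simpler problem than the longitudinal models of the paper. The sketch below assumes a univariate normal sample in which every value above a known cutoff is missing, so the missingness depends on the unobserved value. The function name, the cutoff mechanism, and the averaging of post-burn-in draws are all assumptions of this example.

```python
import random
from statistics import NormalDist, fmean, pstdev


def stochastic_em_censored_normal(observed, n_censored, cutoff,
                                  n_iter=300, burn_in=100, seed=0):
    """Stochastic EM for N(mu, sigma^2) data with values above `cutoff` missing."""
    rng = random.Random(seed)
    observed = list(observed)
    # Start from the (biased) estimates based on the observed values only.
    mu, sigma = fmean(observed), pstdev(observed)
    mu_draws, sigma_draws = [], []
    for it in range(n_iter):
        # S-step: simulate each missing value from the current normal
        # distribution truncated to (cutoff, infinity), via the inverse CDF.
        dist = NormalDist(mu, sigma)
        lower = dist.cdf(cutoff)
        imputed = []
        for _ in range(n_censored):
            u = lower + (1.0 - lower) * rng.random()
            u = min(max(u, 1e-12), 1.0 - 1e-12)  # keep inv_cdf in its domain
            imputed.append(dist.inv_cdf(u))
        # M-step: complete-data maximum likelihood estimates.
        completed = observed + imputed
        mu, sigma = fmean(completed), pstdev(completed)
        if it >= burn_in:
            mu_draws.append(mu)
            sigma_draws.append(sigma)
    # The iterates form a Markov chain; average them after burn-in.
    return fmean(mu_draws), fmean(sigma_draws)
```

Because each S-step draws a single simulated data set, the parameter sequence does not settle to a point but fluctuates around the maximum likelihood estimate, which is why this sketch reports the mean of the later iterates.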

19.
Missing data are often encountered in data sets used to construct software effort prediction models. Thus far, the common practice has been to ignore observations with missing data. This may result in biased prediction models. The authors evaluate four missing data techniques (MDTs) in the context of software cost modeling: listwise deletion (LD), mean imputation (MI), similar response pattern imputation (SRPI), and full information maximum likelihood (FIML). We apply the MDTs to an ERP data set, and thereafter construct regression-based prediction models using the resulting data sets. The evaluation suggests that only FIML is appropriate when the data are not missing completely at random (MCAR). Unlike FIML, prediction models constructed on LD, MI and SRPI data sets will be biased unless the data are MCAR. Furthermore, compared to LD, MI and SRPI seem appropriate only if the resulting LD data set is too small to enable the construction of a meaningful regression-based prediction model.
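Two of the four techniques named in this abstract, listwise deletion and mean imputation, are simple enough to sketch directly. The example below is an illustration only: it assumes rows of numeric values with `None` marking a missing entry, and the function names are hypothetical. SRPI and FIML are not shown.

```python
from statistics import fmean


def listwise_deletion(rows):
    """LD: keep only the rows that have no missing value."""
    return [row for row in rows if all(v is not None for v in row)]


def mean_imputation(rows):
    """MI: replace each missing value with the mean of its column's observed values."""
    n_cols = len(rows[0])
    col_means = []
    for j in range(n_cols):
        present = [row[j] for row in rows if row[j] is not None]
        col_means.append(fmean(present))
    return [
        [col_means[j] if v is None else v for j, v in enumerate(row)]
        for row in rows
    ]
```

For four projects of which two have one missing field each, `listwise_deletion` returns two rows while `mean_imputation` returns all four, which shows the trade-off the abstract mentions between sample size and the bias that imputed means can introduce.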

20.
In network-based iterative learning control (ILC) systems, data dropout often occurs during data packet transfers from the remote plant to the ILC controller. This paper considers the problem of controller design for such ILC processes. Packet loss is modeled by stochastic variables satisfying the Bernoulli random binary distribution, which makes the ILC system a stochastic one. Then, the design of the ILC law is transformed into the stabilization of a 2-D stochastic system described by the Roesser model. A sufficient condition for mean-square asymptotic stability is established by means of a linear matrix inequality technique, and formulas for the control law design are obtained at the same time. This result is further extended to more general cases where the system matrices also contain uncertain parameters. The effectiveness and merits of the proposed method are illustrated by a numerical example. Copyright © 2015 John Wiley & Sons, Ltd.
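The effect of Bernoulli packet dropout on an ILC update can be seen in a small simulation. This sketch does not reproduce the paper's Roesser-model analysis or its LMI-based design. It assumes a scalar first-order plant, a simple P-type update with a hand-picked gain, and a rule that leaves the input unchanged whenever the error packet is lost; all names and parameter values are hypothetical.

```python
import random


def run_ilc_with_dropout(reference, a=0.3, b=1.0, gain=0.8,
                         arrival_prob=0.7, n_trials=100, seed=0):
    """Simulate P-type ILC on x(t+1) = a*x(t) + b*u(t) with lossy error packets.

    reference[t] is the desired output at time t+1. Returns the maximum
    absolute tracking error of each trial.
    """
    rng = random.Random(seed)
    horizon = len(reference)
    u = [0.0] * horizon
    max_errors = []
    for _ in range(n_trials):
        # Run one trial of the plant from a zero initial state.
        x = 0.0
        errors = []
        for t in range(horizon):
            x = a * x + b * u[t]
            errors.append(reference[t] - x)
        max_errors.append(max(abs(e) for e in errors))
        # Update the input for the next trial; each error sample reaches the
        # controller independently with probability arrival_prob.
        for t in range(horizon):
            received = rng.random() < arrival_prob
            if received:
                u[t] += gain * errors[t]
    return max_errors
```

With the default values the error at each time step shrinks by the factor `1 - gain * b` whenever its packet arrives, so the tracking error still decays over the trials, only more slowly than it would without dropout.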
