Similar Documents
 20 similar documents found
1.
When analyzing survival data, the parameter estimates, and consequently the relative risk estimates, of a Cox model sometimes do not converge to finite values. This phenomenon is due to special conditions in a data set and is known as 'monotone likelihood'. Statistical software packages for Cox regression using the maximum likelihood method cannot deal appropriately with this problem. A procedure to solve the problem was proposed by G. Heinze and M. Schemper (A solution to the problem of monotone likelihood in Cox regression, Biometrics 57, 2001); it has been shown that, unlike the standard maximum likelihood method, this method always leads to finite parameter estimates. We developed a SAS macro and an SPLUS library to make this method available from within these widely used statistical software packages. Our programs are also capable of performing interval estimation based on the profile penalized log likelihood (PPL) and of plotting the PPL function, as suggested in the same article.
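The core of the Heinze–Schemper remedy is Firth-type penalization of the likelihood. As an illustration only (not the authors' SAS/SPLUS code, and in the simpler logistic rather than the Cox setting), here is a minimal sketch of Firth-penalized estimation on a completely separated toy data set, where the ordinary MLE would diverge; all names and data are hypothetical:

```python
import numpy as np

def firth_logistic(X, y, n_iter=40):
    """Newton iterations for logistic regression with Firth's penalty,
    0.5*log det I(beta) (the Jeffreys prior). Under monotone likelihood
    the ordinary MLE diverges; the penalized estimate stays finite."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        pi = 1.0 / (1.0 + np.exp(-(X @ beta)))
        W = pi * (1.0 - pi)
        info = X.T @ (X * W[:, None])          # Fisher information X'WX
        info_inv = np.linalg.inv(info)
        # diagonal of the hat matrix W^(1/2) X (X'WX)^(-1) X' W^(1/2)
        h = W * ((X @ info_inv) * X).sum(axis=1)
        # Firth's modified score: (y - pi) becomes (y - pi + h*(0.5 - pi))
        beta = beta + info_inv @ (X.T @ (y - pi + h * (0.5 - pi)))
    return beta

# Completely separated data: x < 0 implies y = 0, x > 0 implies y = 1,
# so the unpenalized slope estimate would run off to infinity.
x = np.array([-2.0, -1.0, -0.5, 0.5, 1.0, 2.0])
X = np.column_stack([np.ones_like(x), x])
y = np.array([0.0, 0.0, 0.0, 1.0, 1.0, 1.0])
beta = firth_logistic(X, y)
```

On this symmetric data set the intercept stays at zero and the slope settles at a finite value instead of diverging.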

2.
Two SAS macro programs are presented that evaluate the relative importance of prognostic factors in the proportional hazards regression model and in the logistic regression model. The importance of a prognostic factor is quantified by the proportion of variation in the outcome attributable to this factor. For proportional hazards regression, the program %RELIMPCR uses the recently proposed measure V to calculate the proportion of explained variation (PEV). For the logistic model, the R² measure based on squared raw residuals is used by the program %RELIMPLR. Both programs are able to compute marginal and partial PEV, and to compare PEVs of factors, of groups of factors, and even of different models. The programs use a bootstrap resampling scheme to test differences between the PEVs of different factors. Confidence limits for P-values are provided. The programs further allow the computation of PEV to be based on models with shrunken or bias-corrected parameter estimates. The SAS macros are freely available at www.akh-wien.ac.at/imc/biometrie/relimp
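The squared-raw-residual R² used by %RELIMPLR can be sketched generically. This is a hypothetical Python illustration with invented helper names and toy data, not the SAS macros themselves (which additionally handle marginal/partial PEV and bootstrap comparisons):

```python
import numpy as np

def fit_logistic(X, y, n_iter=25):
    """Plain Newton-Raphson maximum likelihood logistic regression."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        pi = 1.0 / (1.0 + np.exp(-(X @ beta)))
        W = pi * (1.0 - pi)
        beta = beta + np.linalg.solve(X.T @ (X * W[:, None]), X.T @ (y - pi))
    return beta

def pev(X, y):
    """Proportion of explained variation from squared raw residuals:
    R^2 = 1 - sum (y - pi_hat)^2 / sum (y - y_bar)^2."""
    pi = 1.0 / (1.0 + np.exp(-(X @ fit_logistic(X, y))))
    return 1.0 - np.sum((y - pi) ** 2) / np.sum((y - y.mean()) ** 2)

y = np.array([0.0, 0.0, 1.0, 0.0, 1.0, 1.0])
x_signal = np.array([-2.0, -1.0, -0.5, 0.5, 1.0, 2.0])   # informative factor
x_noise = np.array([1.0, -1.0, 1.0, -1.0, 1.0, -1.0])    # weak factor
ones = np.ones_like(y)
pev_signal = pev(np.column_stack([ones, x_signal]), y)
pev_noise = pev(np.column_stack([ones, x_noise]), y)
```

The informative factor explains a visibly larger share of the outcome variation than the weak one, which is exactly the comparison the bootstrap scheme then tests formally.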

3.
Kuss and McLerran, in a paper in this journal, provide SAS code for the estimation of multinomial logistic models for correlated data. Their motivation derived from two papers that recommended estimating such models using a Poisson likelihood, which is, according to Kuss and McLerran, "statistically correct but computationally inefficient". Kuss and McLerran propose several estimating methods, some of which are based on the fact that the multinomial model is a multivariate binary model; a procedure proposed by Wright is then exploited to fit the models. In this paper we show that these new computation methods, based on the approach by Wright, are statistically incorrect because they do not take into account that a multivariate link function is needed for multinomial data. An alternative estimation strategy using the clustered bootstrap is proposed.
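The clustered bootstrap resamples whole clusters rather than individual observations, so within-cluster correlation is preserved in every replicate. A generic sketch (function names and toy data are illustrative, not from the paper):

```python
import random

def clustered_bootstrap(clusters, statistic, n_boot=2000, seed=1):
    """Draw whole clusters with replacement (never splitting a cluster),
    pool the drawn observations, and recompute the statistic."""
    rng = random.Random(seed)
    ids = list(clusters)
    reps = []
    for _ in range(n_boot):
        drawn = [clusters[rng.choice(ids)] for _ in ids]
        pooled = [obs for cluster in drawn for obs in cluster]
        reps.append(statistic(pooled))
    return reps

# Toy data: three clusters of two correlated observations each.
clusters = {"a": [1.0, 2.0], "b": [3.0, 4.0], "c": [5.0, 6.0]}
reps = clustered_bootstrap(clusters, lambda xs: sum(xs) / len(xs))
boot_mean = sum(reps) / len(reps)
```

In practice the `statistic` would refit the multinomial model on the resampled data; the spread of `reps` then gives cluster-robust standard errors.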

4.
We are interested in testing hypotheses that concern the parameters of a logistic regression model. A robust Wald-type test based on a weighted Bianco and Yohai estimator [Bianco, A.M., Yohai, V.J., 1996. Robust estimation in the logistic regression model. In: H. Rieder (Ed.), Robust Statistics, Data Analysis, and Computer Intensive Methods. Lecture Notes in Statistics, vol. 109, Springer Verlag, New York, pp. 17–34], as implemented by Croux and Haesbroeck [Croux, C., Haesbroeck, G., 2003. Implementing the Bianco and Yohai estimator for logistic regression. Computational Statistics and Data Analysis 44, 273–295], is proposed. The asymptotic distribution of the test statistic is derived. We carry out an empirical study to gain further insight into the stability of the p-value. Finally, a Monte Carlo study is performed to investigate the stability of both the level and the power of the test for different choices of the weight function.
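The Wald construction itself is standard; the robustness comes from plugging in the weighted Bianco–Yohai estimate and its asymptotic variance instead of the MLE. A sketch of the generic one-parameter Wald test (the estimate and standard error below are placeholder numbers, not robust estimates):

```python
import math

def wald_test(beta_hat, se, beta0=0.0):
    """One-parameter Wald test: W = ((beta_hat - beta0) / se)^2 is
    compared to a chi-square(1) reference distribution; for 1 df the
    two-sided p-value is erfc(sqrt(W / 2))."""
    w = ((beta_hat - beta0) / se) ** 2
    return w, math.erfc(math.sqrt(w / 2.0))

# z = 0.98 / 0.5 = 1.96, so the p-value should be about 0.05.
w, p = wald_test(0.98, 0.5)
```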

5.
We show how multinomial logistic models with correlated responses can be estimated within SAS software. To achieve this, random-effects and marginal models are introduced and the respective SAS code is given. An example data set on physicians' recommendations and preferences in traumatic brain injury rehabilitation is used for illustration. The main motivation for this work is two recent papers that recommend estimating multinomial logistic models with correlated responses by using a Poisson likelihood, which is statistically correct but computationally inefficient.
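The "Poisson likelihood" equivalence referred to here can be seen in a saturated toy model: a Poisson log-linear fit with one free parameter per outcome category reproduces the multinomial MLE, i.e. the observed category proportions. A minimal illustration (toy counts, not the paper's data):

```python
import math

# Observed counts for a three-category outcome.
counts = [30, 50, 20]
n = sum(counts)

# Saturated Poisson log-linear model, one parameter per category:
# mu_j = exp(beta_j); the Poisson MLE is beta_j = log(count_j), so the
# fitted means equal the observed counts exactly.
beta = [math.log(c) for c in counts]
mu = [math.exp(b) for b in beta]

# Normalizing the Poisson fitted means recovers the multinomial MLE,
# i.e. the observed category proportions.
poisson_probs = [m / sum(mu) for m in mu]
multinomial_mle = [c / n for c in counts]
```

The statistical equivalence is what makes the Poisson route "correct"; the inefficiency the authors address comes from the extra parameters the trick introduces in larger models.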

6.
In this paper, we consider the problem of multinomial classification of magnetoencephalography (MEG) data. The proposed method participated in the MEG mind-reading competition of the ICANN'11 conference, where the goal was to train a classifier for predicting which movie the test person was shown. Our approach was the best among ten submissions, reaching an accuracy of 68% correct classifications in this five-category problem. The method is based on a regularized logistic regression model, whose efficient feature selection is critical for cases with more measurements than samples. Moreover, special attention is paid to the estimation of the generalization error in order to avoid overfitting to the training data. Here, in addition to describing our competition entry in detail, we report selected additional experiments, which question the usefulness of complex feature extraction procedures and of the basic frequency decomposition of the MEG signal for this application.
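Two ingredients of such an entry, regularization and honest cross-validated error estimation, can be sketched generically. This is a toy ridge-penalized logistic regression with K-fold accuracy, under invented data; the competition entry itself used different, MEG-specific machinery:

```python
import numpy as np

def fit_ridge_logistic(X, y, lam=1.0, n_iter=30):
    """Newton iterations for L2-regularized logistic regression; the
    ridge penalty stands in for the regularization needed when
    measurements outnumber samples."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        pi = 1.0 / (1.0 + np.exp(-(X @ beta)))
        W = pi * (1.0 - pi)
        H = X.T @ (X * W[:, None]) + lam * np.eye(X.shape[1])
        beta = beta + np.linalg.solve(H, X.T @ (y - pi) - lam * beta)
    return beta

def cv_accuracy(X, y, k=3, lam=1.0):
    """K-fold cross-validation: each fold is predicted by a model that
    never saw it, which keeps the generalization estimate honest."""
    folds = np.arange(len(y)) % k
    accs = []
    for f in range(k):
        tr, te = folds != f, folds == f
        beta = fit_ridge_logistic(X[tr], y[tr], lam)
        accs.append(np.mean((X[te] @ beta > 0) == (y[te] > 0.5)))
    return float(np.mean(accs))

x = np.array([-3.0, -2.0, -1.0, 1.0, 2.0, 3.0] * 2)
y = (x > 0).astype(float)
X = np.column_stack([np.ones_like(x), x])
acc = cv_accuracy(X, y)
```

Crucially, any feature selection would also have to happen inside the cross-validation loop; selecting features on the full data first would bias the error estimate downward.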

7.
In clinical and epidemiologic research investigating dose-response associations, non-parametric spline regression has long been proposed as a powerful alternative to conventional parametric regression approaches, since no underlying assumptions of linearity have to be fulfilled. For logistic spline models, however, little standard statistical software is to date available to estimate any measure of risk, which is typically of interest when quantifying the effects of one or more continuous explanatory variables on a binary disease outcome. In the present paper, we propose a set of SAS macros that perform non-parametric logistic regression analysis with B-spline expansions of an arbitrary number of continuous covariates, estimating adjusted odds ratios with respective confidence intervals for any given value with respect to a supplied reference value. Our SAS code further allows the shape of the association to be visualized graphically, retaining the exposure variable under consideration in its initial, continuous form while concurrently adjusting for multiple confounding factors. The macros are easy to use and can be implemented quickly by the clinical or epidemiological researcher to flexibly investigate any dose-response association of continuous exposures with the risk of binary disease outcomes. We illustrate the application of our SAS code by investigating the effect of body-mass index on the risk of cancer incidence in a large, population-based male cohort.
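The odds-ratio logic on a spline-expanded covariate can be sketched as follows, using a truncated-power basis instead of B-splines for brevity and a small ridge penalty for numerical stability. This is an illustrative Python toy with hypothetical names, not the SAS macros:

```python
import numpy as np

def spline_basis(x, knots):
    """Truncated-power cubic spline basis (a simple stand-in for the
    B-spline basis the macros use): columns x, x^2, x^3, (x - k)_+^3."""
    cols = [x, x ** 2, x ** 3] + [np.clip(x - k, 0.0, None) ** 3 for k in knots]
    return np.column_stack(cols)

def fit_ridge_logistic(X, y, lam=1.0, n_iter=30):
    """Newton iterations for L2-penalized logistic regression."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        pi = 1.0 / (1.0 + np.exp(-(X @ beta)))
        W = pi * (1.0 - pi)
        H = X.T @ (X * W[:, None]) + lam * np.eye(X.shape[1])
        beta = beta + np.linalg.solve(H, X.T @ (y - pi) - lam * beta)
    return beta

def spline_odds_ratio(x_new, x_ref, knots, beta):
    """Adjusted odds ratio OR(x_new vs x_ref) = exp(f(x_new) - f(x_ref));
    the intercept cancels, and OR(x_ref vs x_ref) = 1 by construction."""
    b = spline_basis(np.array([x_new, x_ref]), knots)
    return float(np.exp((b[0] - b[1]) @ beta[1:]))

# Toy dose-response data: risk increases with the exposure x.
x = np.linspace(-2.0, 2.0, 21)
y = (x > 0).astype(float)
knots = [-1.0, 0.0, 1.0]
X = np.column_stack([np.ones_like(x), spline_basis(x, knots)])
beta = fit_ridge_logistic(X, y)
or_ref = spline_odds_ratio(0.0, 0.0, knots, beta)
or_high = spline_odds_ratio(1.5, -1.5, knots, beta)
```

Evaluating the odds ratio on a grid of exposure values against the fixed reference is what produces the dose-response curve the macros plot.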

8.
This article describes new software for modeling correlated binary data based on orthogonalized residuals, a recently developed estimating-equations approach that includes alternating logistic regressions as a special case. The software is flexible with respect to fitting: the user can choose estimating equations for association models based on alternating logistic regressions or on orthogonalized residuals, the latter choice providing a non-diagonal working covariance matrix for the second-moment parameters and thus potentially greater efficiency. Regression diagnostics based on this method are also implemented in the software. The mathematical background is briefly reviewed and the software is applied to medical data sets.
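The association parameter that alternating logistic regressions model on the log scale is a pairwise odds ratio; in a toy setting with one pair of binary responses per cluster it reduces to the familiar 2×2 cross-product ratio:

```python
def pairwise_odds_ratio(pairs):
    """Cross-product ratio n11*n00 / (n10*n01) for pairs of binary
    responses; alternating logistic regressions model the log of this
    quantity as a function of covariates."""
    n = {(a, b): 0 for a in (0, 1) for b in (0, 1)}
    for pair in pairs:
        n[pair] += 1
    return (n[(1, 1)] * n[(0, 0)]) / (n[(1, 0)] * n[(0, 1)])

# Toy clustered data: two binary measurements per cluster, positively
# associated (concordant pairs dominate).
pairs = [(1, 1)] * 30 + [(1, 0)] * 10 + [(0, 1)] * 10 + [(0, 0)] * 30
or_hat = pairwise_odds_ratio(pairs)
```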

9.
10.
Technology credit scoring models have been used to screen loan applicant firms based on their technology. Typically, a logistic regression model is employed to relate the probability of a loan default of the firms to several evaluation attributes associated with technology. However, these attributes are evaluated in linguistic expressions represented by fuzzy numbers, and the possibility of loan default can be described in verbal terms as well. To handle such fuzzy input and output data, we propose a fuzzy credit scoring model that can be applied to predict the default possibility of a loan for a firm that is approved based on its technology. Fuzzy logistic regression is presented as an appropriate prediction approach for credit scoring with fuzzy input and output. The performance of the model is improved compared to that of typical logistic regression. This study is expected to contribute to the practical utilization of technology credit scoring with linguistic evaluation attributes.

11.
Multimedia Tools and Applications - Orientation of the human body is an important feature that can be used for behavioral analysis in surveillance systems. This cue contains useful information such as...

12.
The task of classifying is natural to humans, but there are situations in which a person is not best suited to perform this function, which creates the need for automatic methods of classification. Traditional methods, such as logistic regression, are commonly used in this type of situation, but they lack robustness and accuracy and do not work very well when there is noise in the data, a situation that is common in expert and intelligent systems. Due to the importance and the increasing complexity of problems of this type, there is a need for methods that provide greater accuracy and interpretability of results. Among these methods is Boosting, which operates sequentially by applying a classification algorithm to reweighted versions of the training data set, and which was recently shown to be interpretable as a method for functional estimation. The purpose of the present study was to compare the logistic regression model estimated by maximum likelihood (LRMML) with the logistic regression model estimated using the Boosting algorithm, specifically the Binomial Boosting algorithm (LRMBB), and to select the model with the better fit and discrimination capacity for the presence (absence) of a given property (in this case, binary classification). As an illustration, the example used was to classify the presence (absence) of coronary heart disease (CHD) as a function of various biological variables collected from patients. The simulation results indicate that the LRMBB model is more appropriate than the LRMML model for fitting data sets with several covariables and noisy data. Lower values of the AIC and BIC information criteria were obtained for the LRMBB model, and the Hosmer–Lemeshow test showed no evidence of a poor fit for it. The LRMBB model also presented higher AUC, sensitivity, specificity and accuracy, and lower false positive and false negative rates, making it a model with better discrimination power than the LRMML model. Based on these results, the logistic model adjusted via the Binomial Boosting algorithm (LRMBB) is better suited to describe binary response problems, because it provides more accurate information regarding the problem considered.
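The Binomial Boosting idea, fitting a simple base learner to the negative gradient of the binomial log-loss and adding it to the current fit with a small step size, can be sketched with regression stumps. This is an illustrative toy with invented names and data, not the authors' implementation:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_stump(x, r):
    """Least-squares regression stump on a 1-D feature, fitted to the
    working residuals r; returns (threshold, left_value, right_value)."""
    best = None
    for t in np.unique(x):
        left, right = r[x <= t], r[x > t]
        lv = left.mean() if left.size else 0.0
        rv = right.mean() if right.size else 0.0
        sse = ((left - lv) ** 2).sum() + ((right - rv) ** 2).sum()
        if best is None or sse < best[0]:
            best = (sse, t, lv, rv)
    return best[1:]

def binomial_boost(x, y, n_rounds=50, nu=0.5):
    """Each round fits a stump to y - sigmoid(F), the negative gradient
    of the binomial log-loss, and adds it to F with step size nu."""
    F = np.zeros_like(y, dtype=float)
    for _ in range(n_rounds):
        t, lv, rv = fit_stump(x, y - sigmoid(F))
        F = F + nu * np.where(x <= t, lv, rv)
    return F

x = np.array([-3.0, -2.0, -1.0, 1.0, 2.0, 3.0])
y = np.array([0.0, 0.0, 0.0, 1.0, 1.0, 1.0])
F = binomial_boost(x, y)
probs = sigmoid(F)
```

The small step size `nu` is the shrinkage that gives boosting its robustness to noise relative to a single maximum likelihood fit.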

13.
The epidemiological question of concern here is "can young children at risk of obesity be identified from their early growth records?" Pilot work using logistic regression to predict overweight and obese children demonstrated relatively limited success; hence we investigate the incorporation of non-linear interactions to help improve the accuracy of prediction, comparing the results of logistic regression with those of six mature data mining techniques. The contributions of this paper are as follows: (a) a comparison of logistic regression with six data mining techniques, specifically for the prediction of overweight and obese children at 3 years using data recorded at birth, 6 weeks, 8 months and 2 years respectively; (b) improved accuracy of prediction: at 8 months, accuracy is improved only very slightly, in this case by using neural networks, whereas for prediction at 2 years accuracy is improved by over 10%, in this case by using Bayesian methods. It is also shown that the incorporation of non-linear interactions can be important in epidemiological prediction, and that data mining techniques are becoming sufficiently well established to offer the medical research community a valid alternative to logistic regression.

14.
Modeling urban growth in Atlanta using logistic regression
This study applied logistic regression to model urban growth in the Atlanta Metropolitan Area of Georgia in a GIS environment and to discover the relationship between urban growth and its driving forces. Historical land use/cover data of Atlanta were extracted from the 1987 and 1997 Landsat TM images. Multi-resolution calibration of a series of logistic regression models was conducted from 50 m to 300 m at intervals of 25 m. A fractal analysis pointed to 225 m as the optimal resolution for modeling. The following two groups of factors were found to affect urban growth to different degrees, as indicated by odds ratios: (1) population density, distances to the nearest urban clusters, activity centers and roads, and high/low density urban uses (all with odds ratios < 1); and (2) distance to the CBD, number of urban cells within a 7 × 7 cell window, bare land, crop/grass land, forest, and UTM northing coordinate (all with odds ratios > 1). A map of urban growth probability was calculated and used to predict future urban patterns. A relative operating characteristic (ROC) value of 0.85 indicates that the probability map is valid. It was concluded that despite logistic regression's lack of temporal dynamics, it is spatially explicit and suitable for multi-scale analysis, and, most importantly, allows a much deeper understanding of the forces driving urban growth and the formation of the urban spatial pattern.
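The ROC value reported above is the area under the ROC curve, which can be computed directly from its probabilistic definition: the proportion of (positive, negative) pairs that the predicted probability ranks correctly. A small self-contained sketch with toy scores:

```python
def auc(labels, scores):
    """Area under the ROC curve via its probabilistic definition: the
    fraction of (positive, negative) pairs ranked correctly by the
    score, with ties counting one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Three of the four (positive, negative) pairs are ranked correctly.
a = auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```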

15.
Sepsis is one of the main causes of death for non-coronary ICU (Intensive Care Unit) patients and has become the 10th most common cause of death in western societies. It is a transversal condition affecting immunocompromised patients, critically ill patients, post-surgery patients, patients with AIDS, and the elderly. In western countries, septic patients account for as much as 25% of ICU bed utilization, and the pathology affects 1-2% of all hospitalizations. Its mortality rates range from 12.8% for sepsis to 45.7% for septic shock. The prediction of mortality caused by sepsis is, therefore, a relevant research challenge from a medical viewpoint. The clinical indicators currently in use for this type of prediction have been criticized for their poor prognostic significance. In this study, we redescribe sepsis indicators through latent model-based feature extraction, using factor analysis. These extracted indicators are then applied to the prediction of mortality caused by sepsis. The reported results show that the proposed method improves on the results obtained with the current standard mortality predictor, which is based on the APACHE II score.

16.
To model fuzzy binary observations, a new model named "fuzzy logistic regression" is proposed and discussed in this study. Due to the vague nature of binary observations, no probability distribution can be considered for these data, so ordinary logistic regression may not be appropriate. This study attempts to construct a fuzzy model based on the possibility of success. These possibilities are defined by linguistic terms such as low, medium and high. Then, by use of the extension principle, the logarithmic transformation of the "possibilistic odds" is modeled from a set of crisp observations of the explanatory variables. To estimate the parameters of the proposed model, the least squares method of fuzzy linear regression is used, and a criterion named the "capability index" is calculated to evaluate the model. Finally, because of the widespread application of logistic regression in clinical studies and the abundance of vague observations in clinical diagnosis, cases suspected of Systemic Lupus Erythematosus (SLE) are modeled based on some significant risk factors to demonstrate the application of the model. The results show that the proposed model can be a rational substitute for an ordinary one in modeling vague clinical status.

17.
In this article, two semiparametric approaches are developed for analyzing randomized response data with missing covariates in the logistic regression model. One of the two proposed estimators is an extension of the validation likelihood estimator of Breslow and Cain [Breslow, N.E., Cain, K.C., 1988. Logistic regression for two-stage case-control data. Biometrika 75, 11-20]. The other is a joint conditional likelihood estimator based on both the validation and non-validation data sets. We present a large-sample theory for the proposed estimators. Simulation results show that the joint conditional likelihood estimator is more efficient than the validation likelihood, weighted, complete-case and partial likelihood estimators. We also illustrate the methods using data from a cable TV study.

18.
Some accounting studies have focused on logistic regression relationships between exact or fuzzy inputs and outputs. However, intuitionistic fuzzy sets, rather than ordinary fuzzy sets, find application in many real studies. On the other hand, the semi-parametric partially linear model has also attracted attention in recent years. This study investigates an intuitionistic fuzzy semi-parametric partially linear logistic model for cases with exact inputs, intuitionistic fuzzy outputs, an intuitionistic fuzzy smooth function and intuitionistic fuzzy coefficients. For this purpose, a hybrid procedure based on curve fitting methods and least absolute deviations is suggested to estimate the intuitionistic fuzzy smooth function and coefficients. The proposed method is compared with a common fuzzy logistic regression model on a real fuzzy data set. It is shown that the proposed intuitionistic fuzzy logistic regression model performs better with respect to several goodness-of-fit criteria, suggesting that it could be successfully applied in many practical studies in expert systems.

19.
20.
The univariate and multivariate logistic regression models are discussed for the case where response variables are subject to randomized response (RR). RR is an interview technique that can be used when sensitive questions have to be asked and respondents are reluctant to answer directly. RR variables may be described as misclassified categorical variables for which the conditional misclassification probabilities are known. The univariate model is revisited and presented as a generalized linear model; standard software can easily be adjusted to take the RR design into account. The multivariate model does not appear to have been considered elsewhere in an RR setting; it is shown how a Fisher scoring algorithm can be used to take the RR aspect into account. The approach is illustrated by analyzing RR data taken from a study of regulatory non-compliance regarding unemployment benefit.
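In the simplest RR design (Warner's), the known misclassification probabilities make the correction a one-line moment inversion, which is the same idea the GLM adjustment builds on. A minimal sketch with hypothetical numbers:

```python
def warner_estimate(p_yes_observed, p_design):
    """Warner's randomized response design: with probability p_design a
    respondent answers the sensitive question, otherwise its negation,
    so P(yes) = p*pi + (1 - p)*(1 - pi). Inverting gives
    pi = (P(yes) - (1 - p)) / (2p - 1)."""
    return (p_yes_observed - (1.0 - p_design)) / (2.0 * p_design - 1.0)

# If the true prevalence is 0.4 and p_design = 0.75, the expected
# observed 'yes' proportion is 0.75*0.4 + 0.25*0.6 = 0.45.
pi_hat = warner_estimate(0.45, 0.75)
```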


Copyright © Beijing Qinyun Technology Development Co., Ltd.  京ICP备09084417号