期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

ORTH: R and SAS software for regression models of correlated binary data based on orthogonalized residuals and alternating logistic regressions

Kunthel By Bahjat F. Qaqish John S. Preisser Jamie Perin Richard C. Zink 《Computer methods and programs in biomedicine》2014

This article describes a new software for modeling correlated binary data based on orthogonalized residuals, a recently developed estimating equations approach that includes, as a special case, alternating logistic regressions. The software is flexible with respect to fitting in that the user can choose estimating equations for association models based on alternating logistic regressions or orthogonalized residuals, the latter choice providing a non-diagonal working covariance matrix for second moment parameters providing potentially greater efficiency. Regression diagnostics based on this method are also implemented in the software. The mathematical background is briefly reviewed and the software is applied to medical data sets. 相似文献

2.

Comparison of logistic regression model and classification tree: An application to postpartum depression data

Handan Ankarali Camdeviren Ayse Canan Yazici Zeki Akkus Resul Bugdayci Mehmet Ali Sungur 《Expert systems with applications》2007,32(4):987-994

In this study, it is aimed that comparing logistic regression model with classification tree method in determining social-demographic risk factors which have effected depression status of 1447 women in separate postpartum periods. In determination of risk factors, data obtained from prevalence study of postpartum depression were used. Cut-off value of postpartum depression scores that calculated was taken as 13. Social and demographic risk factors were brought up by helping of the classification tree and logistic regression model. According to optimal classification tree total of six risk factors were determined, but in logistic regression model 3 of their effect were found significantly. In addition, during the relations among risk factors in tree structure were being evaluated, in logistic regression model corrected main effects belong to risk factors were calculated. In spite of, classification success of maximal tree was found better than both optimal tree and logistic regression model, it is seen that using this tree structure in practice is very difficult. But we say that the logistic regression model and optimal tree had the lower sensitivity, possibly due to the fact that numbers of the individuals in both two groups were not equal and clinical risk factors were not considered in this study. Classification tree method gives more information with detail on diagnosis by evaluating a lot of risk factors together than logistic regression model. But making correct selection through constructed tree structures is very important to increase the success of results and to reach information which can provide appropriate explanations. 相似文献

3.

Goodness-of-fit tests for logistic regression models when data are collected using a complex sampling design

Kellie J. Archer Stanley Lemeshow 《Computational statistics & data analysis》2007,51(9):4450-4464

Logistic regression models are frequently used in epidemiological studies for estimating associations that demographic, behavioral, and risk factor variables have on a dichotomous outcome, such as disease being present versus absent. After the coefficients in a logistic regression model have been estimated, goodness-of-fit of the resulting model should be examined, particularly if the purpose of the model is to estimate probabilities of event occurrences. While various goodness-of-fit tests have been proposed, the properties of these tests have been studied under the assumption that observations selected were independent and identically distributed. Increasingly, epidemiologists are using large-scale sample survey data when fitting logistic regression models, such as the National Health Interview Survey or the National Health and Nutrition Examination Survey. Unfortunately, for such situations no goodness-of-fit testing procedures have been developed or implemented in available software. To address this problem, goodness-of-fit tests for logistic regression models when data are collected using complex sampling designs are proposed. Properties of the proposed tests were examined using extensive simulation studies and results were compared to traditional goodness-of-fit tests. A Stata ado function svylogitgof for estimating the F-adjusted mean residual test after svylogit fit is available at the author's website http://www.people.vcu.edu/~kjarcher/Research/Data.htm. 相似文献

4.

Mean field variational Bayesian inference for nonparametric regression with measurement error

《Computational statistics & data analysis》2013

A fast mean field variational Bayes (MFVB) approach to nonparametric regression when the predictors are subject to classical measurement error is investigated. It is shown that the use of such technology to the measurement error setting achieves reasonable accuracy. In tandem with the methodological development, a customized Markov chain Monte Carlo method is developed to facilitate the evaluation of accuracy of the MFVB method. 相似文献

5.

The logistic regression model with response variables subject to randomized response

Ardo van den Hout Peter G.M. van der Heijden 《Computational statistics & data analysis》2007,51(12):6060-6069

The univariate and multivariate logistic regression model is discussed where response variables are subject to randomized response (RR). RR is an interview technique that can be used when sensitive questions have to be asked and respondents are reluctant to answer directly. RR variables may be described as misclassified categorical variables where conditional misclassification probabilities are known. The univariate model is revisited and is presented as a generalized linear model. Standard software can be easily adjusted to take into account the RR design. The multivariate model does not appear to have been considered elsewhere in an RR setting; it is shown how a Fisher scoring algorithm can be used to take the RR aspect into account. The approach is illustrated by analyzing RR data taken from a study in regulatory non-compliance regarding unemployment benefit. 相似文献

6.

Increasing the power: A practical approach to goodness-of-fit test for logistic regression models with continuous predictors

Xian-Jin Xie Jane Pendergast 《Computational statistics & data analysis》2008,52(5):2703-2713

When continuous predictors are present, classical Pearson and deviance goodness-of-fit tests to assess logistic model fit break down. The Hosmer-Lemeshow test can be used in these situations. While simple to perform and widely used, it does not have desirable power in many cases and provides no further information on the source of any detectable lack of fit. Tsiatis proposed a score statistic to test for covariate regional effects. While conceptually elegant, its lack of a general rule for how to partition the covariate space has, to a certain degree, limited its popularity. We propose a new method for goodness-of-fit testing that uses a very general partitioning strategy (clustering) in the covariate space and either a Pearson statistic or a score statistic. Properties of the proposed statistics are discussed, and a simulation study demonstrates increased power to detect model misspecification in a variety of settings. An application of these different methods on data from a clinical trial illustrates their use. Discussions on further improvement of the proposed tests and extending this new method to other data situations, such as ordinal response regression models are also included. 相似文献

7.

Using logistic regression formulation to monitor heterogeneous usage rate for subscription-based services

Y. Samimi A. Aghaie 《Computers & Industrial Engineering》2011

相似文献

8.

High-dimensional pseudo-logistic regression and classification with applications to gene expression data

Chunming Zhang Haoda Fu Tao Yu 《Computational statistics & data analysis》2007,52(1):452-470

High dimension low sample size data, like the microarray gene expression levels, pose numerous challenges to conventional statistical methods. In the particular case of binary classification, some classification methods, such as the support vector machine (SVM), can efficiently deal with high-dimensional predictors, but lacks the accuracy in estimating the probability of membership of a class. In contrast, the traditional logistic regression (TLR) effectively estimates the probability of class membership for data with low-dimensional inputs, but does not handle high-dimensional cases. The study bridges the gap between SVM and TLR by their loss functions. Based on the proposed new loss function, a pseudo-logistic regression and classification approach which simultaneously combines the strengths of both SVM and TLR is also proposed. Simulation evaluations and real data applications demonstrate that for low-dimensional data, the proposed method produces regression estimates comparable to those of TLR and penalized logistic regression, and that for high-dimensional data, the new method possesses higher classification accuracy than SVM and, in the meanwhile, enjoys enhanced computational convergence and stability. 相似文献

9.

Prediction measurement with mean acceptable error for proper inconsistency in noisy weldability prediction data

《Robotics and Computer》2017

Due to the complex nature of the welding process, the data used to construct prediction models often contain a significant amount of inconsistency. In general, this type of inconsistent data is treated as noise in the literature. However, for the weldability prediction, the inconsistency, which we describe as proper-inconsistency, may not be eliminated since the inconsistent data can help extract additional information about the process. This paper discusses that, in the presence of proper-inconsistency, it is inappropriate to perform the same approach generally employed with machine learning algorithms, in terms of the model construction and prediction measurement. Due to the numerical characteristics of proper-inconsistency, it is likely to achieve vague prediction results from the prediction model with the traditional prediction performance measures. In this paper, we propose a new prediction performance measure called mean acceptable error (MACE), which measures the performance of prediction models constructed with the presence of proper-inconsistency. This paper presents experimental results with real weldability prediction data, and we examine the prediction performance of k-nearest neighbor (kNN) and generalized regression neural network (GRNN) measured by MACE and the different characteristics of data in relation to MACE, kNN, and GRNN. The results indicate that using a smaller k on properly-inconsistent data increases the prediction performance measured by MACE. Also, the prediction performance on the correct data increases, while the effect of properly-inconsistent data decreases with the measurement of MACE. 相似文献

10.

Local likelihood regression in generalized linear single-index models with applications to microarray data

Sophie Lambert-Lacroix Julie Peyre 《Computational statistics & data analysis》2006,51(3):2091-2113

Searching for an effective dimension reduction space is an important problem in regression, especially for high-dimensional data such as microarray data. A major characteristic of microarray data consists in the small number of observations n and a very large number of genes p. This “large p, small n” paradigm makes the discriminant analysis for classification difficult. In order to offset this dimensionality problem a solution consists in reducing the dimension. Supervised classification is understood as a regression problem with a small number of observations and a large number of covariates. A new approach for dimension reduction is proposed. This is based on a semi-parametric approach which uses local likelihood estimates for single-index generalized linear models. The asymptotic properties of this procedure are considered and its asymptotic performances are illustrated by simulations. Applications of this method when applied to binary and multiclass classification of the three real data sets Colon, Leukemia and SRBCT are presented. 相似文献

11.

捷联惯性传感器多余度配置的误差标定技术研究

华冰刘建业熊智李荣冰《传感器与微系统》2005,24(5):31-33

对捷联惯性传感器多余度配置系统的标定技术进行了研究。详细分析了多余度惯性传感器各参数的测量原理及计算公式,针对典型的非正交配置(六传感器正十二面体)的多余度惯性测量单元(IMU),提出了一种简易的且具有较高精度的误差模型参数静态标定方法,给出了计算误差模型参数的数学推导过程和解析表达式。仿真结果表明:该计算方法精度较高,可以有效估计出多余度IMU的误差模型参数,提高了惯导精度。相似文献