Similar literature
1.
A fruitful method of pooling data from disparate sources, such as a set of sample surveys, is developed. This method proceeds by finding the first two moments of two conditional distributions derived from a joint distribution of two sample estimators of employment for each of several geographical areas. The nature of the two estimators is such that one of them can yield a better estimate of national employment than the other. The regression of the former estimator on the latter estimator with stochastic intercept and slope is used to generate an improved estimator that is equal to the bias- and error-corrected estimator for each area with probability 1. This analysis is extended to cases where more than two estimates of employment are available for each area.

2.
A new nonparametric estimator for the conditional hazard rate is proposed, which is defined as the ratio of local linear estimators for the conditional density and survivor function. The resulting hazard rate estimator is shown to be pointwise consistent and asymptotically normally distributed under appropriate conditions. Furthermore, plug-in bandwidths based on normal and uniform reference distributions and minimizing the asymptotic mean squared error are derived. In terms of the mean squared error, the new estimator is highly competitive in comparison to existing estimators for the conditional hazard rate. Moreover, its smoothing parameters are relatively robust to misspecification of the reference distributions, which facilitates bandwidth selection. Additionally, the new hazard rate estimator is conveniently calculated using standard software for local linear regression. The use of the local linear hazard rate is illustrated in an application to kidney transplant data.

3.
In this paper we consider the beta regression model recently proposed by Ferrari and Cribari-Neto [2004. Beta regression for modeling rates and proportions. J. Appl. Statist. 31, 799-815], which is tailored to situations where the response is restricted to the standard unit interval and the regression structure involves regressors and unknown parameters. We derive the second order biases of the maximum likelihood estimators and use them to define bias-adjusted estimators. As an alternative to the two analytically bias-corrected estimators discussed, we consider a bias correction mechanism based on the parametric bootstrap. The numerical evidence favors the bootstrap-based estimator and also one of the analytically corrected estimators. Several different strategies for interval estimation are also proposed. We present an empirical application.

4.
Missing data often occur in regression analysis. Imputation, weighting, direct likelihood, and Bayesian inference are typical approaches for missing data analysis. The focus is on missing covariate data, a common complication in the analysis of sample surveys and clinical trials. A key quantity when applying weighted estimators is the mean score contribution of observations with missing covariate(s), conditional on the observed covariates. This mean score can be estimated parametrically or nonparametrically by its empirical average using the complete case data in case of repeated values of the observed covariates, typically assuming categorical or categorized covariates. A nonparametric kernel based estimator is proposed for this mean score, allowing the full exploitation of the continuous nature of the covariates. The performance of the kernel based method is compared to that of a complete case analysis, inverse probability weighting, doubly robust estimators and multiple imputation, through simulations.

5.
Simple nonparametric estimators of the conditional distribution of a response variable given a continuous covariate are often useful in survival analysis. Since a few nonparametric estimation options are available, a comparison of the performance of these options may be of value to determine which approach to use in a given application. In this note, we compare various nonparametric estimators of the conditional survival function when the response is subject to interval- and right-censoring. The estimators considered are a generalization of Turnbull’s estimator proposed by Dehghan and Duchesne (2011) and two nonparametric estimators for complete or right-censored data used in conjunction with imputation methods, namely the Nadaraya-Watson and generalized Kaplan-Meier estimators. We study the finite sample integrated mean squared error properties of all these estimators by simulation and compare them to a semi-parametric estimator. We propose a rule-of-thumb based on simple sample summary statistics to choose the most appropriate among these estimators in practice.

6.
We consider a general multivariate conditional heteroskedastic model under a conditional distribution that is not necessarily normal. This model contains autoregressive conditional heteroskedastic (ARCH) models as a special class. We use the pseudo maximum likelihood estimation method and derive a new estimator of the asymptotic variance matrix for the pseudo maximum likelihood estimator. We also study four special cases in this class, which are conditional heteroskedastic autoregressive moving-average models, regression models with ARCH errors, models with constant conditional correlations, and ARCH in mean models.

8.
A robust estimator for the tail index of Pareto-type distributions (total citations: 1; self-citations: 0; citations by others: 1)
In extreme value statistics, the extreme value index is a well-known parameter to measure the tail heaviness of a distribution. Pareto-type distributions, with strictly positive extreme value index (or tail index), are considered. The most prominent extreme value methods are constructed on efficient maximum likelihood estimators based on specific parametric models which are fitted to excesses over large thresholds. Maximum likelihood estimators, however, are often not very robust, which makes them sensitive to a few particular observations. Even in extreme value statistics, where the most extreme data usually receive most attention, this can constitute a serious problem. The problem is illustrated on a real data set from geopedology, in which a few abnormal soil measurements highly influence the estimates of the tail index. In order to overcome this problem, a robust estimator of the tail index is proposed, by combining a refinement of the Pareto approximation for the conditional distribution of relative excesses over a large threshold with an integrated squared error approach on partial density component estimation. It is shown that the influence function of this newly proposed estimator is bounded, and several simulations illustrate that it performs reasonably well on contaminated as well as uncontaminated data.

9.
Several new estimators of the marginal likelihood for complex non-Gaussian models are developed. These estimators make use of the output of auxiliary mixture sampling for count data and for binary and multinomial data. One of these estimators is based on combining Chib’s estimator with data augmentation as in auxiliary mixture sampling, while the other estimators are importance sampling and bridge sampling based on constructing an unsupervised importance density from the output of auxiliary mixture sampling. These estimators are applied to a logit regression model, to a Poisson regression model, to a binomial model with random intercept, as well as to state space modeling of count data.

10.
In this paper we propose a new estimator for regression problems in the form of a linear combination of quantile regressions. The proposed estimator is helpful for conditional mean estimation when the error distribution is asymmetric and heteroscedastic. It is shown that the proposed estimator is consistent under the heteroscedastic regression model Y = μ(X) + σ(X)e, where X is a vector of covariates, Y is a scalar response, e is a zero-mean random variable independent of X, and σ(X) is a positive-valued function. When the error term e is asymmetric, we show that the proposed estimator yields better conditional mean estimation performance than the other estimators. Numerical experiments on both synthetic and real data illustrate the usefulness of the proposed estimator.

11.
This paper develops a recursive, convergent estimator for some parameters of Gaussian mixtures. The M class conditional (component) densities of the mixture random variable are Gaussian with known and distinct means and unknown and possibly different variances. A joint estimator of M prior (mixing) probabilities and M class conditional variances is derived. Sufficient conditions on the data and control parameters are derived for the estimator to converge. Convergence of the estimator follows from the use of a stochastic approximation theorem. Techniques to extend the estimators for the case of successive class labels forming a Markov chain are mentioned. The estimator has applications in blind parameter estimation in digital communication with symbol dependent noise variance and in image compression.

12.
In this article, we apply the maximum trimmed likelihood (MTL) approach [Hadi, A.S., Luceño, A., 1997. Maximum trimmed likelihood estimators: a unified approach, examples, and algorithms. Comput. Statist. Data Anal. 25, 251-272] to obtain the robust estimators of multivariate location and shape, especially for data mixed with continuous and categorical variables. The forward search algorithm [Atkinson, A.C., 1994. Fast very robust methods for the detection of multiple outliers. J. Amer. Statist. Assoc. 89, 1329-1339] is adapted to compute the proposed MTL estimates. A simulation study shows that the proposed estimator outperforms the classical maximum likelihood estimator when outliers exist in data. Real data sets are also used to illustrate the method and results of the detection of the outliers.

13.
This paper considers binary classification. We assess a classifier in terms of the area under the ROC curve (AUC). We estimate three important parameters: the conditional AUC (conditional on a particular training set) and the mean and variance of this AUC. We derive, as well, a closed form expression of the variance of the estimator of the AUC. This expression exhibits several components of variance that facilitate an understanding of the sources of uncertainty of that estimate. In addition, we estimate this variance, i.e., the variance of the conditional AUC estimator. Our approach is nonparametric and based on general methods from U-statistics; it addresses the case where the data distribution is neither known nor modeled and where there are only two available data sets, the training and testing sets. Finally, we illustrate some simulation results for these estimators.
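
As a rough illustration of the U-statistic view mentioned in this abstract, the empirical AUC on a test set can be written as the fraction of (positive, negative) score pairs that the classifier ranks correctly, with ties counted as one half. The sketch below shows only this standard point estimate; it is not the variance estimator derived in the paper, and the function name `empirical_auc` and the example scores are hypothetical.

```python
def empirical_auc(pos_scores, neg_scores):
    """Two-sample U-statistic (Mann-Whitney) estimate of the AUC.

    pos_scores: classifier scores of the positive test cases
    neg_scores: classifier scores of the negative test cases
    """
    if not pos_scores or not neg_scores:
        raise ValueError("both classes need at least one score")
    total = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:        # correctly ranked pair
                total += 1.0
            elif p == n:     # tie counts as half a correct ranking
                total += 0.5
    return total / (len(pos_scores) * len(neg_scores))


# Hypothetical test-set scores, for illustration only.
auc = empirical_auc([0.9, 0.8, 0.4], [0.7, 0.3, 0.4])
print(auc)
```

The double loop is quadratic in the sample sizes, which is fine for a sketch; a rank-based formula gives the same value more cheaply on large test sets.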

14.
A procedure for efficient estimation of the trimmed mean of a random variable conditional on a set of covariates is proposed. For concreteness, the focus is on a financial application where the trimmed mean of interest corresponds to the conditional expected shortfall, which is known to be a coherent risk measure. The proposed class of estimators is based on representing the estimator as an integral of the conditional quantile function. Relative to the simple analog estimator that weights all conditional quantiles equally, asymptotic efficiency gains may be attained by giving different weights to the different conditional quantiles while penalizing excessive departures from uniform weighting. The approach presented here allows for either parametric or nonparametric modeling of the conditional quantiles and the weights, but is essentially nonparametric in spirit. The asymptotic properties of the proposed class of estimators are established. Their finite sample properties are illustrated through a set of Monte Carlo experiments and an empirical application.
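
To make the quantile-integral representation concrete, the following minimal sketch approximates a lower-tail expected shortfall, (1/α)∫₀^α Q(u) du, by averaging empirical quantiles on a grid over (0, α). It makes several simplifying assumptions that are not from the paper: it is unconditional (no covariates), it uses the plain empirical quantile function, and the names `empirical_quantile`, `expected_shortfall`, and the `weights` argument are hypothetical. Equal weights correspond to the simple analog estimator; passing other weights only hints at the weighted class described above.

```python
import math


def empirical_quantile(sorted_xs, u):
    """Smallest observation x such that the empirical CDF at x is >= u."""
    n = len(sorted_xs)
    k = max(1, math.ceil(u * n))
    return sorted_xs[k - 1]


def expected_shortfall(xs, alpha, grid=1000, weights=None):
    """Approximate (1/alpha) * integral_0^alpha Q(u) du on a midpoint grid.

    weights: optional list of `grid` weights summing to 1 (assumption:
    equal weights, i.e. the simple analog estimator, when omitted).
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must lie in (0, 1]")
    sorted_xs = sorted(xs)
    # Midpoints of `grid` equal-width cells covering (0, alpha).
    us = [alpha * (j + 0.5) / grid for j in range(grid)]
    qs = [empirical_quantile(sorted_xs, u) for u in us]
    if weights is None:
        weights = [1.0 / grid] * grid
    return sum(w * q for w, q in zip(weights, qs))


# Hypothetical data: the values 1..100; the worst 5% are 1, 2, 3, 4, 5.
es = expected_shortfall(list(range(1, 101)), alpha=0.05)
print(es)
```

A conditional version would replace `empirical_quantile` with fitted conditional quantiles, for example from quantile regression, evaluated at the covariate value of interest.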

15.
We address the problem of estimating discrete, continuous, and conditional joint densities online, i.e., the algorithm is only provided the current example and its current estimate for its update. The family of proposed online density estimators, estimation of densities online (EDO), uses classifier chains to model dependencies among features, where each classifier in the chain estimates the probability of one particular feature. Because a single chain may not provide a reliable estimate, we also consider ensembles of classifier chains and ensembles of weighted classifier chains. For all density estimators, we provide consistency proofs and propose algorithms to perform certain inference tasks. The empirical evaluation of the estimators is conducted in several experiments and on datasets of up to several millions of instances. In the discrete case, we compare our estimators to density estimates computed by Bayesian structure learners. In the continuous case, we compare them to a state-of-the-art online density estimator. Our experiments demonstrate that, even though designed to work online, EDO delivers estimators of competitive accuracy compared to other density estimators (batch Bayesian structure learners on discrete datasets and the state-of-the-art online density estimator on continuous datasets). Besides achieving similar performance in these cases, EDO is also able to estimate densities with mixed types of variables, i.e., discrete and continuous random variables.

16.
We consider rank regression for clustered data analysis and investigate the induced smoothing method for obtaining the asymptotic covariance matrices of the parameter estimators. We prove that the induced estimating functions are asymptotically unbiased and the resulting estimators are strongly consistent and asymptotically normal. The induced smoothing approach provides an effective way of obtaining asymptotic covariance matrices for between- and within-cluster estimators and for a combined estimator that takes account of within-cluster correlations. We also carry out extensive simulation studies to assess the performance of different estimators. The proposed methodology is substantially faster in computation and more stable in numerical results than the existing methods. We apply the proposed methodology to a dataset from a randomized clinical trial.

17.
We address the sequence classification problem using a probabilistic model based on hidden Markov models (HMMs). In contrast to commonly-used likelihood-based learning methods such as the joint/conditional maximum likelihood estimator, we introduce a discriminative learning algorithm that focuses on class margin maximization. Our approach has two main advantages: (i) As an extension of support vector machines (SVMs) to sequential, non-Euclidean data, the approach inherits benefits of margin-based classifiers, such as the provable generalization error bounds. (ii) Unlike many algorithms based on non-parametric estimation of similarity measures that enforce weak constraints on the data domain, our approach utilizes the HMM’s latent Markov structure to regularize the model in the high-dimensional sequence space. We demonstrate significant improvements in classification performance of the proposed method in an extensive set of evaluations on time-series sequence data that frequently appear in data mining and computer vision domains.

18.
The conditional likelihood approach is a sensible choice for a hierarchical logistic regression model or other generalized regression models with binary data. However, its heavy computational burden limits its use, especially for the related mixed-effects model. A modified profile likelihood is used as an accurate approximation to conditional likelihood, and then the use of two methods for inferences for the hierarchical generalized regression models with mixed effects is proposed. One is based on a hierarchical likelihood and Laplace approximation method, and the other is based on a Markov chain Monte Carlo EM algorithm. The methods are applied to a meta-analysis model for trend estimation and the model for multi-arm trials. A simulation study is conducted to illustrate the performance of the proposed methods.

19.
When the selected parametric model for the covariance structure is far from the true one, the corresponding covariance estimator could have considerable bias. To balance the variability and bias of the covariance estimator, we employ a nonparametric method. In addition, as different mean structures may lead to different estimators of the covariance matrix, we choose a semiparametric model for the mean so as to provide a stable estimate of the covariance matrix. Based on the modified Cholesky decomposition of the covariance matrix, we construct the joint mean-covariance model by modeling the smooth functions using the spline method and estimate the associated parameters using the maximum likelihood approach. A simulation study and a real data analysis are conducted to illustrate the proposed approach and demonstrate the flexibility of the suggested model.
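
For readers unfamiliar with the modified Cholesky decomposition named in this abstract, the sketch below computes it for a small covariance matrix: a unit lower triangular matrix T and a diagonal matrix D such that T Σ Tᵀ = D. This is only the standard decomposition in pure Python, not the spline-based joint mean-covariance model of the paper; the function names and the example matrix are hypothetical.

```python
import math


def cholesky(a):
    """Lower triangular L with a = L L^T, for a symmetric positive definite a."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                d = a[i][i] - s
                if d <= 0.0:
                    raise ValueError("matrix is not positive definite")
                low[i][j] = math.sqrt(d)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


def modified_cholesky(sigma):
    """Return (T, d): T unit lower triangular, d the diagonal of D,
    such that T * sigma * T^T = diag(d)."""
    n = len(sigma)
    low = cholesky(sigma)
    d = [low[i][i] ** 2 for i in range(n)]
    # C = L * diag(L)^{-1} is unit lower triangular and sigma = C D C^T.
    c = [[low[i][j] / low[j][j] for j in range(n)] for i in range(n)]
    # T = C^{-1}, obtained column by column with forward substitution.
    t = [[0.0] * n for _ in range(n)]
    for col in range(n):
        t[col][col] = 1.0
        for i in range(col + 1, n):
            t[i][col] = -sum(c[i][k] * t[k][col] for k in range(col, i))
    return t, d


# Hypothetical 3x3 covariance matrix, for illustration only.
sigma = [[4.0, 2.0, 0.4],
         [2.0, 2.0, 0.5],
         [0.4, 0.5, 1.0]]
T, D = modified_cholesky(sigma)
print(T)
print(D)
```

In the usual reading of this decomposition for ordered (for example longitudinal) measurements, the negatives of the below-diagonal entries of T act as autoregressive coefficients and the entries of D as innovation variances, which is what makes both unconstrained and convenient to model.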

20.
A comparative study is presented regarding the performance of commonly used estimators of the fractional order of integration when data is contaminated by noise. In particular, measurement errors, additive outliers, temporary change outliers, and structural change outliers are addressed. It turns out that when the sample size is not too large, as is frequently the case for macroeconomic data, non-persistent noise will generally bias the estimators of the memory parameter downwards. On the other hand, relatively more persistent noise like temporary change outliers and structural changes can have the opposite effect and thus bias the fractional parameter upwards. Surprisingly, with respect to the relative performance of the various estimators, the parametric conditional maximum likelihood estimator with modelling of the short run dynamics clearly outperforms the semiparametric estimators in the presence of noise that is not too persistent. However, allowing for a non-zero mean may reverse this conclusion.
