Similar Literature
1.
The maximum likelihood estimator (MLE) has commonly been used to estimate the unknown parameters in a finite mixture of distributions. However, the MLE can be very sensitive to outliers in the data. To overcome this, the trimmed likelihood estimator (TLE) is proposed to estimate mixtures in a robust way. The superiority of this approach over the MLE is illustrated by examples and simulation studies. Moreover, as a prominent measure of robustness, the breakdown point (BDP) of the TLE for the mixture component parameters is characterized. The relationship of the TLE to various other approaches that have incorporated robustness in fitting mixtures and clustering is also discussed in this context.
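As an illustration of the trimming idea, the sketch below fits a Gaussian mixture to the subset of observations with the highest fitted log-density and iterates until the trimmed set stabilises. It assumes scikit-learn's GaussianMixture and uses hypothetical simulated data; it is a minimal sketch of the trimming principle, not the paper's exact TLE algorithm or its breakdown-point analysis.

```python
# Minimal sketch of the trimmed-likelihood idea for a Gaussian mixture:
# repeatedly fit on the h observations with the highest fitted log-density,
# discarding the rest as potential outliers. Illustration only.
import numpy as np
from sklearn.mixture import GaussianMixture

def trimmed_mixture_fit(x, n_components=2, trim_frac=0.1, n_iter=20, seed=0):
    x = np.asarray(x).reshape(-1, 1)
    h = int(np.ceil((1.0 - trim_frac) * len(x)))   # number of retained points
    keep = np.arange(len(x))                        # start with all observations
    gm = None
    for _ in range(n_iter):
        gm = GaussianMixture(n_components=n_components, random_state=seed)
        gm.fit(x[keep])
        loglik = gm.score_samples(x)                # per-point log-density
        new_keep = np.argsort(loglik)[-h:]          # keep the h best-fitting points
        if set(new_keep) == set(keep):              # trimming set has stabilised
            break
        keep = new_keep
    return gm, np.sort(keep)

# Hypothetical example: two well-separated components plus a few gross outliers.
rng = np.random.default_rng(1)
data = np.concatenate([rng.normal(0, 1, 200),
                       rng.normal(8, 1, 200),
                       rng.uniform(40, 50, 10)])    # outliers
model, kept = trimmed_mixture_fit(data)
print(model.means_.ravel())
```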

2.
In this paper, we first discuss the origin of, developments in, and various researchers' views on the generalized linear regression estimator (GREG) due to Deville and Särndal [Deville, J.C., Särndal, C.E., 1992. Calibration estimators in survey sampling. J. Amer. Statist. Assoc. 87, 376-382]. Then, the problem of estimation of the general parameter of interest considered by Rao [Rao, J.N.K., 1994. Estimating totals and distribution functions using auxiliary information at the estimation stage. J. Official Statist. 10 (2), 153-165] and Singh [Singh, S., 2001. Generalized calibration approach for estimating the variance in survey sampling. Ann. Inst. Statist. Math. 53 (2), 404-417; Singh, S., 2004. Golden and Silver Jubilee Year-2003 of the linear regression estimators. In: Proceedings of the Joint Statistical Meeting, Toronto (Available on the CD), 4382-4380; Singh, S., 2006. Survey statisticians celebrate Golden Jubilee Year-2003 of the linear regression estimator. Metrika 1-18] is further investigated. In addition, it is shown that the Farrell and Singh [Farrell, P.J., Singh, S., 2005. Model-assisted higher order calibration of estimators of variance. Australian & New Zealand J. Statist. 47 (3), 375-383] estimators are a special case of the proposed methodology. Interestingly, it is noted that the single model-assisted calibration constraint studied by Farrell and Singh [Farrell, P.J., Singh, S., 2002. Re-calibration of higher order calibration weights. Presented at Statistical Society of Canada conference, Hamilton (Available on CD); Farrell, P.J., Singh, S., 2005. Model-assisted higher order calibration of estimators of variance. Australian & New Zealand J. Statist. 47 (3), 375-383] and Wu [Wu, C., 2003. Optimal calibration estimators in survey sampling. Biometrika 90, 937-951] is not helpful for calibrating the Sen [Sen, A.R., 1953. On the estimate of the variance in sampling with varying probabilities. J. Indian Soc. Agril. Statist. 5, 119-127] and Yates and Grundy [Yates, F., Grundy, P.M., 1953. Selection without replacement from within strata with probability proportional to size. J. Roy. Statist. Soc. Ser. B 15, 253-261] estimator of the variance of the linear regression estimator under the optimal designs of Godambe and Joshi [Godambe, V.P., Joshi, V.M., 1965. Admissibility and Bayes estimation in sampling finite populations—I. Ann. Math. Statist. 36, 1707-1722]. Three new estimators of the variance of the proposed linear regression-type estimator of the general parameters of interest are introduced and compared with each other. The newly proposed two-dimensional linear regression models, unlike a simulation based on a few thousand random samples, are found to be useful in comparing the estimators of variance. Using knowledge of the model parameters to assist the estimators of variance is found to be beneficial. The most attractive feature is that the proposed method of calibration is shown theoretically to always remain more efficient than the GREG estimator.
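A minimal sketch of the classical chi-squared-distance calibration that underlies the Deville and Särndal GREG estimator is given below: design weights d_i are adjusted to w_i = d_i(1 + x_i'λ) so that the weighted auxiliary totals match known population totals. The data, design weights, and the "known" total are hypothetical; none of the paper's new variance estimators or two-dimensional regression models are reproduced here.

```python
# Sketch of chi-squared-distance calibration in the spirit of Deville and
# Sarndal (1992): minimise sum_i (w_i - d_i)^2 / d_i subject to the calibration
# constraint sum_i w_i x_i = known totals, giving w_i = d_i (1 + x_i' lam).
import numpy as np

def calibration_weights(d, X, totals):
    """d: design weights (n,), X: auxiliary variables (n, p), totals: known (p,)."""
    d = np.asarray(d, float)
    X = np.asarray(X, float)
    T = X.T @ (d[:, None] * X)                  # sum_i d_i x_i x_i'
    lam = np.linalg.solve(T, totals - X.T @ d)  # calibration equations
    return d * (1.0 + X @ lam)                  # calibrated weights

# Hypothetical survey data with one auxiliary variable.
rng = np.random.default_rng(0)
n = 50
x = rng.uniform(1, 10, size=(n, 1))             # auxiliary variable
y = 2.0 * x[:, 0] + rng.normal(0, 1, n)         # study variable
d = np.full(n, 4.0)                             # equal design weights (implied N = 200)
X_total = np.array([d.sum() * 6.0])             # assumed known population total of x
w = calibration_weights(d, x, X_total)
print("calibrated x-total:", w @ x[:, 0], "target:", X_total[0])
print("GREG-type estimate of the y-total:", w @ y)
```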

3.
The assumption of equal variance in the normal regression model is not always appropriate. Cook and Weisberg (1983) provide a score test to detect heteroscedasticity, while Patterson and Thompson (1971) propose residual maximum likelihood (REML) estimation to estimate variance components in the context of an unbalanced incomplete-block design. REML is often preferred to maximum likelihood estimation as a method of estimating covariance parameters in a linear model. However, outliers may have some effect on the estimate of the variance function. This paper incorporates maximum trimmed likelihood estimation (Hadi and Luceño, 1997; Vandev and Neykov, 1998) into REML to obtain robust estimates when modelling variance heterogeneity. Both the forward search algorithm of Atkinson (1994) and the fast algorithm of Neykov et al. (2007) are employed to find the resulting estimator. Simulation and real data examples are used to illustrate the performance of the proposed approach.

4.
When fitting models to data containing multiple structures, such as when fitting surface patches to data taken from a neighborhood that includes a range discontinuity, robust estimators must tolerate both gross outliers and pseudo outliers. Pseudo outliers are outliers to the structure of interest but inliers to a different structure; they differ from gross outliers because of their coherence. Such data occur frequently in computer vision problems, including motion estimation, model fitting, and range data analysis. The focus in this paper is the problem of fitting surfaces near discontinuities in range data. To characterize the performance of least median of squares, least trimmed squares, M-estimators, Hough transforms, RANSAC, and MINPRAN on this type of data, the “pseudo outlier bias” metric is developed using techniques from the robust statistics literature, and it is used to study the error in robust fits caused by distributions modeling various types of discontinuities. The results show each robust estimator to be biased at small, but substantial, discontinuities. They also show the circumstances under which different estimators are most effective. Most importantly, the results imply that present estimators should be used with care and that new estimators should be developed.
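The sketch below illustrates one of the robust estimators compared above, least median of squares, fitting a line by random minimal-subset sampling to data containing a step discontinuity, so that points from the second structure act as pseudo outliers for the first. The data and trial count are hypothetical; it is not the paper's implementation or its pseudo outlier bias metric.

```python
# Minimal sketch of least-median-of-squares (LMedS) line fitting by random
# sampling of two-point subsets; the fit with the smallest median squared
# residual is retained. Illustrative only.
import numpy as np

def lmeds_line(x, y, n_trials=500, seed=0):
    rng = np.random.default_rng(seed)
    best, best_med = None, np.inf
    for _ in range(n_trials):
        i, j = rng.choice(len(x), size=2, replace=False)  # minimal 2-point sample
        if x[i] == x[j]:
            continue
        slope = (y[j] - y[i]) / (x[j] - x[i])
        intercept = y[i] - slope * x[i]
        med = np.median((y - (slope * x + intercept)) ** 2)  # median squared residual
        if med < best_med:
            best_med, best = med, (slope, intercept)
    return best, best_med

# Two planar structures meeting at a step discontinuity: points from the
# second structure act as "pseudo outliers" for the first.
rng = np.random.default_rng(1)
x = np.linspace(0, 10, 200)
y = np.where(x < 6, 1.0 * x, 1.0 * x + 15) + rng.normal(0, 0.2, x.size)
(slope, intercept), med = lmeds_line(x, y)
print(slope, intercept, med)
```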

5.
The problem of estimating the width of the symmetric uniform distribution on the line when data are measured with normal additive error is considered. The main purpose is to discuss the efficiency of the maximum likelihood estimator and the moment method estimator. It is shown that the model is regular and that the maximum likelihood estimator is more efficient than the moment method estimator. A sufficient condition is also given for the existence of both estimators.
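A worked sketch of the two estimators is given below for the simplified case where the error standard deviation σ is treated as known: the convolution density f(x; a) = [Φ((x+a)/σ) − Φ((x−a)/σ)]/(2a) is maximised numerically for the MLE, and the moment estimator is obtained from Var(X) = a²/3 + σ². The simulated data are hypothetical, and the sketch does not reproduce the paper's efficiency analysis.

```python
# Compare the MLE and the moment estimator of the half-width a of a
# Uniform(-a, a) signal observed with known N(0, sigma^2) measurement error.
import numpy as np
from scipy.stats import norm
from scipy.optimize import minimize_scalar

def neg_loglik(a, x, sigma):
    # Density of U + eps, U ~ Uniform(-a, a), eps ~ N(0, sigma^2).
    dens = (norm.cdf((x + a) / sigma) - norm.cdf((x - a) / sigma)) / (2.0 * a)
    return -np.sum(np.log(np.maximum(dens, 1e-300)))

rng = np.random.default_rng(0)
a_true, sigma, n = 2.0, 0.5, 500
x = rng.uniform(-a_true, a_true, n) + rng.normal(0, sigma, n)

# Maximum likelihood estimate of the half-width a
res = minimize_scalar(neg_loglik, bounds=(1e-3, 10.0), args=(x, sigma), method="bounded")
a_mle = res.x

# Moment estimator: Var(X) = a^2/3 + sigma^2
a_mom = np.sqrt(3.0 * max(x.var(ddof=1) - sigma**2, 0.0))
print("MLE:", a_mle, "moment:", a_mom)
```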

6.
In this paper, we consider the beta regression model recently proposed by Ferrari and Cribari-Neto [2004. Beta regression for modeling rates and proportions. J. Appl. Statist. 31, 799-815], which is tailored to situations where the response is restricted to the standard unit interval and the regression structure involves regressors and unknown parameters. We derive the second-order biases of the maximum likelihood estimators and use them to define bias-adjusted estimators. As an alternative to the two analytically bias-corrected estimators discussed, we consider a bias correction mechanism based on the parametric bootstrap. The numerical evidence favors the bootstrap-based estimator and also one of the analytically corrected estimators. Several different strategies for interval estimation are also proposed. We present an empirical application.
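The parametric-bootstrap bias-correction mechanism mentioned above can be sketched generically: refit the estimator on samples simulated from the fitted model, estimate the bias as the average bootstrap estimate minus the original estimate, and subtract it. The sketch below deliberately uses a much simpler model (the MLE of a normal variance, whose bias is known) rather than a beta regression, so it only illustrates the mechanism, not the paper's estimators.

```python
# Generic parametric bootstrap bias correction, illustrated on the MLE of a
# normal variance (which divides by n and is therefore downward biased).
import numpy as np

def bootstrap_bias_corrected_var(x, B=2000, seed=0):
    rng = np.random.default_rng(seed)
    n = x.size
    mu_hat, var_hat = x.mean(), x.var()                # MLE of mean and variance
    boot = np.empty(B)
    for b in range(B):
        xb = rng.normal(mu_hat, np.sqrt(var_hat), n)   # simulate from the fitted model
        boot[b] = xb.var()                             # re-estimate on each sample
    bias_hat = boot.mean() - var_hat                   # estimated bias of the MLE
    return var_hat - bias_hat                          # bias-adjusted estimate

rng = np.random.default_rng(1)
x = rng.normal(0.0, 2.0, 30)                           # true variance = 4
print("MLE:", x.var(), "bootstrap-corrected:", bootstrap_bias_corrected_var(x))
```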

7.
In this article, two semiparametric approaches are developed for analyzing randomized response data with missing covariates in a logistic regression model. One of the two proposed estimators is an extension of the validation likelihood estimator of Breslow and Cain [Breslow, N.E., and Cain, K.C. 1988. Logistic regression for two-stage case-control data. Biometrika 75, 11-20]. The other is a joint conditional likelihood estimator based on both validation and non-validation data sets. We present a large sample theory for the proposed estimators. Simulation results show that the joint conditional likelihood estimator is more efficient than the validation likelihood estimator, weighted estimator, complete-case estimator and partial likelihood estimator. We also illustrate the methods using data from a cable TV study.

8.
It is now well known that the minimum Hellinger distance estimation approach introduced by Beran (Beran, R., 1977. Minimum Hellinger distance estimators for parametric models. Ann. Statist. 5, 445-463) produces estimators that achieve efficiency at the model density and simultaneously have excellent robustness properties. However, computational difficulties and algorithmic convergence problems associated with this method have hampered its application in practice, particularly when the method is applied to models with high-dimensional parameter spaces. A one-step minimum Hellinger distance (MHD) procedure is investigated in this paper to overcome the computational drawbacks of the fully iterative MHD method. The idea is to start with an initial estimator and then take a single Newton-Raphson step on the equation related to the Hellinger distance. The resulting estimator can be considered a one-step MHD estimator. We show that the proposed one-step MHD estimator has the same asymptotic behavior as the MHD estimator, as long as the initial estimator is reasonably good. Furthermore, our theoretical and numerical studies demonstrate that the proposed one-step MHD estimator also retains the excellent robustness properties of the MHD estimators. A real data example is analyzed as well.
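The one-step idea can be sketched for the simplest case of a normal location parameter: start from a robust initial estimate and take a single Newton-Raphson step on a numerically computed Hellinger distance between a kernel density estimate and the model density. The grid, finite-difference step and data below are hypothetical, and the sketch is not the paper's general construction.

```python
# One-step minimum Hellinger distance update for a N(theta, 1) location model,
# starting from the sample median and using numerical derivatives on a grid.
import numpy as np
from scipy.stats import gaussian_kde, norm

def one_step_mhd_location(x, h=1e-3):
    grid = np.linspace(x.min() - 3.0, x.max() + 3.0, 2000)
    dx = grid[1] - grid[0]
    f_hat = np.sqrt(gaussian_kde(x)(grid))                 # sqrt of the kernel density

    def hellinger(theta):                                  # squared Hellinger distance
        return np.sum((f_hat - np.sqrt(norm.pdf(grid, theta, 1.0))) ** 2) * dx

    theta0 = np.median(x)                                  # robust starting value
    d1 = (hellinger(theta0 + h) - hellinger(theta0 - h)) / (2 * h)
    d2 = (hellinger(theta0 + h) - 2 * hellinger(theta0) + hellinger(theta0 - h)) / h**2
    return theta0 - d1 / d2                                # one Newton-Raphson step

rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(5.0, 1.0, 200), np.full(10, 30.0)])  # 5% gross outliers
print("sample mean:", x.mean(), "one-step MHD:", one_step_mhd_location(x))
```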

9.
In this study, a generalized method of moments (GMM) for the estimation of nonstationary vector autoregressive models with cointegration is considered. Two iterative methods are studied: a simultaneous estimation method and a switching estimation method. The asymptotic properties of the GMM estimators of these methods are found to be the same as those of the Gaussian reduced-rank estimator. Through Monte Carlo simulation, the small-sample properties of the GMM estimators are studied and compared with those of the Gaussian reduced-rank estimator and the maximum likelihood estimator considered by other researchers. In the case of small samples, the GMM estimators are more robust to deviations from normality assumptions, particularly to outliers.

10.
The objective of this paper is to develop a robust maximum likelihood estimation (MLE) for the stochastic state space model via the expectation maximisation algorithm to cope with observation outliers. Two types of outliers and their influence are studied in this paper: namely, the additive outlier (AO) and the innovative outlier (IO). Due to the sensitivity of the MLE to AO and IO, we propose two techniques for robustifying the MLE: the weighted maximum likelihood estimation (WMLE) and the trimmed maximum likelihood estimation (TMLE). The WMLE is easy to implement with weights estimated from the data; however, it is still sensitive to IO and to a patch of AO outliers. On the other hand, the TMLE reduces to a combinatorial optimisation problem and is hard to implement, but it is effective against both types of outliers considered here. To overcome this difficulty, we apply a parallel randomised algorithm that has a low computational cost. Monte Carlo simulation results show the efficiency of the proposed algorithms.

11.
Advances in computing power enable more widespread use of the mode, which is a natural measure of central tendency since it is not influenced by the tails in the distribution. The properties of the half-sample mode, which is a simple and fast estimator of the mode of a continuous distribution, are studied. The half-sample mode is less sensitive to outliers than most other estimators of location, including many other low-bias estimators of the mode. Its breakdown point is one half, equal to that of the median. However, because of its finite rejection point, the half-sample mode is much less sensitive to outliers that are all either greater or less than the other values of the sample. This is confirmed by applying the mode estimator and the median to samples drawn from normal, lognormal, and Pareto distributions contaminated by outliers. It is also shown that the half-sample mode, in combination with a robust scale estimator, is a highly robust starting point for iterative robust location estimators such as Huber's M-estimator. The half-sample mode can easily be generalized to modal intervals containing more or less than half of the sample. An application of such an estimator to the finding of collision points in high-energy proton–proton interactions is presented.
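A minimal sketch of the half-sample mode itself is given below: sort the sample, repeatedly keep the contiguous half with the smallest range, and stop when one or two points remain. The contaminated lognormal example is hypothetical and only illustrates the one-sided-outlier behaviour described above.

```python
# Half-sample mode: recursively retain the shortest contiguous half-sample of
# the sorted data until one or two points remain, then average them.
import numpy as np

def half_sample_mode(x):
    x = np.sort(np.asarray(x, float))
    while x.size > 2:
        h = (x.size + 1) // 2                       # size of the retained half-sample
        widths = x[h - 1:] - x[:x.size - h + 1]     # ranges of all contiguous half-samples
        start = int(np.argmin(widths))              # shortest half-sample wins
        x = x[start:start + h]
    return x.mean()

rng = np.random.default_rng(0)
clean = rng.lognormal(mean=0.0, sigma=0.5, size=1000)
contaminated = np.concatenate([clean, np.full(100, 50.0)])  # one-sided outliers
print("median:", np.median(contaminated),
      "half-sample mode:", half_sample_mode(contaminated))
```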

12.
In this paper, under a semiparametric partly linear regression model with fixed design, we introduce a family of robust procedures to select the bandwidth parameter. The robust plug-in proposal is based on nonparametric robust estimates of the νth derivatives and, under mild conditions, it converges to the optimal bandwidth. A robust cross-validation bandwidth is also considered, and the performance of the different proposals is compared through a Monte Carlo study. We define an empirical influence measure for data-driven bandwidth selectors and, through it, we study the sensitivity of the data-driven bandwidth selectors. It appears that the robust selector compares favorably to its classical competitor, despite the need to select a pilot bandwidth when considering plug-in bandwidths. Moreover, the plug-in procedure seems to be less sensitive than cross-validation, in particular when several outliers are introduced. When combined with the three-step procedure proposed by Bianco and Boente [2004. Robust estimators in semiparametric partly linear regression models. J. Statist. Plann. Inference 122, 229-252], the robust selectors lead to robust data-driven estimates of both the regression function and the regression parameter.
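As a simplified illustration of the robust cross-validation idea (outside the partly linear setting of the paper), the sketch below selects the bandwidth of a Nadaraya–Watson smoother by minimising the median absolute leave-one-out error instead of a squared-error criterion. The kernel, bandwidth grid, and data are hypothetical.

```python
# Robust (absolute-error) leave-one-out cross-validation for a Gaussian-kernel
# Nadaraya-Watson smoother: choose the bandwidth minimising the median absolute
# leave-one-out prediction error.
import numpy as np

def nw_loo_predictions(x, y, bandwidth):
    """Leave-one-out Nadaraya-Watson fits with a Gaussian kernel."""
    d = (x[:, None] - x[None, :]) / bandwidth
    w = np.exp(-0.5 * d**2)
    np.fill_diagonal(w, 0.0)                       # leave each point out of its own fit
    return (w @ y) / w.sum(axis=1)

def robust_cv_bandwidth(x, y, grid):
    scores = [np.median(np.abs(y - nw_loo_predictions(x, y, h))) for h in grid]
    return grid[int(np.argmin(scores))]            # minimise median absolute LOO error

rng = np.random.default_rng(0)
x = np.sort(rng.uniform(0, 1, 150))
y = np.sin(2 * np.pi * x) + rng.normal(0, 0.2, x.size)
y[::25] += 5.0                                     # a few large outliers
grid = np.linspace(0.02, 0.3, 15)
print("robust CV bandwidth:", robust_cv_bandwidth(x, y, grid))
```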

13.
The Gaussian quasi-maximum likelihood estimator of multivariate GARCH (MGARCH) models is shown to be very sensitive to outliers in the data. A class of robust M-estimators for MGARCH models is developed. To increase the robustness of the estimators, the use of volatility models with the property of bounded innovation propagation is recommended. A Monte Carlo study and an empirical application to stock returns document the good robustness properties of the M-estimator with a fat-tailed Student t loss function.

14.
A comparative study is presented regarding the performance of commonly used estimators of the fractional order of integration when data are contaminated by noise. In particular, measurement errors, additive outliers, temporary change outliers, and structural change outliers are addressed. It turns out that when the sample size is not too large, as is frequently the case for macroeconomic data, non-persistent noise will generally bias the estimators of the memory parameter downwards. On the other hand, relatively more persistent noise, such as temporary change outliers and structural changes, can have the opposite effect and thus bias the fractional parameter upwards. Surprisingly, with respect to the relative performance of the various estimators, the parametric conditional maximum likelihood estimator with modelling of the short-run dynamics clearly outperforms the semiparametric estimators in the presence of noise that is not too persistent. However, when a non-zero mean is allowed for, this conclusion may be reversed.
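For concreteness, the sketch below implements one widely used semiparametric estimator of the memory parameter, a log-periodogram (GPH-type) regression over the first m Fourier frequencies, where the slope of log I(λ_j) on log(4 sin²(λ_j/2)) estimates −d. The bandwidth choice and white-noise example are hypothetical, and the parametric conditional MLE and the contamination experiments of the paper are not reproduced.

```python
# Log-periodogram (GPH-type) estimator of the memory parameter d: regress the
# log periodogram on log(4 sin^2(lambda_j / 2)) over the first m frequencies.
import numpy as np

def gph_estimate(x, m=None):
    x = np.asarray(x, float)
    n = x.size
    if m is None:
        m = int(np.floor(n ** 0.5))                      # a common bandwidth choice
    lam = 2.0 * np.pi * np.arange(1, m + 1) / n          # Fourier frequencies
    dft = np.fft.fft(x - x.mean())[1:m + 1]
    periodogram = (np.abs(dft) ** 2) / (2.0 * np.pi * n)
    reg = np.log(4.0 * np.sin(lam / 2.0) ** 2)
    slope = np.polyfit(reg, np.log(periodogram), 1)[0]
    return -slope                                        # estimate of d

rng = np.random.default_rng(0)
x = rng.normal(size=2000)                                # white noise: true d = 0
print("estimated d for white noise:", gph_estimate(x))
```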

15.
In this paper, a practical robust simulation estimator is proposed for dynamic panel data discrete choice models using the t distribution. The maximum simulated likelihood estimators are obtained through a recursive algorithm formulated with the Geweke–Hajivassiliou–Keane simulators. Monte Carlo experiments indicate that the proposed robust simulation estimators perform well under errors with longer-than-normal tails for a small simulation size, even with the initial conditions problem.

16.
This paper considers the estimation of Kendall's tau for bivariate data (X,Y) when only Y is subject to right-censoring. Although τ is estimable under weak regularity conditions, the estimators proposed by Brown et al. [1974. Nonparametric tests of independence for censored data, with applications to heart transplant studies. Reliability and Biometry, 327-354], Weier and Basu [1980. An investigation of Kendall's τ modified for censored data with applications. J. Statist. Plann. Inference 4, 381-390] and Oakes [1982. A concordance test for independence in the presence of censoring. Biometrics 38, 451-455], which are standard in this context, fail to be consistent when τ≠0 because they only use information from the marginal distributions. An exception is the renormalized estimator of Oakes [2006. On consistency of Kendall's tau under censoring. Technical Report, Department of Biostatistics and Computational Biology, University of Rochester, Rochester, NY], whose consistency has been established for all possible values of τ, but only in the context of the gamma frailty model. Wang and Wells [2000. Estimation of Kendall's tau under censoring. Statist. Sinica 10, 1199-1215] were the first to propose an estimator which accounts for joint information. Four more are developed here: the first three extend the methods of Brown et al. [1974], Weier and Basu [1980] and Oakes [1982] to account for the information provided by X, while the fourth estimator inverts an estimate of Pr(Yi ≤ y | Xi = xi, Yi > ci) to obtain an imputation of the value of Yi censored at Ci = ci. Following Lim [2006. Permutation procedures with censored data. Comput. Statist. Data Anal. 50, 332-345], a nonparametric estimator is also considered which averages the estimates obtained from a large number of possible configurations of the observed data (X1,Z1),…,(Xn,Zn), where Zi = min(Yi,Ci). Simulations are presented which compare these various estimators of Kendall's tau. An illustration involving the well-known Stanford heart transplant data is also presented.

17.
Geometric quantiles are investigated using data collected from a complex survey. Geometric quantiles are an extension of univariate quantiles in a multivariate set-up that uses the geometry of multivariate data clouds. A very important application of geometric quantiles is the detection of outliers in multivariate data by means of quantile contours. A design-based estimator of geometric quantiles is constructed and used to compute quantile contours in order to detect outliers in both multivariate data and survey sampling set-ups. An algorithm for computing geometric quantile estimates is also developed. Under broad assumptions, the asymptotic variance of the quantile estimator is derived and a consistent variance estimator is proposed. Theoretical results are illustrated with simulated and real data.
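A plain i.i.d. sketch of the geometric quantile itself is given below, using a Weiszfeld-type fixed-point iteration for the estimating equation Σ_i (q − x_i)/‖q − x_i‖ = n·u with ‖u‖ < 1 (u = 0 gives the spatial median). The starting value, tolerance, and simulated data are hypothetical, and the paper's design-based, survey-weighted estimator and its variance estimator are not shown.

```python
# Geometric quantile Q(u) computed by a Weiszfeld-type fixed-point iteration:
# q_{k+1} = (sum_i x_i / r_i + n * u) / sum_i 1 / r_i,  r_i = ||q_k - x_i||.
import numpy as np

def geometric_quantile(X, u, n_iter=200, tol=1e-8):
    X = np.asarray(X, float)
    u = np.asarray(u, float)
    q = X.mean(axis=0)                              # starting value
    for _ in range(n_iter):
        r = np.linalg.norm(X - q, axis=1)
        r = np.maximum(r, 1e-12)                    # guard against division by zero
        w = 1.0 / r
        q_new = (X.T @ w + X.shape[0] * u) / w.sum()
        if np.linalg.norm(q_new - q) < tol:
            return q_new
        q = q_new
    return q

rng = np.random.default_rng(0)
X = rng.multivariate_normal([0, 0], [[1, 0.5], [0.5, 1]], size=500)
print("spatial median:", geometric_quantile(X, [0.0, 0.0]))
print("extreme quantile, u = (0.9, 0):", geometric_quantile(X, [0.9, 0.0]))
```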

18.
A finite mixture of gamma distributions [Finite mixture of certain distributions. Comm. Statist. Theory Methods 31(12), 2123-2137] is used as a conjugate prior, which gives a convenient form for the posterior distribution. This class of conjugate priors offers a more flexible class of priors than the class of gamma prior distributions. The usefulness of a mixture gamma-type prior and the posterior of the uncertain parameter λ of the Poisson distribution are illustrated using a Markov chain Monte Carlo (MCMC) Gibbs sampling approach on hierarchical models. Using the generalized hypergeometric function, a method to approximate the maximum likelihood estimators for the parameters of the Agarwal and Al-Saleh [Generalized gamma type distribution and its hazard rate function. Comm. Statist. Theory Methods 30(2), 309-318] generalized gamma-type distribution is also suggested.
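The conjugacy being exploited can be sketched in closed form: with Poisson(λ) counts and a prior that is a finite mixture of Gamma(a_k, b_k) densities (rate parameterisation), the posterior is again a gamma mixture with parameters a_k + Σx_i and b_k + n, with weights reweighted by each component's marginal likelihood. The prior settings and counts below are hypothetical, and the MCMC/Gibbs treatment of the hierarchical model is not reproduced.

```python
# Closed-form posterior update for a Poisson likelihood with a finite mixture
# of Gamma(a_k, b_k) priors (rate parameterisation): component parameters become
# (a_k + sum x, b_k + n) and weights are reweighted by marginal likelihoods.
import numpy as np
from scipy.special import gammaln

def gamma_mixture_posterior(x, weights, a, b):
    x = np.asarray(x, float)
    w, a, b = map(lambda v: np.asarray(v, float), (weights, a, b))
    n, s = x.size, x.sum()
    a_post, b_post = a + s, b + n
    # log marginal likelihood of each component (up to a constant common to all)
    log_m = a * np.log(b) - gammaln(a) + gammaln(a_post) - a_post * np.log(b_post)
    log_w = np.log(w) + log_m
    w_post = np.exp(log_w - log_w.max())
    return w_post / w_post.sum(), a_post, b_post

x = np.array([3, 5, 4, 6, 2, 5])                       # hypothetical Poisson counts
w_post, a_post, b_post = gamma_mixture_posterior(x, [0.5, 0.5], [2.0, 20.0], [1.0, 4.0])
post_mean = np.sum(w_post * a_post / b_post)           # posterior mean of lambda
print("posterior weights:", w_post, "posterior mean:", post_mean)
```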

19.
A robust estimator for the tail index of Pareto-type distributions
In extreme value statistics, the extreme value index is a well-known parameter to measure the tail heaviness of a distribution. Pareto-type distributions, with strictly positive extreme value index (or tail index), are considered. The most prominent extreme value methods are constructed on efficient maximum likelihood estimators based on specific parametric models which are fitted to excesses over large thresholds. Maximum likelihood estimators, however, are often not very robust, which makes them sensitive to a few particular observations. Even in extreme value statistics, where the most extreme data usually receive most attention, this can constitute a serious problem. The problem is illustrated on a real data set from geopedology, in which a few abnormal soil measurements highly influence the estimates of the tail index. In order to overcome this problem, a robust estimator of the tail index is proposed, by combining a refinement of the Pareto approximation for the conditional distribution of relative excesses over a large threshold with an integrated squared error approach on partial density component estimation. It is shown that the influence function of this newly proposed estimator is bounded, and through several simulations it is illustrated that it performs reasonably well on contaminated as well as uncontaminated data.
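For contrast with the robust proposal, the sketch below implements the classical Hill estimator, the maximum-likelihood-type tail index estimator based on the k largest log-excesses, and shows how a few abnormal observations can distort it. The sample sizes, threshold choice, and contamination are hypothetical, and the robust estimator itself is not reproduced.

```python
# Classical (non-robust) Hill estimator of the extreme value index for
# Pareto-type data, using the k largest observations as excesses over the
# (k+1)-th largest order statistic.
import numpy as np

def hill_estimator(x, k):
    """Hill estimate of the extreme value index using the k largest observations."""
    x = np.sort(np.asarray(x, float))
    tail = x[-k:]                                   # k largest order statistics
    threshold = x[-k - 1]                           # (k+1)-th largest as threshold
    return np.mean(np.log(tail) - np.log(threshold))

rng = np.random.default_rng(0)
clean = (1.0 - rng.uniform(size=2000)) ** -0.5      # Pareto tail, true index = 0.5
contaminated = np.concatenate([clean, np.full(20, 1e4)])  # a few abnormal measurements
print("Hill, clean data:       ", hill_estimator(clean, k=200))
print("Hill, contaminated data:", hill_estimator(contaminated, k=200))
```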
