
1.

An evolutionary algorithm for robust regression

Robin Nunkesser Oliver Morell 《Computational statistics & data analysis》2010,54(12):3242-3248

A drawback of robust statistical techniques is the increased computational effort often needed compared to non-robust methods. In particular, robust estimators possessing the exact fit property are NP-hard to compute. This means that, under the widely believed assumption that the computational complexity classes NP and P are not equal, there is no hope of computing exact solutions for large high-dimensional data sets. To tackle this problem, search heuristics are used to compute NP-hard estimators in high dimensions. A new evolutionary algorithm that is applicable to different robust estimators is presented. Further, variants of this evolutionary algorithm for selected estimators (most prominently least trimmed squares and least median of squares) are introduced and shown to outperform existing popular search heuristics in difficult data situations. The results increase the applicability of robust methods and underline the usefulness of evolutionary algorithms for computational statistics.
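To make the least trimmed squares (LTS) criterion named in this abstract concrete, here is a minimal sketch for a straight-line fit. It is not the evolutionary algorithm of the paper: it simply scores every line through a pair of data points by the sum of the h smallest squared residuals, which is only feasible for small samples. The function names `lts_objective` and `fit_line_lts` and the choice h = n // 2 + 1 are assumptions made for illustration.

```python
from itertools import combinations


def lts_objective(residuals, h):
    """Sum of the h smallest squared residuals (the LTS criterion)."""
    squared = sorted(r * r for r in residuals)
    return sum(squared[:h])


def fit_line_lts(points, h=None):
    """Brute-force LTS line fit over all lines through two data points.

    Illustrative only: cost grows as O(n^3 log n), and the true LTS
    optimum need not pass through two data points exactly.
    """
    if h is None:
        h = len(points) // 2 + 1  # assumed coverage: a bare majority
    best = None
    for (x1, y1), (x2, y2) in combinations(points, 2):
        if x1 == x2:
            continue  # vertical line, not expressible as y = a + b*x
        slope = (y2 - y1) / (x2 - x1)
        intercept = y1 - slope * x1
        residuals = [y - (intercept + slope * x) for x, y in points]
        score = lts_objective(residuals, h)
        if best is None or score < best[0]:
            best = (score, intercept, slope)
    return best  # (objective value, intercept, slope)


# 14 points on y = 1 + 2x plus 6 clustered outliers at y = 40
inliers = [(x, 1 + 2 * x) for x in range(14)]
outliers = [(x, 40) for x in range(6)]
score, intercept, slope = fit_line_lts(inliers + outliers)
```

Because the criterion ignores the largest residuals, the six clustered outliers in the usage example do not pull the fitted line away from the majority of the data.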

2.

Weighted and robust archetypal analysis

Manuel J.A. Eugster Friedrich Leisch 《Computational statistics & data analysis》2011,55(3):1215-1225

Archetypal analysis represents observations in a multivariate data set as convex combinations of a few extremal points lying on the boundary of the convex hull. Data points which vary from the majority have great influence on the solution; in fact one outlier can break down the archetype solution. The original algorithm is adapted to be a robust M-estimator and an iteratively reweighted least squares fitting algorithm is presented. As a required first step, the weighted archetypal problem is formulated and solved. The algorithm is demonstrated using an artificial example, a real world example and a detailed simulation study.

3.

Some edge correction methods for marked spatio-temporal point process models

Ottmar Cronie Aila Särkkä 《Computational statistics & data analysis》2011,55(7):2209-2220

Three edge correction methods for (marked) spatio-temporal point processes are proposed. They are all based on the idea of placing an approximated expected behaviour of the process at hand (simulated realisations) outside the study region which interacts with the data during the estimation. These methods are applied to the so-called growth-interaction model. The specific choices of growth function and interaction function made are purely motivated by the forestry applications considered. The parameters of the growth and interaction functions, i.e. the parameters related to the development of the marks, are estimated using the least-squares approach together with the proposed edge corrections. Finally, the edge corrected estimation methods are applied to a data set of Swedish Scots pine.

4.

On a fast, robust estimator of the mode: Comparisons to other robust estimators with applications

David R. Bickel Rudolf Frühwirth 《Computational statistics & data analysis》2006,50(12):3500-3530

Advances in computing power enable more widespread use of the mode, which is a natural measure of central tendency since it is not influenced by the tails in the distribution. The properties of the half-sample mode, which is a simple and fast estimator of the mode of a continuous distribution, are studied. The half-sample mode is less sensitive to outliers than most other estimators of location, including many other low-bias estimators of the mode. Its breakdown point is one half, equal to that of the median. However, because of its finite rejection point, the half-sample mode is much less sensitive to outliers that are all either greater or less than the other values of the sample. This is confirmed by applying the mode estimator and the median to samples drawn from normal, lognormal, and Pareto distributions contaminated by outliers. It is also shown that the half-sample mode, in combination with a robust scale estimator, is a highly robust starting point for iterative robust location estimators such as Huber's M-estimator. The half-sample mode can easily be generalized to modal intervals containing more or less than half of the sample. An application of such an estimator to the finding of collision points in high-energy proton–proton interactions is presented.
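The half-sample mode described in this abstract can be sketched in a few lines. The version below follows the procedure as it is commonly stated (repeatedly keep the narrowest window containing half of the remaining sorted points, then resolve the last two or three points); the abstract itself does not spell out these steps, so the tie-handling and the function name `half_sample_mode` should be read as assumptions.

```python
import math


def half_sample_mode(values):
    """Estimate the mode by repeatedly shrinking to the densest half-sample."""
    x = sorted(values)
    if not x:
        raise ValueError("half_sample_mode requires at least one value")
    while len(x) > 3:
        h = math.ceil(len(x) / 2)  # size of the half-sample
        # start index of the narrowest window of h consecutive sorted points
        start = min(range(len(x) - h + 1), key=lambda i: x[i + h - 1] - x[i])
        x = x[start:start + h]
    if len(x) == 1:
        return x[0]
    if len(x) == 2:
        return (x[0] + x[1]) / 2
    # three points left: average the closer pair, or take the middle on a tie
    left_gap, right_gap = x[1] - x[0], x[2] - x[1]
    if left_gap < right_gap:
        return (x[0] + x[1]) / 2
    if right_gap < left_gap:
        return (x[1] + x[2]) / 2
    return x[1]


sample = [1.0, 1.1, 1.14, 1.3, 5.0, 9.0, 100.0]
estimate = half_sample_mode(sample)
```

In the usage example the three large values on the right never enter the retained window, which illustrates the insensitivity to one-sided outliers mentioned in the abstract.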

5.

Asymptotic properties of least squares estimation with fuzzy observations

Hae Kyung Kim Jin Hee Yoon Ying Li 《Information Sciences》2008,178(2):439-451

This paper deals with the asymptotic properties of the least squares estimators for fuzzy linear regression models with fuzzy triangular input-output and random error terms. The asymptotic normality and strong consistency of the fuzzy least squares estimator (FLSE) are investigated; a confidence region based on a class of FLSEs is proposed; the asymptotic relative efficiency of FLSEs with respect to the crisp least squares estimators is also provided and a numerical example is given. Some simulation results are also presented to illustrate the behavior of FLSEs.

6.

Multivariate discount weighted regression and local level models

Kostas Triantafyllopoulos 《Computational statistics & data analysis》2006,50(12):3702-3720

The technique of multivariate discount weighted regression is used for forecasting multivariate time series. In particular, the discount regression model is modified to cater for the popular local level model for predicting vector time series. The proposed methodology is illustrated with London Metal Exchange data consisting of aluminium spot and future contract closing prices. The estimate of the measurement noise covariance matrix suggests that these data exhibit high cross-correlation, which is discussed in some detail. The performance of the proposed model is evaluated via an error analysis based on the mean of squared forecast errors, the mean of absolute forecast errors and the mean of absolute percentage forecast errors. A sensitivity analysis shows that a low discount factor should be used and practical guidelines are given for general future use.

7.

Local polynomial estimation in partial linear regression models under dependence

G. Aneiros-Pérez J.M. Vilar-Fernández 《Computational statistics & data analysis》2008,52(5):2757-2777

A regression model whose regression function is the sum of a linear and a nonparametric component is presented. The design is random and the response and explanatory variables satisfy mixing conditions. A new local polynomial type estimator for the nonparametric component of the model is proposed and its asymptotic normality is obtained. Specifically, this estimator works on a prewhitening transformation of the dependent variable, and the results show that it is asymptotically more efficient than the conventional estimator (which works on the original dependent variable) when the errors of the model are autocorrelated. A simulation study and an application to a real data set give promising results.

8.

AREION: Software effort estimation based on multiple regressions with adaptive recursive data partitioning

《Information and Software Technology》2013,55(10):1710-1725

Context: Along with expert judgment, analogy-based estimation, and algorithmic methods (such as Function point analysis and COCOMO), Least Squares Regression (LSR) has been one of the most commonly studied software effort estimation methods. However, an effort estimation model using LSR, a single LSR model, is highly affected by the data distribution. Specifically, if the data set is scattered and the data do not sit closely on the single LSR model line (do not closely map to a linear structure) then the model usually shows poor performance. In order to overcome this drawback of the LSR model, a data partitioning-based approach can be considered as one of the solutions to alleviate the effect of data distribution. Even though clustering-based approaches have been introduced, they still have potential problems to provide accurate and stable effort estimates. Objective: In this paper, we propose a new data partitioning-based approach to achieve more accurate and stable effort estimates via LSR. This approach also provides an effort prediction interval that is useful to describe the uncertainty of the estimates. Method: Empirical experiments are performed to evaluate the performance of the proposed approach by comparing with the basic LSR approach and clustering-based approaches, based on industrial data sets (two subsets of the ISBSG (Release 9) data set and one industrial data set collected from a banking institution). Results: The experimental results show that the proposed approach not only improves the accuracy of effort estimation more significantly than that of other approaches, but it also achieves robust and stable results according to the degree of data partitioning. Conclusion: Compared with the other considered approaches, the proposed approach shows a superior performance by alleviating the effect of data distribution that is a major practical issue in software effort estimation.

9.

A geometric approach for adaptive estimation of unknown growth kinetics in bioreactors

《Journal of Process Control》2014,24(10):1496-1503

This paper proposes a new approach for the estimation of an unknown and time-varying specific growth rate in fed-batch bioprocesses. A novel adaptive estimation technique based on the concept of invariant manifold is proposed as an effective approach to estimate growth kinetic parameters. An asymptotic nonlinear observer is used to provide simultaneous on-line estimation of biomass concentration and growth kinetics. The method is easy to implement and requires only one tuning parameter. The effectiveness of the proposed algorithm is illustrated with representative bioreactor simulation examples.

10.

Matrix strategies for computing the least trimmed squares estimation of the general linear and SUR models

Marc Hofmann Erricos John Kontoghiorghes 《Computational statistics & data analysis》2010,54(12):3392-3403

An algorithm for computing the exact least trimmed squares (LTS) estimator of the standard regression model has recently been proposed. The LTS algorithm is adapted to the general linear and seemingly unrelated regressions models with possible singular dispersion matrices. It searches through a regression tree to find the optimal estimates and has combinatorial complexity. The model is formulated as a generalized linear least squares problem. Efficient matrix techniques are employed to update the generalized residual sum of squares of a subset model. Specifically, the new algorithm utilizes previous computations to update a generalized QR decomposition by a single row. The sparse structure of the model is exploited. Theoretical measures of computational complexity are provided. Experimental results confirm the ability of the new algorithms to identify outlying observations.

11.

The theoretic framework of local weighted approximation for microarray missing value estimation

Chao-Chun Liu Dao-Qing Dai Hong Yan 《Pattern recognition》2010,43(8):2993-3002

Microarray data are used in many biomedical experiments. They often contain missing values which significantly affect statistical algorithms. Although a number of imputation algorithms have been proposed, they are limited in how effectively they exploit local and global information for estimation. It is necessary to develop more effective techniques to solve the data imputation problem. In this paper, we propose a theoretic framework of local weighted approximation for missing value estimation, based on the Taylor series approximation. Besides revealing that k-nearest neighbor imputation (KNNimpute) is a special case of the framework, we focus on the study of its linear case, local weighted linear approximation imputation (LWLAimpute), from theory to experiment. Experimental results show that LWLAimpute and its iterative version can achieve better performance than some existing imputation methods; the superiority becomes more significant as the level of missing values increases.
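As a rough illustration of the k-nearest neighbor imputation that this abstract treats as a special case of its framework, the sketch below estimates one missing entry as a weighted average over the k most similar rows. It does not implement LWLAimpute. The inverse-distance weights, the root-mean-square distance over shared columns, the use of `None` for missing entries, and the function name `knn_impute` are all assumptions chosen for the example.

```python
import math


def knn_impute(rows, target, col, k=2):
    """Estimate rows[target][col] from the k nearest rows that observe col."""
    candidates = []
    for i, row in enumerate(rows):
        if i == target or row[col] is None:
            continue
        # compare only columns observed in both rows, excluding the target column
        shared = [
            (a, b)
            for j, (a, b) in enumerate(zip(rows[target], row))
            if j != col and a is not None and b is not None
        ]
        if not shared:
            continue
        dist = math.sqrt(sum((a - b) ** 2 for a, b in shared) / len(shared))
        candidates.append((dist, row[col]))
    if not candidates:
        raise ValueError("no usable neighbours for this entry")
    candidates.sort(key=lambda c: c[0])
    nearest = candidates[:k]
    # inverse-distance weights (an assumption); the small constant avoids 1/0
    weights = [1.0 / (d + 1e-12) for d, _ in nearest]
    total = sum(w * v for w, (_, v) in zip(weights, nearest))
    return total / sum(weights)


expression = [
    [1.0, 2.0, None],   # row with the missing value
    [1.0, 2.0, 3.0],
    [1.1, 2.1, 3.2],
    [9.0, 9.0, 9.0],
]
filled = knn_impute(expression, target=0, col=2, k=2)
```

The distant fourth row is never selected with k = 2, so the estimate stays between the values of the two similar rows.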

12.

Spreadsheet template approach for nonlinear regression estimation

W.R. Terry K.W. Cutright W.J. Herald 《Computers & Industrial Engineering》1986,11(1-4):335-339

Industrial Engineers may encounter variables which are nonlinearly related such that the relationship cannot be transformed to one which is linear in the unknown parameters. One instance where this can occur is a learning curve which approaches a non-zero asymptote. This paper presents a spreadsheet template for implementing the Gauss-Newton Method for solving such nonlinear regression estimation problems.
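The Gauss-Newton iteration mentioned in this abstract can be sketched outside a spreadsheet as follows. The model form y = a + b·exp(−c·t), a curve that decays toward the non-zero asymptote a, is an assumed example and not necessarily the learning curve used in the paper; the step-halving safeguard and all function names are likewise illustrative additions.

```python
import math


def model(p, t):
    a, b, c = p
    return a + b * math.exp(-c * t)


def jacobian_row(p, t):
    """Partial derivatives of the model with respect to a, b, c."""
    _, b, c = p
    e = math.exp(-c * t)
    return [1.0, e, -b * t * e]


def solve(A, rhs):
    """Solve a small dense linear system by Gaussian elimination with pivoting."""
    n = len(rhs)
    M = [row[:] + [rhs[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= f * M[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = M[i][n] - sum(M[i][k] * x[k] for k in range(i + 1, n))
        x[i] = s / M[i][i]
    return x


def sse(p, ts, ys):
    return sum((y - model(p, t)) ** 2 for t, y in zip(ts, ys))


def gauss_newton(ts, ys, p0, iters=50, tol=1e-12):
    p = list(p0)
    for _ in range(iters):
        r = [y - model(p, t) for t, y in zip(ts, ys)]
        J = [jacobian_row(p, t) for t in ts]
        n = len(p)
        JtJ = [[sum(row[i] * row[j] for row in J) for j in range(n)] for i in range(n)]
        Jtr = [sum(row[i] * ri for row, ri in zip(J, r)) for i in range(n)]
        delta = solve(JtJ, Jtr)  # normal equations: (J'J) delta = J'r
        # halve the step while it would increase the sum of squared errors
        step, current = 1.0, sse(p, ts, ys)
        while step > 1e-8 and sse([pi + step * d for pi, d in zip(p, delta)], ts, ys) > current:
            step /= 2
        p = [pi + step * d for pi, d in zip(p, delta)]
        if max(abs(step * d) for d in delta) < tol:
            break
    return p


ts = list(range(10))
ys = [2 + 8 * math.exp(-0.5 * t) for t in ts]  # noise-free synthetic data
a, b, c = gauss_newton(ts, ys, [1.0, 6.0, 0.4])
```

Plain Gauss-Newton can diverge from a poor starting point, which is why the sketch halves the step whenever the fit would get worse; a spreadsheet version would typically rely on a sensible initial guess instead.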

13.

A continuous-time framework for least squares parameter estimation

《Automatica》2014,50(12):3276-3280

This paper proposes a continuous-time framework for the least-squares parameter estimation method through evolution equations. Nonlinear systems in the standard state space representation that are linear in the unknown, constant parameters are investigated. Two estimators are studied. The first one consists of a linear evolution equation while the second one consists of an impulsive linear evolution equation. The paper discusses some theoretical aspects related to the proposed estimators: uniqueness of a solution and an attractive equilibrium point which solves for the unknown parameters. A deterministic framework for the estimation under noisy measurements is proposed using a Sobolev space with negative index to model the noise. The noise can be of large magnitude. Concrete signals issued from an electronic device are used to discuss numerical aspects.

14.

A weighted quantile regression for randomly truncated data

Weihua Zhou 《Computational statistics & data analysis》2011,55(1):554-566

Quantile regression offers great flexibility in assessing covariate effects on the response. In this article, based on the weights proposed by He and Yang (2003), we develop a new quantile regression approach for left truncated data. Our method leads to a simple algorithm that can be conveniently implemented with R software. It is shown that the proposed estimator is strongly consistent and asymptotically normal under appropriate conditions. We evaluate the finite sample performance of the proposed estimators through extensive simulation studies.
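The weighted check-loss idea behind this abstract can be shown in its simplest setting, a single weighted quantile with no covariates. The sketch below is not the paper's estimator: the truncation-adjusting weights of He and Yang (2003) are not reproduced, so the weights are simply supplied by the caller, and the function names are hypothetical.

```python
def check_loss(u, tau):
    """Quantile check function: tau * u for u >= 0, (tau - 1) * u for u < 0."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def weighted_quantile(ys, ws, tau):
    """Minimise the weighted check loss over the observed values.

    A minimiser of sum_i w_i * rho_tau(y_i - q) can always be found at a
    data point, so scanning the sorted sample is sufficient for small n.
    """
    def total_loss(q):
        return sum(w * check_loss(y - q, tau) for y, w in zip(ys, ws))

    return min(sorted(ys), key=total_loss)


ys = [1, 2, 3, 4, 5]
plain_median = weighted_quantile(ys, [1, 1, 1, 1, 1], tau=0.5)
upweighted = weighted_quantile(ys, [1, 1, 1, 1, 10], tau=0.5)
```

Giving the last observation ten times the weight of the others moves the weighted median to that observation, which is the mechanism by which observation weights can compensate for under-represented parts of the sample.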

15.

Using symmetry in robust model fitting (total citations: 1; self-citations: 0; citations by others: 1)

Hanzi Wang David Suter 《Pattern recognition letters》2003,24(16):2953-2966

The pattern recognition and computer vision communities often employ robust methods for model fitting. In particular, high breakdown-point methods such as least median of squares (LMedS) and least trimmed squares (LTS) have often been used in situations where the data are contaminated with outliers. However, though the breakdown point of these methods can be as high as 50% (they can be robust to up to 50% contamination), they can break down at unexpectedly lower percentages when the outliers are clustered. In this paper, we demonstrate the fragility of LMedS and LTS and analyze the reasons that cause the fragility of these methods in the situation when a large percentage of clustered outliers exist in the data. We adapt the concept of “symmetry distance” to formulate an improved regression method, called the least trimmed symmetry distance (LTSD). Experimental results are presented to show that the LTSD performs better than LMedS and LTS under a large percentage of clustered outliers and large standard variance of inliers.

16.

Analysis of the Kalman filter based estimation algorithm: an orthogonal decomposition approach

Liyu Cao 《Automatica》2004,40(1):5-19

In this paper we provide a new analysis of some fundamental properties of Kalman filter based parameter estimation algorithms, using an orthogonal decomposition approach based on the excited subspace. A theoretical analytical framework is established based on the decomposition of the covariance matrix, which appears to be very useful and effective in the analysis of a parameter estimation algorithm in the presence of an unexcited subspace. The necessary and sufficient condition for the boundedness of the covariance matrix in the Kalman filter is established. The idea of directional tracking is proposed to develop a new class of algorithms to overcome the windup problem. Based on the orthogonal decomposition approach, two kinds of directional tracking algorithms are proposed. These algorithms utilize a time-varying covariance matrix and can remain stable even in the case of insufficient and/or unbounded excitation.

17.

Outliers, inliers and the generalized least trimmed squares estimator in system identification

Erwei BAI 《Journal of Control Theory and Applications》2003,1(1):17-27

The least trimmed squares (LTS) estimator is a well-known robust estimator in terms of protecting the estimate from outliers. Its high computational complexity is, however, a problem in practice. We show that the LTS estimate can be obtained by a simple algorithm with complexity O(N ln N) for large N, where N is the number of measurements. We also show that although the LTS is robust to outliers, it is sensitive to inliers. The concept of inliers is introduced. Moreover, the generalized least trimmed squares (GLTS) estimator is presented together with its solution; it reduces the effect of both outliers and inliers.

18.

Combined parameter and output estimation of dual-rate systems using an auxiliary model (total citations: 12; self-citations: 0; citations by others: 12)

Feng Ding Tongwen Chen 《Automatica》2004,40(10):1739-1748

For a dual-rate sampled-data system, an auxiliary model based identification algorithm for combined parameter and output estimation is proposed. The basic idea is to use an auxiliary model to estimate the unknown noise-free output (true output) of the system, and to identify the parameters of the underlying fast single-rate model directly from the dual-rate input-output data. It is shown that the parameter estimation error consistently converges to zero under generalized or weak persistent excitation conditions and unbounded noise variance, and that the output estimates uniformly converge to the true outputs. An example is included.

19.

Optimal multilinear estimation of a random vector under constraints of causality and limited memory

P.G. Howlett 《Computational statistics & data analysis》2007,52(2):869-878

A new technique is provided for random vector estimation from noisy data under the constraints that the estimator is causal and dependent on at most a finite number p of observations. Nonlinear estimators defined by multilinear operators of degree r are employed, the choice of r allowing a trade-off between the accuracy of the optimal filter and the complexity of the calculations. The techniques utilise an exact correspondence between the nonlinear problem and a linear one. This is then solved by a new procedure, the least squares singular pivot algorithm, whereby the linear problem can be repeatedly reduced to smaller structurally similar problems. Invertibility of the relevant covariance matrices is not assumed. Numerical experiments with real data are used to illustrate the efficacy of the new algorithm.

20.

Outliers, inliers and the generalized least trimmed squares estimator in system identification

Erwei BAI 《Journal of Control Theory and Applications》2003,1(1):17-27

The least trimmed squares (LTS) estimator is a well-known robust estimator in terms of protecting the estimate from outliers. Its high computational complexity is, however, a problem in practice. We show that the LTS estimate can be obtained by a simple algorithm with complexity O(N ln N) for large N, where N is the number of measurements. We also show that although the LTS is robust to outliers, it is sensitive to inliers. The concept of inliers is introduced. Moreover, the generalized least trimmed squares (GLTS) estimator is presented together with its solution; it reduces the effect of both outliers and inliers.