Similar literature
 20 similar articles found
1.
A robust method is presented for computing rotation angles of image sequences from a set of corresponding points containing outliers. Assuming a known rotation axis, a least-squares (LS) solution is derived to compute the rotation angle from a clean data set of point correspondences. Since clean data are not guaranteed, we introduce a robust solution, based on the M-estimator, to deal with outliers. We then present an enhanced robust algorithm, called the annealing M-estimator (AM-estimator), for reliable robust estimation. The AM-estimator has several advantages over the traditional M-estimator. By definition, it involves neither a scale estimator nor free parameters and hence avoids the instabilities they cause. Algorithmically, it uses a deterministic annealing technique to approximate the global solution regardless of the initialization. Experimental results compare the performance of the LS, M- and AM-estimators for angle estimation. They show that, in the presence of outliers, the M-estimator outperforms the LS estimator and the AM-estimator outperforms the M-estimator.
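The contrast between the LS and annealed robust estimates can be illustrated with a minimal sketch. It assumes planar points rotated about the z-axis, for which the weighted LS angle has the closed form atan2(S, C). The weight function 1/(1 + r²/γ), the geometric `shrink` schedule and the `gamma_min` stopping value are illustrative assumptions, not the AM-estimator as defined in the paper (which the abstract describes as free of tuning parameters).

```python
import math

def weighted_ls_angle(src, dst, weights):
    """Weighted least-squares rotation angle (radians) mapping src onto dst in the plane."""
    s = c = 0.0
    for (x, y), (u, v), w in zip(src, dst, weights):
        s += w * (x * v - y * u)  # cross-product term
        c += w * (x * u + y * v)  # dot-product term
    return math.atan2(s, c)

def squared_residuals(src, dst, theta):
    """Squared distance between each rotated source point and its correspondence."""
    ct, st = math.cos(theta), math.sin(theta)
    return [(ct * x - st * y - u) ** 2 + (st * x + ct * y - v) ** 2
            for (x, y), (u, v) in zip(src, dst)]

def annealed_angle(src, dst, shrink=0.7, gamma_min=1e-3):
    """Reweighted LS with an annealed scale gamma (hypothetical schedule).

    A large gamma gives nearly uniform weights (close to plain LS); as gamma
    shrinks, points with large residuals lose influence.
    """
    theta = weighted_ls_angle(src, dst, [1.0] * len(src))
    gamma = max(squared_residuals(src, dst, theta))
    while gamma > gamma_min:
        r2 = squared_residuals(src, dst, theta)
        weights = [1.0 / (1.0 + r / gamma) for r in r2]
        theta = weighted_ls_angle(src, dst, weights)
        gamma *= shrink
    return theta
```

Starting from the LS solution with a large γ and shrinking it gradually is what makes the sketch insensitive to the starting point in simple cases. A fixed small γ from the outset would behave like an ordinary redescending M-estimator and could settle in a poor local minimum.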

2.
For the first time, a five-parameter distribution, the so-called beta Burr XII distribution, is defined and investigated. The new distribution contains as special sub-models some well-known distributions discussed in the literature, such as the logistic, Weibull and Burr XII distributions, among several others. We derive its moment generating function. We obtain, as a special case, the moment generating function of the Burr XII distribution, which seems to be a new result. Moments, mean deviations, Bonferroni and Lorenz curves and reliability are provided. We derive two representations for the moments of the order statistics. The method of maximum likelihood and a Bayesian analysis are proposed for estimating the model parameters. The observed information matrix is obtained. For different parameter settings and sample sizes, various simulation studies are performed and compared in order to study the performance of the new distribution. An application to real data demonstrates that the new distribution can provide a better fit than other classical models. We hope that this generalization may attract wider applications in reliability, biology and lifetime data analysis.

3.
This study focuses on clustering algorithms for data on the unit hypersphere. Directional data of this type, lying on the surface of a unit hypersphere, are used in geology, biology, meteorology, medicine and oceanography. The EM algorithm with mixtures of von Mises-Fisher distributions is often used for model-based clustering of data on the unit hypersphere. However, the EM algorithm is sensitive to initial values and outliers, and the number of clusters must be assigned a priori. In this paper, we propose an effective approach, called a learning-based EM algorithm with von Mises-Fisher distributions, to cluster this type of hyperspherical data. The proposed clustering method is robust to outliers, needs no initialization, and automatically determines the number of clusters. It is thus a fully unsupervised model-based clustering method for data on the unit hypersphere. Numerical and real examples with comparisons are given to demonstrate the effectiveness and superiority of the proposed method. We also apply the proposed learning-based EM algorithm to cluster exoplanet data. The clustering results have several important implications for exoplanet data and allow an interpretation of exoplanet migration.

4.
The annealing robust backpropagation (ARBP) learning algorithm   Total citations: 2, self-citations: 0, citations by others: 2
Multilayer feedforward neural networks are often referred to as universal approximators. Nevertheless, if the training data are corrupted by large noise, such as outliers, traditional backpropagation learning schemes may not achieve acceptable performance. Even though various robust learning algorithms have been proposed in the literature, those approaches still suffer from the initialization problem. Those robust learning algorithms employ the so-called M-estimator. In M-estimation-type learning algorithms, the loss function is used to discriminate outliers from the majority of the data by reducing the effect of those outliers on learning. However, the loss function used in those algorithms may not discriminate against the outliers correctly. In this paper, the annealing robust backpropagation (ARBP) learning algorithm, which adopts the annealing concept into robust learning algorithms, is proposed to deal with the problem of modeling in the presence of outliers. The proposed algorithm has been applied to various examples. The results all demonstrate its superiority over other robust learning algorithms, independent of the outliers. Besides adopting the annealing concept into robust learning algorithms, the paper also finds experimentally that the annealing schedule k/t achieves the best performance among the annealing schedules considered, where k is a constant and t is the epoch number.

5.
In the presence of a heavy-tailed noise distribution, regression becomes much more difficult. Traditional robust regression methods assume that the noise distribution is symmetric, and they downweight the influence of so-called outliers. When the noise distribution is asymmetric, these methods yield biased regression estimators. Motivated by data-mining problems for the insurance industry, we propose a new approach to robust regression tailored to asymmetric noise distributions. The main idea is to learn most of the parameters of the model using conditional quantile estimators (which are biased but robust estimators of the regression) and to learn a few remaining parameters to combine and correct these estimators, so as to minimize the average squared error in an unbiased way. Theoretical analysis and experiments show the clear advantages of the approach. Results are reported on artificial data as well as insurance data, using both linear and neural network predictors.

6.
Advances in computing power enable more widespread use of the mode, which is a natural measure of central tendency since it is not influenced by the tails in the distribution. The properties of the half-sample mode, which is a simple and fast estimator of the mode of a continuous distribution, are studied. The half-sample mode is less sensitive to outliers than most other estimators of location, including many other low-bias estimators of the mode. Its breakdown point is one half, equal to that of the median. However, because of its finite rejection point, the half-sample mode is much less sensitive to outliers that are all either greater or less than the other values of the sample. This is confirmed by applying the mode estimator and the median to samples drawn from normal, lognormal, and Pareto distributions contaminated by outliers. It is also shown that the half-sample mode, in combination with a robust scale estimator, is a highly robust starting point for iterative robust location estimators such as Huber's M-estimator. The half-sample mode can easily be generalized to modal intervals containing more or less than half of the sample. An application of such an estimator to the finding of collision points in high-energy proton–proton interactions is presented.
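A minimal sketch of a half-sample mode estimator follows. It uses the commonly described recursion: repeatedly keep the shortest interval containing half of the remaining points. The handling of samples of one, two or three points and the tie-breaking (first shortest interval wins) are assumptions of this sketch, since the abstract does not spell them out.

```python
import math

def half_sample_mode(data):
    """Estimate the mode by repeatedly keeping the densest half of the sorted sample."""
    x = sorted(data)
    if not x:
        raise ValueError("half_sample_mode requires at least one value")
    while len(x) > 3:
        half = math.ceil(len(x) / 2)
        # Start index of the shortest window that contains `half` points.
        start = min(range(len(x) - half + 1),
                    key=lambda j: x[j + half - 1] - x[j])
        x = x[start:start + half]
    if len(x) == 1:
        return x[0]
    if len(x) == 2:
        return (x[0] + x[1]) / 2
    # Three points left: average the closer pair, or take the middle on a tie.
    left, right = x[1] - x[0], x[2] - x[1]
    if left < right:
        return (x[0] + x[1]) / 2
    if right < left:
        return (x[1] + x[2]) / 2
    return x[1]
```

For example, `half_sample_mode([1, 20, 21, 22, 24, 90, 500, 1000])` first keeps `[20, 21, 22, 24]`, then `[20, 21]`, and returns 20.5. The three large values never enter the kept window, which illustrates the finite rejection point mentioned above.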

7.
An adaptive robust M-estimator for nonparametric nonlinear system identification is proposed. This M-estimator is optimal over a broad class of distributions in the sense of maximum likelihood estimation. The error distributions are described by the generalized exponential distribution family. The estimator is combined with nonparametric regression techniques to form a powerful procedure for nonlinear system identification. The performance of the adaptive procedure is illustrated in a Monte Carlo study by comparing its results with those of previous methods.

8.
In this paper we propose an extension of the three-parameter Burr III distribution, motivated by both theoretical and practical considerations. The research is motivated by low-flow frequency analysis in water resources research. Three commonly used parameter estimation methods are evaluated: the method of moments, probability-weighted moments (or L-moments) and maximum likelihood. Computational issues in the parameter estimation are also discussed. The performance of the proposed distribution is examined using a simulation study and real data from a large number of catchments in Australia.

9.
This correspondence introduces a new orthogonal forward regression (OFR) model identification algorithm that uses D-optimality for model structure selection and is based on M-estimators of the parameters. The M-estimator is a classical robust parameter estimation technique for tackling bad data conditions such as outliers. Computationally, the M-estimator can be derived using an iteratively reweighted least squares (IRLS) algorithm. D-optimality is a model structure robustness criterion in experimental design for tackling ill-conditioning in model structure. OFR, often based on the modified Gram-Schmidt procedure, is an efficient method that performs structure selection and parameter estimation simultaneously. The basic idea of the proposed approach is to incorporate an IRLS inner loop into the modified Gram-Schmidt procedure. In this manner, the OFR algorithm for parsimonious model structure determination is extended to bad data conditions, with improved performance obtained through parameter M-estimators that are inherently robust to outliers. Numerical examples are included to demonstrate the effectiveness of the proposed algorithm.
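The IRLS computation of an M-estimator can be sketched for the simplest case, a straight-line fit with a fixed model structure. This is only the reweighting loop, not the OFR algorithm with D-optimality described above. The Huber weight function, the tuning constant 1.345 and the MAD-based scale estimate are conventional choices assumed here, not details taken from the paper.

```python
import statistics

def weighted_line_fit(xs, ys, ws):
    """Weighted least-squares fit of y = a + b*x; returns (a, b)."""
    sw = sum(ws)
    mx = sum(w * x for w, x in zip(ws, xs)) / sw
    my = sum(w * y for w, y in zip(ws, ys)) / sw
    sxx = sum(w * (x - mx) ** 2 for w, x in zip(ws, xs))
    sxy = sum(w * (x - mx) * (y - my) for w, x, y in zip(ws, xs, ys))
    b = sxy / sxx
    return my - b * mx, b

def huber_irls(xs, ys, c=1.345, iters=50, tol=1e-10):
    """Huber M-estimate of a line via iteratively reweighted least squares."""
    a, b = weighted_line_fit(xs, ys, [1.0] * len(xs))  # start from ordinary LS
    for _ in range(iters):
        res = [y - (a + b * x) for x, y in zip(xs, ys)]
        # Robust scale from the median absolute residual (assumed choice).
        scale = statistics.median([abs(r) for r in res]) / 0.6745
        if scale == 0:
            break  # at least half of the points are fitted exactly
        # Huber weights: 1 inside the threshold, decaying as 1/|r| outside it.
        ws = [1.0 if abs(r) <= c * scale else c * scale / abs(r) for r in res]
        a_new, b_new = weighted_line_fit(xs, ys, ws)
        converged = abs(a_new - a) + abs(b_new - b) < tol
        a, b = a_new, b_new
        if converged:
            break
    return a, b
```

With ten points on the line y = 1 + 2x and the last response replaced by 100, the ordinary LS slope is pulled far above 2, while the reweighted fit returns close to the underlying line because the outlier's weight shrinks at each pass.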

10.
An adaptive trimmed mean estimator for symmetric and asymmetric distributions has been developed. Simulation sample sizes as small as n = 10, 25 and 50 are chosen, since the influence of outliers increases in small samples. The proposed estimator is compared with the Princeton Robust Study. The results show that the proposed mean estimator has a lower standard deviation for symmetric distributions and gives a better location estimate for asymmetric distributions.
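For orientation, a plain (non-adaptive) trimmed mean can be sketched as below. The abstract does not state how the trimming proportion is adapted to the sample, so the fixed `proportion` parameter is an assumption of this sketch and stands in for whatever adaptive rule the paper uses.

```python
def trimmed_mean(data, proportion=0.1):
    """Mean after discarding the given proportion of values from each tail."""
    if not 0 <= proportion < 0.5:
        raise ValueError("proportion must be in [0, 0.5)")
    x = sorted(data)
    if not x:
        raise ValueError("trimmed_mean requires at least one value")
    k = int(proportion * len(x))  # number of values dropped from each end
    kept = x[k:len(x) - k]
    return sum(kept) / len(kept)
```

On the sample `[1, 2, 3, 4, 5, 6, 7, 8, 9, 1000]`, trimming 10% from each tail drops 1 and 1000 and gives 5.5, whereas the untrimmed mean is 104.5.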

11.
In survival analysis applications, the failure rate function may frequently present a unimodal shape. In such cases, the log-normal or log-logistic distributions are used. In this paper, we are concerned only with parametric forms, so a location-scale regression model based on the Burr XII distribution is proposed for modeling data with a unimodal failure rate function as an alternative to the log-logistic regression model. Assuming censored data, we consider a classical analysis, a Bayesian analysis and a jackknife estimator for the parameters of the proposed model. For different parameter settings, sample sizes and censoring percentages, various simulation studies are performed and compared with the performance of the log-logistic and log-Burr XII regression models. In addition, we use sensitivity analysis to detect influential or outlying observations, and residual analysis to check the assumptions of the model. Finally, we analyze a real data set under log-Burr XII regression models.

12.
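郭威, 徐涛. 《控制与决策》 (Control and Decision), 2023, 38(4): 1039-1046
The broad learning system (BLS) is a recently proposed, accurate and efficient machine learning algorithm that has shown superior learning performance in classification, regression and other problems. However, the conventional BLS uses least squares as its learning criterion, so it is easily disturbed by outliers and may produce an inaccurate model. To address this, a robust broad learning system (RBLS) based on the M-estimator is proposed. Unlike BLS, RBLS replaces the conventional least-squares cost function with a robust M-estimator cost function, and solves the resulting optimization problem with the method of Lagrange multipliers and iteratively reweighted least squares. During iterative learning, normal samples and outlier samples are assigned different weights that vary inversely with the size of their training errors, which effectively suppresses or eliminates the adverse effect of outlier errors on the learned model. Experimental results show that, as a unified robust learning framework, RBLS can incorporate different M-estimator weighting strategies and achieves better generalization performance and robustness.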

13.
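Outlier detection aims to identify individuals in a data set that differ markedly from the other samples, so as to detect anomalies or abnormal states in the data. Existing methods struggle to handle complex, nonlinearly distributed data, and face problems of parameter sensitivity and diversity of data distributions. To address this, a new graph structure, the adaptive neighbor graph, is proposed. It is edge-oriented, extracts features from the data iteratively, and computes a nearest-neighbor reachability measure to identify outliers, which reduces the influence of parameters and makes the method applicable to data with different types of distribution. To validate its performance, the method is compared with other methods on multiple synthetic and real data sets. Experimental results show that the method ranks first on average across all 19 data sets, and remains stable while maintaining high accuracy.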

14.
In real-world applications, the data obtained are always subject to noise or outliers. The learning mechanism of the cerebellar model articulation controller (CMAC), a neurological model, imitates the human cerebellum. CMAC has an attractive learning speed, because a small subset of cells addressed by the input determines the output instantaneously. In the fuzzy cerebellar model articulation controller (FCMAC), the concept of fuzziness is incorporated into CMAC to improve accuracy. However, the distribution of errors into the addressed hypercubes may cause unacceptable learning performance for input data with noise or outliers. In the robust fuzzy cerebellar model articulation controller (RFCMAC), the robust learning of the M-estimator is embedded into FCMAC to reduce the effect of noise or outliers. Meanwhile, support vector regression (SVR) is an algorithm based on machine learning theory that has been applied successfully to a number of regression problems in which noise or outliers exist. Unfortunately, the practical application of SVR is limited by the need for the user to define a set of parameters in order to obtain good performance. In this paper, a robust learning algorithm based on SVR and RFCMAC is proposed. The proposed algorithm has both the advantage of SVR, the ability to avoid corruption effects, and the advantage of RFCMAC, attractive learning performance and more accurate approximation. Additionally, particle swarm optimization (PSO) is applied to obtain the best parameter settings for SVR. Simulation results show that the proposed algorithm outperforms other algorithms.

15.
Measurement error models often arise in epidemiological and clinical research. Usually, in this setup, it is assumed that the latent variable has a normal distribution. However, the normality assumption may not always be correct. The skew-normal/independent distributions are a class of asymmetric thick-tailed distributions that includes the skew-normal distribution as a special case. In this paper, we explore the use of skew-normal/independent distributions as a robust alternative in the null intercept measurement error model under a Bayesian paradigm. We assume that the random errors and the unobserved value of the covariate (latent variable) jointly follow a skew-normal/independent distribution, providing an appealing robust alternative to the routine use of the symmetric normal distribution in this type of model. Specific distributions examined include univariate and multivariate versions of the skew-normal, skew-t, skew-slash and skew contaminated normal distributions. The methods developed are illustrated using a real data set from a dental clinical trial.

16.
The Gaussian quasi-maximum likelihood estimator of Multivariate GARCH models is shown to be very sensitive to outliers in the data. A class of robust M-estimators for MGARCH models is developed. To increase the robustness of the estimators, the use of volatility models with the property of bounded innovation propagation is recommended. The Monte Carlo study and an empirical application to stock returns document the good robustness properties of the M-estimator with a fat-tailed Student t loss function.

17.
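In machine learning theory and applications, feature selection is one of the common methods for reducing the dimensionality of high-dimensional data. Most traditional feature selection methods are designed for complete data sets, and the case of missing data, which is common in practice, has received little attention. Since incomplete data contain unobserved information and outliers, a robust feature selection method based on probabilistic matrix factorization is proposed. A cluster-based probabilistic matrix factorization model is used to approximate the missing values in the data set, so as to measure effectively the similarity between data in adjacent clusters, reduce the problem size and reduce the imputation error. Because a small number of outliers exist among the missing data values, feature selection is performed with a method based on the l2,1 loss function. On this basis, the feature selection procedure for incomplete data sets is given and its convergence is analyzed theoretically. The method uses all the information in the incomplete data set and copes effectively with the effect of outliers in it. Experimental results show that, compared with traditional feature selection methods, the method selects fewer irrelevant features on synthetic data sets and reduces the effect of outliers, and on real data sets it achieves higher classification accuracy and selects more accurate features.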

18.
The present paper develops an outlier model suitable for problems in which identification of outliers is essential; applied areas of statistics abound with such examples. One of the peculiarities of outliers in survey sampling is that there could be observed as well as unobserved outliers; the paper assumes that there are no unobserved outliers. We use a generalized linear model (GLM) with higher variances for the outlying units. Count data are treated through the overdispersed GLM of Gelfand and Dalal (1990). Error components of the link function are assumed to have scale mixtures of normal distributions. The framework covers both standard survey sampling and small area estimation problems. The number as well as the set of outliers are assumed to be unknown. The joint posterior distribution is found using the reversible jump Markov chain and Metropolis-Hastings algorithm. We also use properties of the deviance function of the GLM (West, 1985) for posterior computations. The basic framework is extended to various models appropriate in survey sampling, such as double sampling and conditional autoregressive models. The method is illustrated using the leukaemia patients data of Cox and Snell (1981), Scottish lip cancer data, Missouri lung cancer data and Baltimore census data.

19.
The Grubbs measurement model is frequently used to compare several measuring devices. It is common to assume that the random terms have a normal distribution. However, such an assumption makes the inference vulnerable to outlying observations, whereas scale mixtures of normal distributions are an interesting alternative that produces robust estimates while keeping the elegance and simplicity of maximum likelihood theory. The aim of this paper is to develop an EM-type algorithm for parameter estimation, and to use the local influence method to assess the robustness of these parameter estimates under some usual perturbation schemes. In order to identify outliers and to criticize the model building, we use the local influence procedure in a study comparing the precision of several thermocouples.

20.
To analyze the behavior of a mechanical system accurately, the input variables must be modeled statistically by identifying their probability distributions. These distributions are generally determined by applying goodness-of-fit (GOF) tests or model selection methods to the given data on the input variables. However, GOF tests only accept or reject the hypothesis that a candidate distribution is appropriate to represent the given data. Model selection methods determine the best-fit distribution for the given data among various candidate distributions, but provide no information about the adequacy of using the identified distribution to represent the data. Therefore, in this paper, a sequential statistical modeling (SSM) method is proposed. The SSM method uses a GOF test to select appropriate candidate distributions from among all possible distributions and then identifies the best-fit distribution from among the selected candidates using a model selection method. The adequacy of the identified best-fit distribution is verified using an area metric that measures the intersection area between the probability density function (PDF) of the best-fit distribution and the data distribution. This metric can also be used to analyze the similarities between the PDFs of the candidate distributions. In statistical simulation tests, the SSM method identified correct distributions more accurately and conservatively than the GOF tests or model selection methods alone.
