Similar Literature
20 similar documents found.
1.
Five robustifications of L2 boosting for linear regression with various robustness properties are considered. The first two use the Huber loss as the implementing loss function for boosting, and the next two use robust simple linear regression for the fitting in L2 boosting (i.e. robust base learners). Both concepts can be applied with or without down-weighting of leverage points. Our last method uses robust correlation estimates and appears to be the most robust. Crucial advantages of all methods are that they do not compute covariance matrices of all covariates and that they do not have to identify multivariate leverage points. When there are no outliers, the robust methods are only slightly worse than L2 boosting. In the contaminated case, though, the robust methods outperform L2 boosting by a large margin. Some of the robustifications are also computationally highly efficient and therefore well suited for truly high-dimensional problems.
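The Huber loss used as the implementing loss function above can be sketched as follows (a minimal illustration, not the authors' implementation; the threshold `delta` is a free tuning parameter):

```python
import numpy as np

def huber_loss(r, delta=1.345):
    """Huber loss: quadratic for |r| <= delta, linear beyond.

    Large residuals (outliers) contribute only linearly, which
    bounds their influence on each boosting step's fit."""
    r = np.asarray(r, dtype=float)
    return np.where(np.abs(r) <= delta,
                    0.5 * r ** 2,
                    delta * (np.abs(r) - 0.5 * delta))

print(huber_loss([0.5, 10.0], delta=1.0))  # quadratic vs. linear regime
```

The quadratic/linear switch is what keeps gradient steps bounded for gross outliers while behaving like least squares near zero.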

2.
This study proposes a hybrid robust approach for constructing Takagi–Sugeno–Kang (TSK) fuzzy models with outliers. The approach consists of a robust fuzzy C-regression model (RFCRM) clustering algorithm in the coarse-tuning phase and an annealing robust back-propagation (ARBP) learning algorithm in the fine-tuning phase. The RFCRM clustering algorithm modifies the fuzzy C-regression models (FCRM) clustering algorithm by incorporating a robust mechanism, the input data distribution, and a robust similarity measure. Owing to the robust mechanism and the consideration of the input data distribution, the fuzzy subspaces and the parameters of the consequent-part functions are identified simultaneously by the proposed RFCRM clustering algorithm, and the obtained model is not significantly affected by outliers. Furthermore, the robust similarity measure is used in the clustering process to reduce redundant clusters. Consequently, the RFCRM clustering algorithm generates a better initialization for the TSK fuzzy model in the coarse-tuning phase. An ARBP algorithm is then employed to obtain a more precise model in the fine-tuning phase. Our simulation results clearly show that the proposed robust TSK fuzzy modeling approach is superior to existing approaches in both learning speed and approximation accuracy.

3.
Support vector regression (SVR) is now a well-established method for estimating real-valued functions. However, the standard SVR is not effective at dealing with the severe outlier contamination of both response and predictor variables commonly encountered in numerous real applications. In this paper, we present a bounded influence SVR, which downweights the influence of outliers in all the regression variables. The proposed approach adopts an adaptive weighting strategy, which is based on both a robust adaptive scale estimator for large regression residuals and the statistic of a "kernelized" hat matrix for leverage point removal. Thus, our algorithm has the ability to accurately extract the dominant subset in corrupted data sets. Simulated linear and nonlinear data sets show the robustness of our algorithm against outliers. Finally, chemical and astronomical data sets that exhibit severe outlier contamination are used to demonstrate the performance of the proposed approach in real situations.

4.
In regression models, the data are usually contaminated with unusual observations (outliers). For that reason, robust regression estimators have been developed over the last 30 years; among the best known are Least Trimmed Squares (LTS), MM, and Penalized Trimmed Squares (PTS). Most of these methods, especially PTS, rely on an initial leverage estimate for the x-outlying observations of the data sample. However, multiple x-outliers often pull the distance measure towards their own values, causing leverage bias; this is the masking problem. In this work we develop a new algorithm for robust leverage estimation based on Least Trimmed Euclidean Deviations (LTED). Extensive Monte-Carlo simulations, with varying types of outliers and degrees of contamination, indicate that the LTED procedure successfully identifies multiple outliers, and that the resulting robust leverage significantly improves the performance of PTS.
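The trimming idea behind LTS-style estimators can be sketched with random starts and concentration steps (a simplified FAST-LTS-style illustration, not the paper's LTED algorithm; subset sizes and the toy data are assumptions):

```python
import numpy as np

def lts_fit(X, y, h=None, n_starts=20, n_csteps=10, seed=0):
    """Least Trimmed Squares via random starts plus concentration steps.

    Fit OLS to a random elemental subset, keep the h observations
    with the smallest squared residuals, refit, iterate; return the
    fit with the best trimmed objective."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    h = h or (n + p + 1) // 2
    best_beta, best_obj = None, np.inf
    for _ in range(n_starts):
        idx = rng.choice(n, size=p, replace=False)
        beta = np.linalg.lstsq(X[idx], y[idx], rcond=None)[0]
        for _ in range(n_csteps):
            keep = np.argsort((y - X @ beta) ** 2)[:h]   # concentration step
            beta = np.linalg.lstsq(X[keep], y[keep], rcond=None)[0]
        obj = np.sort((y - X @ beta) ** 2)[:h].sum()     # trimmed objective
        if obj < best_obj:
            best_beta, best_obj = beta, obj
    return best_beta

x = np.linspace(0, 1, 50)
y = 2 * x
y[:5] += 20                                  # gross y-outliers (10%)
X = np.column_stack([np.ones_like(x), x])
print(lts_fit(X, y))                         # close to [0, 2]
```

Because only the h smallest residuals enter the objective, the gross outliers are excluded from the fit entirely rather than merely down-weighted.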

5.
Robust TSK fuzzy modeling for function approximation with outliers
The Takagi-Sugeno-Kang (TSK) type of fuzzy model has attracted great attention in the fuzzy modeling community due to its good performance in various applications. Most approaches for modeling TSK fuzzy rules define their fuzzy subspaces based on the idea of training data being close enough, rather than having similar functions. Besides, training data sets often contain outliers, which seriously affect clustering and learning algorithms based on least-squares error minimization. A robust TSK fuzzy modeling approach is presented. In the approach, a clustering algorithm termed robust fuzzy regression agglomeration (RFRA) is proposed to define fuzzy subspaces in a fuzzy regression manner with robustness against outliers. To obtain a more precise model, a robust fine-tuning algorithm is then employed. Various examples are used to verify the effectiveness of the proposed approach. The simulation results show that the proposed robust TSK fuzzy modeling indeed achieves superior performance over other approaches.

6.
In this paper, we propose a robust Kalman filter and smoother for errors-in-variables (EIV) state-space models subject to observation noise with outliers. We introduce the EIV problem with outliers and then present the minimum covariance determinant (MCD) estimator, a highly robust estimator in terms of protecting the estimate from outliers. We then propose a randomized algorithm to find the MCD estimate. Because the uniform sampling method has a high computational cost and may lead to biased estimates, we apply a sub-sampling method instead. A Monte Carlo simulation shows the efficiency of the proposed algorithm. Copyright © 2011 John Wiley and Sons Asia Pte Ltd and Chinese Automatic Control Society
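The randomized sub-sampling search for an MCD estimate can be sketched as follows (an illustrative simplification with concentration steps; the subset size, trial count, and toy data are assumptions, not the paper's exact algorithm):

```python
import numpy as np

def mcd_estimate(X, h=None, n_trials=100, seed=0):
    """Minimum covariance determinant via random sub-sampling.

    Repeatedly draw h points, iterate toward the h points with the
    smallest Mahalanobis distances, and keep the subset whose
    covariance has the smallest determinant."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    h = h or (n + p + 1) // 2
    best = (np.inf, None, None)
    for _ in range(n_trials):
        idx = rng.choice(n, size=h, replace=False)
        for _ in range(5):                         # concentration steps
            mu = X[idx].mean(axis=0)
            Sinv = np.linalg.inv(np.cov(X[idx].T))
            d = np.einsum('ij,jk,ik->i', X - mu, Sinv, X - mu)
            idx = np.argsort(d)[:h]                # h most central points
        det = np.linalg.det(np.cov(X[idx].T))
        if det < best[0]:
            best = (det, X[idx].mean(axis=0), np.cov(X[idx].T))
    return best[1], best[2]

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(size=(85, 2)),
               rng.normal(size=(15, 2)) + 8.0])    # 15% outlier cluster
mu, S = mcd_estimate(X)
print(mu)                                          # near the clean centre
```

The determinant criterion prefers the tightest half-sample, so the shifted outlier cluster is excluded from the location and scatter estimates.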

7.
The annealing robust backpropagation (ARBP) learning algorithm
Multilayer feedforward neural networks are often referred to as universal approximators. Nevertheless, if the training data are corrupted by large noise, such as outliers, traditional backpropagation learning schemes may not always deliver acceptable performance. Even though various robust learning algorithms have been proposed in the literature, those approaches still suffer from an initialization problem. In those robust learning algorithms, the so-called M-estimator is employed, whose loss function serves to discriminate outliers from the majority by degrading their effect on learning. However, the loss function used in those algorithms may not discriminate correctly against outliers. In this paper, the annealing robust backpropagation (ARBP) learning algorithm, which adopts the annealing concept into robust learning, is proposed to deal with modeling in the presence of outliers. The proposed algorithm has been employed in various examples, and the results all demonstrate its superiority over other robust learning algorithms regardless of outliers. Beyond adopting the annealing concept into robust learning, the annealing schedule k/t was found experimentally to achieve the best performance among the schedules tested, where k is a constant and t is the epoch number.
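The k/t annealing schedule can be illustrated on a linear model (a toy IRLS sketch, not the paper's backpropagation algorithm; the Cauchy weight function and the data are illustrative assumptions):

```python
import numpy as np

def anneal_robust_fit(X, y, k=10.0, epochs=30):
    """Annealing robust regression with the k/t schedule.

    The M-estimator scale beta = k/t is large early (near least
    squares) and shrinks with epoch t, suppressing outliers ever
    harder as training proceeds."""
    theta = np.linalg.lstsq(X, y, rcond=None)[0]       # least-squares start
    for t in range(1, epochs + 1):
        beta = k / t                                   # annealing schedule k/t
        r = y - X @ theta
        w = 1.0 / (1.0 + (r / beta) ** 2)              # bounded influence
        sw = np.sqrt(w)
        theta = np.linalg.lstsq(sw[:, None] * X, sw * y, rcond=None)[0]
    return theta

x = np.linspace(-1, 1, 40)
y = 3 * x + 1
y[::10] -= 15                                          # inject outliers
X = np.column_stack([np.ones_like(x), x])
print(anneal_robust_fit(X, y))                         # near [1, 3]
```

Early epochs with a large scale avoid the initialization problem (no point is rejected prematurely); late epochs with a small scale reject the outliers almost completely.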

8.
The two most commonly used types of artificial neural networks (ANNs) are multilayer feed-forward ANNs and multiplicative neuron model ANNs. In the literature, although there is a robust learning algorithm for the former, there is no such algorithm for the latter. Because of its multiplicative structure, the performance of the multiplicative neuron model ANN is affected negatively when the dataset contains outliers. To address this issue, a robust learning algorithm for multiplicative neuron model ANNs is proposed that uses Huber's loss function as the fitness function. The training of the multiplicative neuron model is performed using particle swarm optimization. One principal advantage of this algorithm is that the parameter of the scale estimator, an important factor affecting the value of Huber's loss function, is also estimated by the proposed algorithm. To evaluate the proposed method, it is applied to two well-known real-world time series datasets, and a simulation study is also performed. The algorithm shows superior performance on both the real-world time series datasets and the simulation study when compared with other ANNs reported in the literature. Another advantage is that, for datasets with outliers, the results are very close to those obtained from the original datasets. In other words, we demonstrate that the algorithm is unaffected by outliers and has a robust structure.

9.
Yang Yang, Tao Zhenghang, Qian Chen, Gao Yuchao, Zhou Hu, Ding Zhe, Wu Jinran. Applied Intelligence, 2022, 52(2): 1630-1652

Electric load forecasting has become crucial to the safe operation of power grids and cost reduction in the production of power. Although numerous electric load forecasting models have been proposed, most of them are still limited by poor effectiveness in model training and a sensitivity to outliers. The limitations of current methods may lead to extra operational costs of a power system or even disrupt its power distribution and network safety. To this end, we propose a new hybrid load-forecasting model, which is based on a robust extreme learning machine and an improved whale optimization algorithm. Specifically, the Huber loss, which is insensitive to outliers, is proposed as the objective function in extreme learning machine (ELM) training. In addition, an improved whale optimization algorithm is designed for the robust ELM training, in which a cellular automaton mechanism is used to enhance the local search. To verify our improved whale optimization algorithm, experiments were conducted on seven benchmark test functions. Due to the enhancement of the local search, the improved optimizer was around 7% superior to the basic version. Finally, our proposed hybrid forecasting model was validated on two real electric load datasets (Nanjing and New South Wales), and the experimental results confirmed that the proposed hybrid load-forecasting model achieves satisfying improvements on both datasets.


10.
A Newton Algorithm for Solving Nonlinear Regression Problems
For large-scale nonlinear regression problems, a Newton algorithm based on a static reservoir is proposed. The reservoir is used to build a high-dimensional feature space, transforming the original problem into a linear support vector regression problem whose size depends on the reservoir dimension, which is then solved with the Newton algorithm. The use of a robust loss function suppresses the interference of outliers on the prediction results. Comparisons with SVR (Support Vector Regression) and the reservoir Tikhonov regularization method verify the speed, high prediction accuracy, and good robustness of the proposed method.

11.
Noise sensitivity is a well-known issue of the AdaBoost algorithm. Previous works show that AdaBoost is prone to overfitting on noisy data sets because it consistently assigns high weights to hard-to-learn instances (mislabeled instances or outliers). In this paper, a new boosting approach, named noise-detection based AdaBoost (ND-AdaBoost), is proposed to combine classifiers by emphasizing misclassified noisy instances and correctly classified non-noisy instances during training. Specifically, the algorithm integrates a noise-detection based loss function into AdaBoost to adjust the weight distribution at each iteration. Both a k-nearest-neighbor (k-NN) and an expectation-maximization (EM) based evaluation criterion are constructed to detect noisy instances. Further, a regeneration condition is presented and analyzed to control the ensemble training error bound of the proposed algorithm, which provides theoretical support. Finally, experiments on selected binary UCI benchmark data sets demonstrate that the proposed algorithm is more robust than standard and other variants of AdaBoost on noisy data sets.
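A minimal version of the k-NN noise-detection criterion can be sketched as follows (an illustrative simplification: an instance is suspected noisy when most of its k nearest neighbours carry a different label; the data and threshold are assumptions):

```python
import numpy as np

def knn_noise_flags(X, y, k=5):
    """Flag likely mislabeled instances via k-NN disagreement."""
    X = np.asarray(X, float)
    y = np.asarray(y)
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)            # exclude the point itself
    nn = np.argsort(d, axis=1)[:, :k]      # k nearest neighbours
    disagree = (y[nn] != y[:, None]).mean(axis=1)
    return disagree > 0.5                  # True -> suspected noisy label

# Two clean clusters; the third point gets a flipped label.
X = np.array([[0.0], [0.1], [0.2], [1.0], [1.1], [1.2]])
y = np.array([0, 0, 1, 1, 1, 1])
print(knn_noise_flags(X, y, k=2))          # flags only the flipped label
```

ND-AdaBoost then adjusts the weight distribution so that instances flagged this way are not boosted round after round.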

12.
Recurrent neural networks and robust time series prediction
We propose a robust learning algorithm and apply it to recurrent neural networks. This algorithm is based on filtering outliers from the data and then estimating parameters from the filtered data. The filtering removes outliers from both the target function and the inputs of the neural network. The filtering is soft in that some outliers are neither completely rejected nor accepted. To show the need for robust recurrent networks, we compare the predictive ability of least squares estimated recurrent networks on synthetic data and on the Puget Power Electric Demand time series. These investigations result in a class of recurrent neural networks, NARMA(p,q), which show advantages over feedforward neural networks for time series with a moving average component. Conventional least squares methods of fitting NARMA(p,q) neural network models are shown to suffer a lack of robustness towards outliers. This sensitivity to outliers is demonstrated on both the synthetic and real data sets. Filtering the Puget Power Electric Demand time series is shown to automatically remove the outliers due to holidays. Neural networks trained on filtered data are then shown to give better predictions than neural networks trained on unfiltered time series.
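The idea of "soft" filtering, where outliers are neither fully kept nor fully dropped, can be sketched on a univariate series (an illustrative stand-in using a running median and a smooth shrinkage, not the paper's filter; window, constant `c`, and data are assumptions):

```python
import numpy as np

def soft_filter(series, window=5, c=3.0):
    """Soft outlier filtering: small deviations pass through,
    gross spikes are pulled back toward the local level, and
    borderline points are only partly corrected."""
    s = np.asarray(series, float)
    pad = window // 2
    padded = np.pad(s, pad, mode='edge')
    med = np.array([np.median(padded[i:i + window]) for i in range(len(s))])
    r = s - med
    scale = 1.4826 * np.median(np.abs(r)) + 1e-12  # robust MAD scale
    shrink = np.tanh(r / (c * scale))              # soft, bounded correction
    return med + c * scale * shrink

t = np.linspace(0, 3, 60)
clean = np.sin(t)
noisy = clean.copy()
noisy[10] += 10.0                                  # a gross spike
smoothed = soft_filter(noisy)
print(smoothed[10], clean[10])                     # spike pulled back to the curve
```

For small residuals `tanh` is nearly the identity, so normal points are untouched; for a gross spike the correction saturates at the local level plus a bounded margin.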

13.
Yao Da, Zhou Jun, Xue Zhi. 《计算机工程》 (Computer Engineering), 2011, 37(20): 183-185
Traditional robust algorithms for estimating computer-vision models suffer from limited estimation accuracy and stability. To address this, a strongly robust genetic consensus estimation algorithm is proposed, combining the global optimality of genetic algorithms with the special structure of geometric model estimation, to estimate computer-vision geometric models under various error levels and outlier probabilities. Simulation results show that, compared with robust algorithms such as RANSAC, MAPSAC and MLESAC, the proposed algorithm achieves better estimation accuracy and robustness.

14.
In this paper we present a new approach to boosting methods for the construction of classifier ensembles. The approach uses the distribution given by the weighting scheme of boosting to construct a non-linear supervised projection of the original variables, instead of using the instance weights to train the next classifier. With this method we construct ensembles that achieve a better generalization error and are more robust to the presence of noise. It has been proved that AdaBoost is able to improve the margin of the instances achieved by the ensemble, and its practical success has been partially explained by this margin-maximization property. However, in noisy problems, likely to occur in real-world applications, maximizing the margin of wrong instances or outliers can lead to poor generalization. We propose an alternative approach, in which the distribution of the weights given by the boosting algorithm is used to obtain a supervised projection; the supervised projection is then used to train the next classifier using a uniform distribution of the training instances. The proposed approach is compared with three boosting techniques, namely AdaBoost, GentleBoost and MadaBoost, showing improved performance on a large set of 55 problems from the UCI Machine Learning Repository and less sensitivity to noise in the class labels. The behavior of the proposed algorithm in terms of margin distribution and bias-variance decomposition is also studied.

15.
Applied Soft Computing, 2007, 7(3): 957-967
In this study, CPBUM neural networks with an annealing robust learning algorithm (ARLA) are proposed to overcome the problems of conventional neural networks in modeling with outliers and noise. In real applications, the obtained training data may contain outliers and noise. Although CPBUM neural networks converge quickly, they have difficulty dealing with outliers and noise; hence, their robustness must be enhanced. Additionally, the ARLA can overcome the initialization and cut-off-point problems of traditional robust learning algorithms and handle models with outliers and noise. In this study, the ARLA is used as the learning algorithm to adjust the weights of the CPBUM neural networks. It turns out that CPBUM neural networks with the ARLA converge quickly and are more robust against outliers and noise than conventional neural networks with a robust mechanism. Simulation results are provided to show the validity and applicability of the proposed neural networks.

16.
In this study, a robust wavelet neural network (WNN) is proposed to approximate functions with outliers. In the proposed methodology, a support vector machine with a wavelet kernel function (WSVM) is first adopted to determine the initial translations and dilations of the wavelet kernel and the weights of the WNN. Then, an adaptive annealing learning algorithm (AALA) is adopted to tune the translations, dilations, and weights of the WNN. In the learning procedure, the AALA overcomes the initialization and cut-off-point problems of robust learning algorithms. Hence, once an initial structure of the WNN is determined by the support vector regression (SVR) approach, the WNN with AALA (AALA-WNN) has fast convergence speed and is robust against outliers. Two examples are simulated to verify the feasibility and efficiency of the proposed algorithm.

17.
In this paper, a novel robust maximum entropy clustering algorithm, RMEC, is presented as an improved version of the maximum entropy clustering algorithm MEC [2–4], overcoming MEC's drawbacks: it is very sensitive to outliers and cannot easily label them. RMEC incorporates Vapnik's ε-insensitive loss function and the new concept of weight factors into its objective function, and its new update rules are derived accordingly from Lagrangian optimization theory. Compared with MEC, the main contributions of RMEC lie in its much better robustness to outliers and in the fact that it can effectively label outliers in the dataset using the obtained weight factors. Our experimental results demonstrate its superior performance in enhancing robustness and labeling outliers.
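Vapnik's ε-insensitive loss, and the way weight factors can label outliers, can be sketched as follows (the exponential weight form here is an illustrative assumption, not RMEC's exact Lagrangian-derived update):

```python
import numpy as np

def eps_insensitive(r, eps=0.1):
    """Vapnik's eps-insensitive loss: zero inside the tube |r| <= eps,
    linear outside, so gross outliers grow the objective only linearly."""
    return np.maximum(np.abs(r) - eps, 0.0)

def outlier_weights(d, eps=0.1, tau=1.0):
    """Illustrative weight factors: points far from every cluster
    centre (large loss d) receive weights near 0 and can be
    labelled as outliers."""
    return np.exp(-eps_insensitive(d, eps) / tau)

print(eps_insensitive(np.array([0.05, 1.1])))     # inside vs. outside the tube
print(outlier_weights(np.array([0.05, 5.0])))     # inlier weight 1, outlier near 0
```

Low weights serve double duty: they shrink an outlier's pull on the cluster centres and provide a direct label for which points are outliers.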

18.
Yang Jun, Zhu Changqian, Peng Qiang. 《计算机应用》 (Journal of Computer Applications), 2006, 26(3): 582-585
For point-based models, filtering algorithms based on two robust statistical methods, forward search and mean shift, are proposed. The forward-search algorithm automatically detects outliers from the residual plot and partitions the input point-cloud data into multiple optimal local denoising neighbourhoods free of outliers. Weighted covariance analysis of each local neighbourhood yields its least-squares fitting plane. The kernel density function of each sample point is estimated within its local neighbourhood, and its local maximum is computed with the mean-shift algorithm; these local maxima of the kernel density function determine the cluster centres of the point cloud and closely approximate the sampled surface. Shifting every sample point to the local maximum of its density function makes the point-cloud surface converge to a stable three-dimensional digital model. Experimental results show that the algorithm is robust: it effectively removes surface noise from point models while preserving the sharp features of the model surface.
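The mean-shift step, moving each point toward a local maximum of the kernel density estimate, can be sketched as follows (a minimal fixed-kernel version on a toy point set; the forward search and weighted covariance plane fit from the paper are omitted):

```python
import numpy as np

def mean_shift_denoise(points, bandwidth=0.5, iters=20):
    """Shift each point to the Gaussian-kernel-weighted mean of the
    original point set, repeatedly; points converge to local modes
    of the kernel density estimate."""
    pts = np.asarray(points, float)
    shifted = pts.copy()
    for _ in range(iters):
        for i, p in enumerate(shifted):
            d2 = ((pts - p) ** 2).sum(axis=1)
            w = np.exp(-d2 / (2 * bandwidth ** 2))  # Gaussian kernel
            shifted[i] = (w[:, None] * pts).sum(axis=0) / w.sum()
    return shifted

pts = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
                [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]])
print(mean_shift_denoise(pts, bandwidth=0.5))       # each triple collapses to its mode
```

Each iteration is O(n²); in practice a spatial index restricts the kernel sum to the local neighbourhood, as the paper's per-neighbourhood formulation does.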

19.
A New Discriminative Training Method for Language Models
Existing discriminative training methods such as Boosting require the loss function to be differentiable for the sake of efficiency, but such loss functions cannot express the most direct optimization objective, while loss functions defined from the direct objective are usually non-differentiable step functions. To solve this problem, this paper proposes a new discriminative training method, GAP (Greedy Approximation Processing). The method is highly general: any loss function in step-function form can be trained with it. Since step-form loss functions are non-differentiable, gradient descent cannot be used to find extrema and obtain feature weights. GAP therefore uses a greedy strategy, selecting features from the feature set sequentially and determining each weight by exhaustive search. To speed up GAP, the authors introduce an independence assumption between features and fix the feature update order, yielding an improved algorithm, FGAP (Fast Greedy Approximation Processing). To demonstrate the effectiveness of FGAP, models trained with it are applied to a Japanese input method. Experimental results show that language models trained with FGAP outperform those trained with Boosting, with a 15%–19% relative error-rate reduction over the baseline model.
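The greedy exhaustive-search weight fitting can be illustrated with a toy step-loss trainer (a much simplified sketch: the feature matrix, weight grid, and 0/1 loss here are illustrative assumptions, not the paper's language-model setup):

```python
import numpy as np

def greedy_step_loss_train(F, y, grid=np.linspace(-2, 2, 41)):
    """Greedy weight search for a non-differentiable step loss.

    Features are visited in a fixed order (the FGAP simplification);
    each weight is chosen by exhaustive search over a grid to
    minimize the error count, holding the other weights fixed."""
    n, m = F.shape
    w = np.zeros(m)
    for j in range(m):                           # fixed update order
        best_err, best_v = np.inf, 0.0
        for v in grid:                           # exhaustive search on one weight
            w[j] = v
            err = np.mean(np.sign(F @ w) != y)   # step loss: error rate
            if err < best_err:
                best_err, best_v = err, v
        w[j] = best_v
    return w

F = np.random.default_rng(1).normal(size=(200, 3))
y = np.sign(F[:, 0] - F[:, 1])                   # third feature is pure noise
w = greedy_step_loss_train(F, y)
print(w, np.mean(np.sign(F @ w) != y))
```

No gradient of the loss is ever taken: the error count is simply evaluated at each grid point, which is exactly why step-shaped objectives pose no problem for this scheme.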

20.
Huang Yuanyuan, Fu Yan. 《计算机科学》 (Computer Science), 2005, 32(12): 191-192
By modifying the forgetting function in the RCA algorithm, this paper suppresses the pathological divergence that arises in the inter-class competitive iteration, thereby achieving robust convergence. The data analysed with this algorithm are then approximated with a TSK fuzzy model; simulations show that the method effectively rejects the interference of noise and isolated points (outliers) on the system approximation.
