Similar literature
 20 similar articles found
1.
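Boosting-type algorithms generate their individual networks iteratively, so the networks are highly correlated, and the resulting ensembles perform poorly for some unstable learning algorithms. Building on the Local Boost idea of adjusting sample weights by local error, this paper proposes a method that selects neighbor samples based on distance and sample weights, generates training-sample seeds from the local error, and uses Lazy Bagging to build, for each seed, a training set from which a new individual network is trained. Experiments on UCI data sets show that the individual networks obtained are less correlated and that ensemble performance is more stable.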

2.
A maximum variance cluster algorithm   Total citations: 3, self-citations: 0, citations by others: 3
We present a partitional cluster algorithm that minimizes the sum-of-squared-error criterion while imposing a hard constraint on the cluster variance. Conceptually, hypothesized clusters act in parallel and cooperate with their neighboring clusters in order to minimize the criterion and to satisfy the variance constraint. In order to enable the demarcation of the cluster neighborhood without crucial parameters, we introduce the notion of foreign cluster samples. Finally, we demonstrate a new method for cluster tendency assessment based on varying the variance constraint parameter.
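
The two quantities named in this abstract, the sum-of-squared-error criterion and the per-cluster variance constraint, can be illustrated with a minimal sketch. It assumes that cluster variance means the mean squared distance of a cluster's points to its centroid; the function names and the toy data are hypothetical, and the sketch only evaluates a given partition rather than reproducing the paper's optimization procedure.

```python
def centroid(points):
    """Coordinate-wise mean of a non-empty list of equal-length tuples."""
    n = len(points)
    dim = len(points[0])
    return tuple(sum(p[d] for p in points) / n for d in range(dim))


def cluster_sse(points):
    """Sum of squared Euclidean distances from each point to the centroid."""
    c = centroid(points)
    return sum(sum((x - m) ** 2 for x, m in zip(p, c)) for p in points)


def cluster_variance(points):
    """Assumed definition: mean squared distance to the centroid."""
    return cluster_sse(points) / len(points)


def total_sse(clusters):
    """Sum-of-squared-error criterion over a whole partition."""
    return sum(cluster_sse(c) for c in clusters)


def satisfies_variance_constraint(clusters, max_variance):
    """Hard constraint: every cluster's variance must stay below the bound."""
    return all(cluster_variance(c) <= max_variance for c in clusters)


# Toy partition with two clusters of two points each (illustrative data only).
clusters = [[(0.0, 0.0), (2.0, 0.0)], [(10.0, 10.0), (10.0, 12.0)]]
sse = total_sse(clusters)
ok_loose = satisfies_variance_constraint(clusters, 1.5)
ok_tight = satisfies_variance_constraint(clusters, 0.5)
```

Tightening `max_variance` makes the same partition infeasible, which is the parameter the abstract proposes varying for cluster tendency assessment.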

3.
A local boosting algorithm for solving classification problems   Total citations: 1, self-citations: 0, citations by others: 1
Based on the boosting-by-resampling version of Adaboost, a local boosting algorithm for dealing with classification tasks is proposed in this paper. Its main idea is that in each iteration, a local error is calculated for every training instance and a function of this local error is utilized to update the probability that the instance is selected to be part of the next classifier's training set. When classifying a novel instance, the similarity information between it and each training instance is taken into account. Meanwhile, a parameter is introduced into the process of updating the probabilities assigned to training instances so that the algorithm can be more accurate than Adaboost. The experimental results on synthetic and several benchmark real-world data sets available from the UCI repository show that the proposed method improves the prediction accuracy and the robustness to classification noise of Adaboost. Furthermore, the diversity-accuracy patterns of the ensemble classifiers are investigated by kappa-error diagrams.
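
For orientation, the baseline that this abstract builds on can be sketched as follows. This is the standard AdaBoost.M1 reweighting step combined with resampling, not the paper's local variant: the local-error function and the extra parameter are not specified in the abstract, so they are deliberately left out. The function names are hypothetical.

```python
import random


def adaboost_m1_update(weights, correct):
    """One standard AdaBoost.M1 step: shrink the weights of correctly
    classified instances by beta = eps / (1 - eps), then renormalize."""
    eps = sum(w for w, ok in zip(weights, correct) if not ok)
    if not 0.0 < eps < 0.5:
        raise ValueError("weighted error must lie strictly between 0 and 0.5")
    beta = eps / (1.0 - eps)
    scaled = [w * beta if ok else w for w, ok in zip(weights, correct)]
    total = sum(scaled)
    return [w / total for w in scaled]


def resample_indices(weights, rng):
    """Boosting by resampling: draw the next training set (as indices)
    with selection probabilities given by the current weights."""
    n = len(weights)
    return rng.choices(range(n), weights=weights, k=n)


# Four instances with uniform weights; only the last one was misclassified.
weights = [0.25, 0.25, 0.25, 0.25]
correct = [True, True, True, False]
updated = adaboost_m1_update(weights, correct)
next_training_set = resample_indices(updated, random.Random(0))
```

After the update, the misclassified instances carry half of the total probability mass, so they are far more likely to appear in the next classifier's training set. A local variant would replace the global right/wrong signal with a per-instance local error.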

4.
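In data reconciliation, the variance-covariance matrix of the measurement errors (also called the Q matrix) is usually specified in advance by operators according to instrument accuracy. Because changes in instrument accuracy are not taken into account, this can easily make the data inconsistent or inaccurate. Based on the constraint residuals that arise from spatial redundancy, this paper proposes a method for estimating the measurement variance-covariance matrix Q that effectively reduces the dependence on prior knowledge of Q. An application scheme for the estimation method is proposed for common nonlinear or bilinear problems, and a sequential compensation method for detecting gross errors is given. Finally, an application example from a coking plant shows that the method is effective for obtaining an initial estimate of the Q matrix and for detecting gross errors.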
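
To make the role of the Q matrix concrete, here is a minimal sketch of textbook weighted least-squares data reconciliation, not the estimation method proposed in this paper. It assumes a diagonal Q (independent measurement errors) and a single linear balance constraint a·x = 0, in which case the reconciled values are x̂ = y − Q a (a·y) / (aᵀ Q a). The function name and the flow values are hypothetical.

```python
def reconcile(measured, variances, constraint):
    """Weighted least-squares reconciliation for one linear constraint
    sum(a_i * x_i) = 0 with a diagonal variance-covariance matrix Q."""
    # Constraint residual of the raw measurements (zero if already consistent).
    residual = sum(a * y for a, y in zip(constraint, measured))
    # a^T Q a for diagonal Q.
    scale = sum(a * a * q for a, q in zip(constraint, variances))
    return [y - q * a * residual / scale
            for y, q, a in zip(measured, variances, constraint)]


# One inlet stream splitting into two outlet streams: x0 - x1 - x2 = 0.
measured = [10.0, 6.0, 3.0]        # raw flow measurements (illustrative)
variances = [0.4, 0.1, 0.1]        # diagonal of Q: the inlet meter is least accurate
constraint = [1.0, -1.0, -1.0]
reconciled = reconcile(measured, variances, constraint)
balance = sum(a * x for a, x in zip(constraint, reconciled))
```

The measurement with the largest variance absorbs the largest correction, which is why a mis-specified Q distorts the reconciled data and why estimating Q from constraint residuals is attractive.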

5.
This paper presents a novel online learning method for automatically detecting anatomic structures in medical images. Conventional off-line learning methods require collecting a complete set of representative samples prior to training a detector. Once the detector is trained, its performance is fixed. To improve the performance, the detector must be completely retrained, demanding the maintenance of historical training samples. Our proposed online approach eliminates the need for storing historical training samples and is capable of continually improving performance with new samples. We evaluate our approach with three distinct thoracic structures, demonstrating that our approach yields performance competitive with the off-line approach. Furthermore, we investigate the properties of our proposed method in comparison with an online learning method suggested by Grabner and Bischof (IEEE Computer Society Conference on Computer Vision and Pattern Recognition 2006, vol. 1, pp. 260–267, 2006), which is the state of the art, indicating that our proposed method runs faster, offers more stability, improves handling of “catastrophic forgetting”, and simultaneously achieves a satisfactory level of adaptability. The enhanced performance is attributed to our novel online learning structure coupled with more accurate weaker learners based on histograms.

6.
7.
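Feature weighting is the foundation of text classification. Traditional probability-based feature weighting algorithms usually only count term frequency, inverse document frequency, inverse class frequency and similar statistics, and ignore the relationships between classes. For multi-class problems, however, the relationships between classes matter to these statistics. To address this shortcoming, this paper proposes a feature weighting algorithm based on class variance, which measures the relationship between classes by computing the variance of the per-class document frequencies, and runs classification experiments with five feature weighting algorithms on the Sogou news data set. The results show that, compared with the other four feature weighting algorithms, the proposed algorithm achieves considerably higher macro-averaged and micro-averaged F1, improving text classification.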
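
The core statistic described in this abstract, the variance of a term's document frequency across classes, can be sketched as below. This is only an illustration under assumptions: document frequency is normalized per class, the population variance is used, and the way the paper combines this variance with other weights is not given in the abstract and is therefore omitted. The function name and the toy corpus are hypothetical.

```python
from statistics import pvariance


def class_df_variance(term, docs_by_class):
    """Variance, across classes, of the fraction of documents in each
    class that contain the term (each document is a set of tokens)."""
    ratios = []
    for docs in docs_by_class.values():
        containing = sum(1 for doc in docs if term in doc)
        ratios.append(containing / len(docs))
    return pvariance(ratios)


# Tiny illustrative corpus: three classes, two documents each.
docs_by_class = {
    "sports": [{"match", "goal", "news"}, {"goal", "team", "news"}],
    "finance": [{"stock", "market", "news"}, {"bank", "rate", "news"}],
    "tech": [{"chip", "phone", "news"}, {"phone", "app", "news"}],
}
var_goal = class_df_variance("goal", docs_by_class)  # concentrated in one class
var_news = class_df_variance("news", docs_by_class)  # spread evenly over classes
```

A term concentrated in one class yields a large variance, while a term that appears equally often in every class yields zero, so the variance behaves as a class-discrimination signal.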

8.
In urban areas, GPS signals are often reflected or blocked by buildings, which causes multipath effects and non-line-of-sight (NLOS) reception, respectively, and consequently degrades GPS positioning performance. While improved receiver design can reduce the effect of multipath to some extent, it cannot deal with NLOS. Modelling methods based on measurements have shown promise in reducing the effect of NLOS signal reception. However, this depends on their ability to accurately and reliably classify line-of-sight (LOS), multipath and NLOS signals. The traditional method is based on one feature, signal strength as measured by the carrier-to-noise ratio, C/N0. However, this feature is ineffective in capturing the characteristics of multipath and NLOS in all environments. In this paper, to improve the accuracy of signal reception classification, we use the three features of C/N0, pseudorange residuals and satellite elevation angle with a gradient boosting decision tree (GBDT) based classification algorithm. Experiments are carried out to compare the proposed algorithm with classifiers based on decision tree, distance weighted k-nearest neighbour (KNN) and the adaptive network-based fuzzy inference system (ANFIS). Test results from static receivers in urban environments show that the GBDT based algorithm achieves a classification accuracy of 100%, 82% and 86% for LOS, multipath and NLOS signals, respectively. This is superior to the other three algorithms with the corresponding results of 100%, 82% and 84% for the Distance-Weighted KNN, 99%, 70% and 65% for the ANFIS and 98%, 35% and 95% for the traditional decision tree. With the NLOS detection and exclusion, the proposed GBDT with multi-feature based method can provide a positioning accuracy improvement of 34.1% compared to the traditional C/N0 based method.

9.
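Face detection based on a cascaded Boosting method   Total citations: 2, self-citations: 0, citations by others: 2
朱文球 (Zhu Wenqiu), 罗三定 (Luo Sanding). 《计算机应用》 (Journal of Computer Applications), 2005, 25(9): 2128-2130
A face detection algorithm based on a cascaded Boosting method is proposed. PCA is first used to extract feature parameters from face images. On this basis, the historical information learned by each Boosting classifier in the algorithm is used, together with a linear-regression-based feature elimination (RFE) strategy, to remove redundancy in AdaBoost, and the result is used to decide whether an image is a face image. Simulation experiments on the ORL face image database show that the method clearly improves detection performance, demonstrating that the algorithm is effective.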

10.
The effect of regularization on variance error   Total citations: 2, self-citations: 0, citations by others: 2
This note addresses the problem of quantifying the effect of noise-induced error (so-called "variance error") in system estimates found via a regularised cost criterion. It builds on recent work by the authors in which expressions for nonregularised criteria are derived which are exact for finite model order. Those new expressions were established to be very different to previous quantifications that are widely used but based on asymptotic in model order arguments. A key purpose of this note is to expose a rapprochement between these new finite model order quantifications and the pre-existing asymptotic model order quantifications. In so doing, a further new result is established, namely that variance error in the frequency domain is dependent on the choice of the point about which regularization is effected.

11.
We present a method to improve the execution time used to build the roadmap in probabilistic roadmap planners. Our method intelligently deactivates some of the configurations during the learning phase and allows the planner to concentrate on those configurations that are most likely going to be useful when building the roadmap. The method can be used with many of the existing sampling algorithms. We ran tests with four simulated robot problems typical in robotics literature. The sampling methods applied were purely random, using Halton numbers, Gaussian distribution, and bridge test technique. In our tests, the deactivation method clearly improved the execution times. Compared with pure random selections, the deactivation method also significantly decreased the size of the roadmap, which is a useful property to simplify roadmap planning tasks.

12.
13.
Information Fusion, 2008, 9(1): 41-55
Ensemble methods for classification and regression have attracted a great deal of attention in recent years. They have been shown, both theoretically and empirically, to perform substantially better than single models in a wide range of tasks. We have adapted an ensemble method to the problem of predicting future values of time series using recurrent neural networks (RNNs) as base learners. The improvement is made by combining a large number of RNNs, each of which is generated by training on a different set of examples. This algorithm is based on the boosting algorithm, where difficult points of the time series are concentrated on during the learning process; however, unlike the original algorithm, we introduce a new parameter for tuning the boosting influence on available examples. We test our boosting algorithm for RNNs on single-step-ahead and multi-step-ahead prediction problems. The results are then compared to other regression methods, including those of different local approaches. The overall results obtained through our ensemble method are more accurate than those obtained through the standard method, backpropagation through time, on these datasets and are significantly better even when long-range dependencies play an important role.

14.
In this paper, each one-class problem is regarded as trying to estimate a function that is positive on a desired slab and negative on the complement. The main advantage of this viewpoint is that the loss function and the expected risk can be defined to ensure that the slab can contain as many samples as possible. Inspired by the nature of SVMs, the intuitive margin is also defined. As a result, a new linear optimization problem to maximize the margin and some theoretically motivated learning algorithms are obtained. Moreover, the proposed algorithms can be implemented by boosting techniques to solve nonlinear one-class classifications.

15.
We discuss robustness against mislabeling in multiclass labels for classification problems and propose two boosting algorithms, the normalized Eta-Boost.M and Eta-Boost.M, based on the Eta-divergence. Those two boosting algorithms are closely related to models of mislabeling in which the label is erroneously exchanged for others. For the two boosting algorithms, theoretical aspects supporting the robustness to mislabeling are explored. We apply the two proposed boosting methods to synthetic and real data sets to investigate the performance of these methods, focusing on robustness, and confirm the validity of the proposed methods.

16.
Artificial Intelligence Review - Retrieving relevant documents from a large set using the original query is a formidable challenge. A generic approach to improve the retrieval process is realized...

17.
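Collaborative filtering suffers from data sparsity, which degrades the accuracy of recommendation algorithms. A recommendation algorithm that fuses collaborative filtering with XGBoost is proposed; it mines the latent relationships between items and users from users' ratings of items and from the items' own characteristics, improving recommendation accuracy. Experiments were run on the Book-Crossings data set using Baidu's deep learning framework PaddlePaddle. The results show that the proposed algorithm is significantly more accurate than two algorithms from the literature.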

18.
In the past few years unlabeled examples and their potential advantage have received a lot of attention. In this paper a new boosting algorithm is presented where unlabeled examples are used to enforce agreement between several different learning algorithms. Not only do the learning algorithms learn from the given training set but they are supposed to do so while agreeing on the unlabeled examples. Similar ideas have been proposed before (for example, the Co-Training algorithm by Mitchell and Blum), but without a proof or under strong assumptions. In our setting, it is only assumed that all learning algorithms are equally adequate for the tasks. A new generalization bound is presented where the use of unlabeled examples results in a better ratio between training-set size and the resulting classifier's quality and thus reduces the number of labeled examples necessary for achieving it. The extent of this improvement depends on the diversity of the learners: a more diverse group of learners will result in a larger improvement, whereas using two copies of a single algorithm gives no advantage at all. As a proof of concept, the algorithm, named Agreement Boost, is applied to two test problems. In both cases, using Agreement Boost results in an up to 40% reduction in the number of labeled examples.

19.
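To better analyze the content and assess the quality of cotton fibers in blended fibers, a cotton fiber detection method is proposed that uses the Chain Boosting algorithm to raise detection accuracy. Cross-section images of the fibers are preprocessed, and indicators that characterize cotton fibers (including axiality (轴度), complexity, etc.) are extracted from them. A trained BP network then serves as the weak learner; the historical information learned by each Boosting classifier is used, together with a linear-regression-based feature elimination (RFE) strategy, to remove redundancy, and the result is used to decide whether a fiber cross-section image shows a cotton fiber. Experimental results show that the method is practical and effective and considerably improves detection accuracy.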

20.
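A soft set is a theory and tool for handling uncertain data and is commonly used in decision theory. Parameter reduction of a soft set means deleting redundant parameters that have almost no effect on the decision. Since the 0-1 linear programming algorithm was proposed, the parameter reduction problem for soft sets has been largely solved, but that algorithm is complex to implement and depends on an integer programming algorithm. Here, considering the practical application background of soft sets, soft sets are combined with probability theory to design a parameter reduction method for big-data settings, the successive variance method (方差辗转法), whose time complexity is O(m^2 n), whereas 0-1 linear programming is generally regarded as an NP-hard problem. The successive variance method is simple to implement. It performs poorly when the object set (universe) is small, no more than twice the size of the attribute set, but its efficiency rises as the object set grows and eventually surpasses that of the 0-1 linear programming algorithm across the board; it is even more efficient for soft sets with a high reduction density.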
