Similar Documents
20 similar documents found.
1.
Multi-Class Learning by Smoothed Boosting   (Cited: 1; self: 0; other: 1)
AdaBoost.OC has been shown to be an effective method for boosting "weak" binary classifiers for multi-class learning. It employs the Error-Correcting Output Code (ECOC) method to convert a multi-class learning problem into a set of binary classification problems, and applies the AdaBoost algorithm to solve them efficiently. One of the main drawbacks of the AdaBoost.OC algorithm is that it is sensitive to noisy examples and tends to overfit the training examples when they are noisy. In this paper, we propose a new boosting algorithm, named "MSmoothBoost", which introduces a smoothing mechanism into the boosting procedure to explicitly address the overfitting problem of AdaBoost.OC. We prove bounds for both the empirical training error and the margin training error of the proposed boosting algorithm. Empirical studies with seven UCI datasets and one real-world application indicate that the proposed boosting algorithm is more robust and effective than AdaBoost.OC for multi-class learning. Editor: Nicolo Cesa-Bianchi
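The ECOC reduction described in this abstract is easy to picture in code. The following is a rough sketch, not the paper's implementation: a random binary codebook, one AdaBoost learner per code bit, and nearest-codeword decoding. The dataset, codebook size, and all parameter values are assumptions made for the example.

```python
# ECOC-style reduction of a multi-class problem to binary boosting.
# Illustrative sketch only -- not the AdaBoost.OC code from the paper.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier

X, y = make_classification(n_samples=300, n_classes=4, n_informative=6,
                           random_state=0)
n_classes, n_bits = 4, 10
rng = np.random.default_rng(0)

# Random codebook: one binary codeword per class; re-draw any column that
# fails to split the classes into two non-empty groups.
codebook = np.zeros((n_classes, n_bits), dtype=int)
for b in range(n_bits):
    col = rng.integers(0, 2, size=n_classes)
    while col.min() == col.max():
        col = rng.integers(0, 2, size=n_classes)
    codebook[:, b] = col

# One binary AdaBoost learner per code bit.
learners = []
for b in range(n_bits):
    y_bin = codebook[y, b]   # relabel each example by bit b of its class codeword
    learners.append(AdaBoostClassifier(n_estimators=50, random_state=0).fit(X, y_bin))

# Decode: choose the class whose codeword is nearest in Hamming distance.
bits = np.column_stack([clf.predict(X) for clf in learners])
hamming = (bits[:, None, :] != codebook[None, :, :]).sum(axis=2)
print("training accuracy:", (hamming.argmin(axis=1) == y).mean())
```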

2.
Boosting is an effective means of improving the learning of base classifiers, yet research shows that its improvement over naive Bayes is insignificant. This paper proposes a new boosting algorithm, ActiveBoost. ActiveBoost combines active learning to mine the information carried by unlabeled samples, and introduces instability into the construction of the naive Bayes classifier. Experimental results on datasets from the UCI machine learning repository demonstrate the effectiveness of the algorithm.

3.
The Weighted Majority Algorithm   (Cited: 2; self: 0; other: 2)
We study the construction of prediction algorithms in a situation in which a learner faces a sequence of trials, with a prediction to be made in each, and the goal of the learner is to make few mistakes. We are interested in the case where the learner has reason to believe that one of some pool of known algorithms will perform well, but the learner does not know which one. A simple and effective method, based on weighted voting, is introduced for constructing a compound algorithm in such a circumstance. We call this method the Weighted Majority Algorithm. We show that this algorithm is robust in the presence of errors in the data. We discuss various versions of the Weighted Majority Algorithm and prove mistake bounds for them that are closely related to the mistake bounds of the best algorithms of the pool. For example, given a sequence of trials, if there is an algorithm in the pool that makes at most m mistakes, then the Weighted Majority Algorithm will make at most c(log n + m) mistakes on that sequence, where n is the number of algorithms in the pool and c is a fixed constant.
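As an illustration of the method (a minimal sketch, not the paper's exact formulation): keep one weight per pool member, predict by weighted vote, and multiply the weight of every mistaken member by a factor β after each trial. The toy predictions below are invented for the example.

```python
# Minimal Weighted Majority sketch: experts vote, and the weights of the
# experts that erred are multiplied by beta after each trial.
def weighted_majority(expert_preds, labels, beta=0.5):
    """expert_preds: list over trials of per-expert {0,1} predictions."""
    n = len(expert_preds[0])
    w = [1.0] * n
    mistakes = 0
    for preds, y in zip(expert_preds, labels):
        vote1 = sum(wi for wi, p in zip(w, preds) if p == 1)
        vote0 = sum(wi for wi, p in zip(w, preds) if p == 0)
        y_hat = 1 if vote1 >= vote0 else 0
        mistakes += (y_hat != y)
        # Multiplicative update: demote every expert wrong on this trial.
        w = [wi * beta if p != y else wi for wi, p in zip(w, preds)]
    return mistakes, w

# Tiny usage example: three experts over four trials.
preds = [[1, 0, 1], [0, 0, 1], [1, 1, 1], [0, 1, 0]]
labels = [1, 0, 1, 0]
print(weighted_majority(preds, labels))
```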

4.
The margin distribution is key to Boosting algorithms, but existing margin-distribution generalization error bounds are hard to compute, which limits the development of Boosting. To address this problem, this paper proposes MOBoost, a moment-optimization Boosting algorithm that directly optimizes the margin distribution. First, a Boosting generalization error bound based on the first and second moments of the margin distribution (the moment generalization bound for Boosting) is derived, directly characterizing the influence of the margin distribution on Boosting. Then, based on this bound, a moment criterion for Boosting is given, which maximizes the first moment of the margin distribution while minimizing its second moment. Finally, primal and dual forms of the convex quadratic optimization problem arising from the moment criterion are given, providing an effective computational method for the criterion. Theoretical analysis and experiments show that MOBoost is effective and reliable.

5.
Boosted Naive Bayes   (Cited: 8; self: 0; other: 8)
Wang Shi, Gao Wen. Computer Science (《计算机科学》), 2000, 27(4): 46-49
Naive Bayes is a supervised classification learning method. In theory, applying it presupposes that an example's attribute values are independent given its class. This assumption is too strict to hold in most practical applications; even so, naive Bayes has achieved great success even when the assumption is violated. Recently, an improved method for naive Bayes, boosting, has attracted wide attention, with AdaBoost as its principal algorithm. When AdaBoost is used to combine several naive Bayes classifiers, it is mathematically equivalent to a feedback neural network with sparse-coded inputs, a single layer of hidden nodes, and sigmoid activation functions.

6.
An Adaptive Version of the Boost by Majority Algorithm   (Cited: 6; self: 0; other: 6)
Freund, Yoav. Machine Learning, 2001, 43(3): 293-318
We propose a new boosting algorithm. This boosting algorithm is an adaptive version of the boost by majority algorithm and combines the bounded goals of the boost by majority algorithm with the adaptivity of AdaBoost. The method used for making boost-by-majority adaptive is to consider the limit in which each of the boosting iterations makes an infinitesimally small contribution to the process as a whole. This limit can be modeled using the differential equations that govern Brownian motion. The new boosting algorithm, named BrownBoost, is based on finding solutions to these differential equations. The paper describes two methods for finding approximate solutions to the differential equations. The first is a method that results in a provably polynomial time algorithm. The second method, based on the Newton-Raphson minimization procedure, is much more efficient in practice but is not known to be polynomial.
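The second solver mentioned in the abstract relies on standard Newton-Raphson iteration. The snippet below shows only the generic procedure, as a sketch; it is not BrownBoost's actual update, and the example equation is chosen arbitrarily.

```python
# Generic Newton-Raphson root finding -- the numerical procedure underlying
# BrownBoost's second (practical) solver. Illustrative only.
def newton_raphson(f, df, x0, tol=1e-10, max_iter=100):
    x = x0
    for _ in range(max_iter):
        step = f(x) / df(x)   # assumes df(x) != 0 along the iteration path
        x -= step
        if abs(step) < tol:
            break
    return x

# Example: solve x**3 - 2 = 0, i.e. compute the cube root of 2.
print(newton_raphson(lambda x: x**3 - 2, lambda x: 3 * x**2, x0=1.0))
```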

7.
This paper first analyzes the properties of the sample-confidence measure used by noise-reduction ensemble algorithms and explains why that function is ill-suited to multi-class problems. A more targeted confidence measure is then designed, and based on it an enhanced noise-reduction parameter-ensemble algorithm is proposed. As a result, the discriminative Bayesian network parameter learning algorithm not only suppresses the influence of noise effectively but also avoids classifier overfitting, further extending the application of ensemble-trained discriminative Bayesian network classifiers to multi-class problems. Finally, experimental results and statistical hypothesis tests confirm that classifiers produced by this algorithm perform significantly better than those obtained by current ensemble-based Bayesian network parameter learning methods.

8.
Learning Binary Relations Using Weighted Majority Voting   (Cited: 2; self: 0; other: 2)
In this paper we demonstrate how weighted majority voting with multiplicative weight updating can be applied to obtain robust algorithms for learning binary relations. We first present an algorithm that obtains a nearly optimal mistake bound but at the expense of using exponential computation to make each prediction. However, the time complexity of our algorithm is significantly reduced from that of previously known algorithms that have comparable mistake bounds. The second algorithm we present is a polynomial time algorithm with a non-optimal mistake bound. Again, the mistake bound of our second algorithm is significantly better than previous bounds proven for polynomial time algorithms. A key contribution of our work is that we define a non-pure or noisy binary relation and then, by exploiting the robustness of weighted majority voting with respect to noise, we show that both of our algorithms can learn non-pure relations. These provide the first algorithms that can learn non-pure binary relations. The first author was supported in part by NSF grant CCR-91110108 and NSF National Young Investigator Grant CCR-9357707 with matching funds provided by Xerox Corporation, Palo Alto Research Center and WUTA. The second author was supported by ONR grant NO0014-91-J-1162 and NSF grant IRI-9123692.

9.
AdaBoost is a method for improving the classification accuracy of a given learning algorithm by combining hypotheses created by the learning algorithm. One of the drawbacks of AdaBoost is that its performance worsens when the training examples include noisy or exceptional examples, which are called hard examples. The phenomenon arises because AdaBoost assigns too high a weight to hard examples. In this research, we introduce thresholds into the weighting rule of AdaBoost in order to prevent weights from being assigned too high a value. During the learning process, we compare the upper bound of the classification error of our method with that of AdaBoost, and we set the thresholds such that the upper bound of our method is superior to that of AdaBoost. Our method shows better performance than AdaBoost.
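A rough sketch of the idea, under stated assumptions: binary AdaBoost in which example weights are clipped at a fixed cap after each update. The paper derives its thresholds from the error-bound comparison described above; the fixed cap w_max, the stump base learner, and all parameter values here are illustrative choices, not the authors' rule.

```python
# Binary AdaBoost with a cap on example weights, in the spirit of the
# thresholded weighting rule described above. Sketch only.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def adaboost_capped(X, y, n_rounds=20, w_max=0.05):
    n = len(y)                      # labels y must be in {-1, +1}
    w = np.full(n, 1.0 / n)
    stumps, alphas = [], []
    for _ in range(n_rounds):
        stump = DecisionTreeClassifier(max_depth=1).fit(X, y, sample_weight=w)
        pred = stump.predict(X)
        err = w[pred != y].sum()
        if err >= 0.5 or err == 0:
            break
        alpha = 0.5 * np.log((1 - err) / err)
        w *= np.exp(-alpha * y * pred)
        # Cap weights so hard examples cannot dominate (renormalizing may
        # push weights slightly above the cap; adequate for a sketch).
        w = np.minimum(w / w.sum(), w_max)
        w /= w.sum()
        stumps.append(stump)
        alphas.append(alpha)

    def predict(Xq):
        scores = sum(a * s.predict(Xq) for a, s in zip(alphas, stumps))
        return np.sign(scores)
    return predict
```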

10.
Many practical problems involve multi-class classification, a technique that effectively narrows the gap in understanding between users and computers. In traditional multi-class Boosting methods, the multi-class loss function is not necessarily guess-averse, and the combination of multi-class weak learners is restricted to a weighted linear sum. To obtain a highly accurate final classifier, the multi-class loss function should maximize the multi-class margin while being Bayes-consistent and guess-averse. Moreover, the shortcomings of weak learners may limit the performance of a linear classifier, whereas their nonlinear combination can provide stronger discriminative power. Based on these two observations, an adaptive multi-class Boosting classifier, SOHPBoost, is designed. At each iteration, SOHPBoost can integrate the optimal multi-class weak learner via vector addition or the Hadamard product, as sketched below. This adaptive process yields a sum of Hadamard-product vectors of multi-class weak learners, which in turn mines the hidden structure of the dataset. Experimental results show that SOHPBoost achieves better multi-class performance.
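The two combination operators just mentioned, applied to hypothetical weak-learner score vectors (the numbers are invented, and SOHPBoost's criterion for choosing between the operators is omitted):

```python
# Toy illustration of SOHPBoost's two per-round combination operators on
# multi-class score vectors. Values are hypothetical.
import numpy as np

h1 = np.array([0.2, -0.1, 0.7, 0.05])   # weak learner 1 scores (4 classes)
h2 = np.array([0.5, 0.3, -0.2, 0.1])    # weak learner 2 scores

additive = h1 + h2    # linear (vector-addition) combination
hadamard = h1 * h2    # Hadamard (elementwise-product) combination
print(additive, hadamard)
```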

11.
Boosting is an algorithm that has become popular in machine learning in recent years for improving learning accuracy. This paper uses AdaBoost as an example to introduce the basic principles of Boosting.

12.
Semi-supervised learning has attracted a significant amount of attention in pattern recognition and machine learning. Most previous studies have focused on designing special algorithms to effectively exploit unlabeled data in conjunction with labeled data. Our goal is to improve the classification accuracy of any given supervised learning algorithm by using the available unlabeled examples. We call this the semi-supervised improvement problem, to distinguish the proposed approach from existing approaches. We design a meta-semi-supervised learning algorithm that wraps around the underlying supervised algorithm and improves its performance using unlabeled data. This problem is particularly important when we need to train a supervised learning algorithm with a limited number of labeled examples and a multitude of unlabeled examples. We present a boosting framework for semi-supervised learning, termed SemiBoost. The key advantages of the proposed semi-supervised learning approach are: 1) performance improvement of any supervised learning algorithm given a multitude of unlabeled data, 2) efficient computation via the iterative boosting algorithm, and 3) exploitation of both the manifold and the cluster assumption in training classification models. An empirical study on 16 different data sets and text categorization demonstrates that the proposed framework improves the performance of several commonly used supervised learning algorithms, given a large number of unlabeled examples. We also show that the performance of the proposed algorithm, SemiBoost, is comparable to state-of-the-art semi-supervised learning algorithms.
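The wrapper idea can be sketched as follows. This is a simplified, self-training-style approximation of the "improve a given supervised learner with unlabeled data" setting; it is not SemiBoost itself, whose objective weights unlabeled points by pairwise similarity. The base learner is assumed to expose predict_proba, and all names and parameters are invented for the example.

```python
# Simplified "wrapper" sketch: iteratively pseudo-label the most confident
# unlabeled examples and retrain the base supervised learner. NOT the
# actual SemiBoost objective.
import numpy as np
from sklearn.base import clone

def semi_supervised_improve(base, X_l, y_l, X_u, rounds=5, per_round=10):
    """base: any sklearn classifier with predict_proba; y_l in {0..k-1}."""
    X_l, y_l, X_u = X_l.copy(), y_l.copy(), X_u.copy()
    for _ in range(rounds):
        if len(X_u) == 0:
            break
        model = clone(base).fit(X_l, y_l)
        proba = model.predict_proba(X_u)
        conf = proba.max(axis=1)
        take = np.argsort(conf)[-per_round:]   # most confident unlabeled points
        X_l = np.vstack([X_l, X_u[take]])
        y_l = np.concatenate([y_l, proba[take].argmax(axis=1)])
        X_u = np.delete(X_u, take, axis=0)
    return clone(base).fit(X_l, y_l)
```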

13.
A Transfer Learning Algorithm Combining a Semi-Supervised Boosting Method   (Cited: 1; self: 0; other: 1)
Transfer learning is a research direction in data mining that tries to reuse data samples from related domains, "transferring" knowledge from those domains to help train models in a new domain. Current instance-based transfer learning algorithms are prone to overfitting and cannot make full use of the useful data in related domains. To avoid this problem, this paper introduces unlabeled samples from the target domain into training and, using a semi-supervised Boosting method, proposes a new transfer learning algorithm that can …

14.
This paper proposes a voting algorithm that combines majority and median voting, which considerably alleviates the shortcomings of plain majority voting: majority voting is tried first; when no majority decision can be reached, the dispersion of the data is computed, and if it is below a preset safety threshold, median voting handles the case, while if it exceeds the threshold, a "no decision" signal is output. Simulation tests show that the combined majority/median voting system is feasible and raises the rate of correct outputs; although the error rate rises slightly, the cases in which plain majority voting produces no output are handled effectively, and correctness improves to some degree. The proposed voting algorithm suits applications that need a higher rate of correct outputs and can tolerate somewhat reduced safety; tests show good results for both continuous and discrete signals.
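A minimal sketch of the combined voter just described, with an invented safety threshold:

```python
# Combined majority/median voter: majority first; on a tie, fall back to
# the median only if the values are tightly clustered, otherwise signal
# that no vote can be produced. Threshold choice is an assumption.
from collections import Counter
from statistics import median, pstdev

NO_DECISION = object()

def majority_median_vote(values, safety_threshold=1.0):
    counts = Counter(values).most_common()
    if len(counts) == 1 or counts[0][1] > counts[1][1]:
        return counts[0][0]                   # clear majority winner
    if pstdev(values) < safety_threshold:     # tie, but values agree closely
        return median(values)
    return NO_DECISION                        # tie and widely dispersed

print(majority_median_vote([5.0, 5.0, 7.0]))  # majority -> 5.0
print(majority_median_vote([5.0, 5.1]))       # tie, low dispersion -> 5.05
```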

15.
Face Detection Based on a Boosting Method   (Cited: 3; self: 0; other: 3)
This paper proposes a face detection algorithm based on a Boosting method. An eigenface method is first used to construct a threshold function, based on the signal-to-noise ratio of the reconstructed image, for face detection. On this basis, the Boosting method is used to construct a sequence of SNR-threshold detection functions, which are then combined in a fixed way into an overall detection function that decides whether an image is a face image. Experimental results show that this method clearly improves detection performance.

16.
Intrusion Detection Based on a Boosting Algorithm   (Cited: 1; self: 1; other: 1)
An intrusion detection method based on a Boosting algorithm is proposed. A neural network is first used to obtain an initial intrusion detection function; on this basis, the Boosting method constructs a sequence of neural-network-based detection functions, which are then combined in a fixed way into a strengthened overall detection function used for intrusion detection. Experimental results show that this method clearly improves detection performance.

17.
To handle online processing of real-time data, an online regression algorithm based on Boosting is proposed. By defining a confidence interval for each learner's fitness, a real-time test for concept drift is established; the individual learners in the ensemble are updated iteratively, one by one, with the newest incoming data block, achieving an online learning effect. Simulation models built on standard benchmark datasets verify that this online regression algorithm reaches accuracy similar to offline Boosting regression while occupying fewer memory units, learning faster, and adjusting learner parameters promptly. The algorithm can also be introduced into industrial production for real-time monitoring of production data.

18.
A Consensus Clustering Algorithm Based on a Voting Mechanism   (Cited: 1; self: 0; other: 1)
Taking a one-pass clustering algorithm as the base method for partitioning data, this paper studies the cluster ensemble problem. The one-pass algorithm is applied repeatedly, with randomly chosen thresholds and input orders, to obtain different clustering results; these clusterings are mapped into a pattern association matrix, and a voting mechanism over this matrix yields the final partition of the data. The proposed cluster ensemble algorithm is evaluated on real and synthetic data sets and compared against related clustering algorithms; experimental results show that it is effective and feasible.
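A rough sketch of the consensus step, under stated assumptions: KMeans with a randomly drawn k stands in for the paper's one-pass clustering algorithm, co-clustering counts form the association matrix, and a majority vote over the matrix followed by connected components yields the final partition. Parameter values are illustrative.

```python
# Vote-based consensus clustering over a co-association matrix. Sketch only.
import numpy as np
from sklearn.cluster import KMeans

def consensus_clusters(X, runs=10, vote_ratio=0.5, seed=0):
    rng = np.random.default_rng(seed)
    n = len(X)
    co = np.zeros((n, n))
    for _ in range(runs):                      # vary k to vary the partitions
        k = int(rng.integers(2, 6))
        labels = KMeans(n_clusters=k, n_init=5,
                        random_state=int(rng.integers(1 << 30))).fit_predict(X)
        co += (labels[:, None] == labels[None, :])
    # Majority vote on the association matrix, then connected components.
    adj = co / runs >= vote_ratio
    final = -np.ones(n, dtype=int)
    cid = 0
    for i in range(n):
        if final[i] < 0:
            stack = [i]
            while stack:                        # flood-fill one component
                j = stack.pop()
                if final[j] < 0:
                    final[j] = cid
                    stack.extend(np.flatnonzero(adj[j] & (final < 0)))
            cid += 1
    return final
```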

19.
Boosting Algorithms for Parallel and Distributed Learning   (Cited: 1; self: 0; other: 1)
The growing amount of available information and its distributed and heterogeneous nature have a major impact on the field of data mining. In this paper, we propose a framework for parallel and distributed boosting algorithms intended for efficiently integrating specialized classifiers learned over very large, distributed and possibly heterogeneous databases that cannot fit into main computer memory. Boosting is a popular technique for constructing highly accurate classifier ensembles, where the classifiers are trained serially, with the weights on the training instances adaptively set according to the performance of previous classifiers. Our parallel boosting algorithm is designed for tightly coupled shared-memory systems with a small number of processors, with the objective of achieving maximal prediction accuracy in fewer iterations than boosting on a single processor. After all processors learn classifiers in parallel at each boosting round, they are combined according to the confidence of their predictions. Our distributed boosting algorithm is proposed primarily for learning from several disjoint data sites when the data cannot be merged together, although it can also be used for parallel learning where a massive data set is partitioned into several disjoint subsets for more efficient analysis. At each boosting round, the proposed method combines classifiers from all sites and creates a classifier ensemble on each site. The final classifier is constructed as an ensemble of all classifier ensembles built on the disjoint data sets. The proposed methods, applied to several data sets, have shown that parallel boosting can achieve the same or even better prediction accuracy considerably faster than standard sequential boosting. Results from the experiments also indicate that distributed boosting has comparable or slightly improved classification accuracy over standard boosting, while requiring much less memory and computational time since it uses smaller data sets.
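The per-round combination step, sketched under assumptions: each processor's classifier contributes its class-probability output weighted by its own per-example confidence. This is a simplified reading of "combined according to the confidence of their predictions", not the paper's exact scheme; the probability arrays are invented.

```python
# Confidence-weighted combination of classifiers trained in parallel.
# Simplified illustration of the per-round merge step.
import numpy as np

def combine_by_confidence(prob_outputs):
    """prob_outputs: list of (n_samples, n_classes) probability arrays,
    one per processor/site."""
    combined = np.zeros_like(prob_outputs[0])
    for P in prob_outputs:
        conf = P.max(axis=1, keepdims=True)   # per-example confidence
        combined += conf * P                  # confident classifiers weigh more
    return combined.argmax(axis=1)

# Hypothetical outputs from two processors on three examples, two classes:
p1 = np.array([[0.9, 0.1], [0.6, 0.4], [0.2, 0.8]])
p2 = np.array([[0.7, 0.3], [0.3, 0.7], [0.4, 0.6]])
print(combine_by_confidence([p1, p2]))
```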

20.
Mannor, Shie; Meir, Ron. Machine Learning, 2002, 48(1-3): 219-251
We consider the existence of a linear weak learner for boosting algorithms. A weak learner for binary classification problems is required to achieve a weighted empirical error on the training set which is bounded from above by 1/2 − γ, γ > 0, for any distribution on the data set. Moreover, in order that the weak learner be useful in terms of generalization, γ must be sufficiently far from zero. While the existence of weak learners is essential to the success of boosting algorithms, a proof of their existence based on a geometric point of view has been hitherto lacking. In this work we show that under certain natural conditions on the data set, a linear classifier is indeed a weak learner. Our results can be directly applied to generalization error bounds for boosting, leading to closed-form bounds. We also provide a procedure for dynamically determining the number of boosting iterations required to achieve low generalization error. The bounds established in this work are based on the theory of geometric discrepancy.
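In symbols, the weak-learning condition stated above (using D for a distribution over the m training examples and h for the weak hypothesis):

```latex
% Weak-learning condition: for every distribution D over the m training
% examples, the weighted empirical error of the hypothesis h must stay a
% margin gamma below one half.
\[
  \operatorname{err}_D(h)
    \;=\; \sum_{i=1}^{m} D(i)\,\mathbb{1}\{h(x_i) \neq y_i\}
    \;\le\; \tfrac{1}{2} - \gamma,
  \qquad \gamma > 0.
\]
```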
