Similar Documents
19 similar documents found.
1.
To address multi-label classification with insufficient labeled data, a new semi-supervised Boosting algorithm is proposed: a semi-supervised Boosting framework for multi-label classification is derived via functional gradient descent, and the conditional entropy of the unlabeled data is introduced into the classification model as a regularization term. Experimental results show that, for multi-label classification, the performance of the new semi-supervised Boosting algorithm improves markedly as the amount of unlabeled data grows, and it outperforms the traditional supervised Boosting algorithm in all respects.
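As a rough illustration of the idea above (not the paper's exact algorithm), the sketch below runs functional-gradient boosting on a logistic loss over labeled examples plus a conditional-entropy penalty on unlabeled predictions. The stump learner, learning rate, and the name `semi_supervised_boost` are illustrative assumptions.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def semi_supervised_boost(X_lab, y, X_unl, rounds=50, lam=0.1, lr=0.1):
    """Functional-gradient boosting: logistic loss on labeled data (y in {-1, +1})
    plus lam * conditional entropy of the unlabeled predictions."""
    F_lab = np.zeros(len(X_lab))
    F_unl = np.zeros(len(X_unl))
    ensemble = []
    for _ in range(rounds):
        # negative gradient of the labeled logistic loss log(1 + exp(-y F))
        g_lab = y * sigmoid(-y * F_lab)
        # negative gradient of the unlabeled entropy term (pushes predictions
        # toward confident values, i.e., lowers conditional entropy)
        p = sigmoid(F_unl)
        g_unl = lam * F_unl * p * (1.0 - p)
        # fit one weak learner to the pooled pseudo-residuals
        h = DecisionTreeRegressor(max_depth=1)
        h.fit(np.vstack([X_lab, X_unl]), np.concatenate([g_lab, g_unl]))
        F_lab += lr * h.predict(X_lab)
        F_unl += lr * h.predict(X_unl)
        ensemble.append(h)
    return ensemble
```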

2.
曹莹  苗启广  刘家辰  高琳 《软件学报》2013,24(11):2584-2596
AdaBoost is an important ensemble-learning meta-algorithm, and its core property, "Boosting", is also an effective way to tackle cost-sensitive learning. However, existing cost-sensitive Boosting algorithms such as AdaCost, the AdaC family, and the CSB family rely on heuristics: they insert cost parameters into AdaBoost's weighted-voting coefficient or its weight-update rule to force the algorithm to focus on high-cost examples. These heuristics lack theoretical justification, and the modifications break the Boosting property that is AdaBoost's most important feature. Whereas AdaBoost converges to the Bayes decision rule, these cost-sensitive Boosting variants do not converge to the cost-sensitive Bayes decision rule. To address this, cost-sensitive Boosting algorithms that strictly follow the Boosting theoretical framework are studied. First, the exponential loss and the logit loss of the classification margin are given cost-sensitive modifications, and the new losses are proved to be Fisher-consistent in the cost-sensitive sense, so that in the ideal case optimizing them converges to the cost-sensitive Bayes decision rule. Second, optimizing the new losses with functional gradient descent in the Boosting framework yields the algorithms AsyB and AsyBL. Experiments on two-dimensional Gaussian synthetic data show that, compared with existing cost-sensitive Boosting algorithms, AsyB and AsyBL effectively approximate the cost-sensitive Bayes decision rule; tests on UCI data sets further confirm that AsyB and AsyBL produce cost-sensitive classifiers with lower misclassification cost, and that the misclassification cost decreases exponentially with the number of iterations.
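The AsyB/AsyBL updates themselves are not reproduced here; as a hedged sketch of the general recipe (functional gradient descent on a cost-weighted margin loss), the following minimal example boosts regression stumps against the negative gradient of a cost-weighted exponential loss. The loss form and all names are illustrative assumptions, not the paper's definitions.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def cost_sensitive_exp_boost(X, y, cost_pos, cost_neg, rounds=100, lr=0.1):
    """Functional gradient descent on a cost-weighted exponential loss
    sum_i c(y_i) * exp(-y_i * F(x_i)), with y in {-1, +1}."""
    c = np.where(y > 0, cost_pos, cost_neg)      # per-class misclassification cost
    F = np.zeros(len(X))
    ensemble = []
    for _ in range(rounds):
        residual = c * y * np.exp(-y * F)        # negative functional gradient
        h = DecisionTreeRegressor(max_depth=2).fit(X, residual)
        F += lr * h.predict(X)
        ensemble.append(h)
    return ensemble
```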

3.
This paper studies consensus design for multi-agent systems under communication delay, where each agent is modeled by a single-input single-output second-order strictly proper transfer function. For multi-agent systems whose communication topology contains a directed spanning tree, a class of output-based second-order consensus protocols is proposed. First, in the delay-free case, parameter conditions that guarantee consensus are derived. Then, under those parameter conditions, an upper bound on the maximum allowable delay of the closed-loop multi-agent system is obtained. Finally, numerical simulations verify the effectiveness of the theoretical results. Compared with existing work, an output-feedback consensus protocol is designed for single-input single-output strictly proper multi-agent systems, of which the double-integrator multi-agent system is a special case.
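The abstract notes that double-integrator agents are a special case, which suggests a simple numerical check. The sketch below simulates delay-free consensus for double integrators under relative position and velocity feedback over a small directed ring; the gains and graph are illustrative assumptions and are not the protocol or the conditions derived in the paper.

```python
import numpy as np

def double_integrator_consensus(x0, v0, A, kp=1.0, kv=2.0, dt=0.01, steps=5000):
    """Delay-free double-integrator consensus:
    u_i = kp * sum_j a_ij (x_j - x_i) + kv * sum_j a_ij (v_j - v_i).
    A is the adjacency matrix of a graph containing a directed spanning tree."""
    L = np.diag(A.sum(axis=1)) - A          # graph Laplacian
    x, v = np.array(x0, float), np.array(v0, float)
    for _ in range(steps):
        u = -kp * L @ x - kv * L @ v        # relative-state feedback input
        x, v = x + dt * v, v + dt * u       # forward-Euler integration
    return x, v

# Example: a 3-agent directed ring (contains a spanning tree); the gains above
# happen to satisfy the standard convergence condition for this small graph.
A = np.array([[0, 1, 0], [0, 0, 1], [1, 0, 0]], float)
print(double_integrator_consensus([0.0, 1.0, 3.0], [0.0, -1.0, 0.5], A))
```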

4.
王迪  王萍  石君志 《控制与决策》2019,34(3):555-560
Conformal classifiers are classifiers built on conformal prediction; their outputs are highly reliable, but the computational framework makes learning slow. To speed up learning, conformal prediction is combined with the multi-output extreme learning machine (ELM) for the first time, yielding a fast conformal classification algorithm. The algorithm exploits the ELM's ability to compute leave-one-out estimates of sample labels quickly, which greatly accelerates learning. Complexity analysis shows that the computational complexity of the proposed algorithm equals that of the multi-output ELM, and the algorithm inherits the reliability property of conformal prediction: the prediction error rate is controlled by the significance-level parameter. Comparative experiments on 10 public data sets show that the proposed algorithm is extremely fast and, compared with other commonly used conformal classifiers, produces fewer predicted labels on average on some data sets, i.e., more efficient predictions.
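The ELM-specific leave-one-out shortcut is not shown here; the following generic split-conformal classification sketch only illustrates the reliability mechanism mentioned above (prediction sets whose error rate is controlled by the significance level ε). All function and variable names are illustrative.

```python
import numpy as np

def conformal_prediction_sets(scores_cal, y_cal, scores_test, labels, eps=0.05):
    """Generic split-conformal classifier (not the paper's ELM-based method).
    scores_cal[i, k]: model score for calibration example i and label k (labels are
    integers 0..K-1); nonconformity = 1 - score of the candidate label."""
    alpha_cal = 1.0 - scores_cal[np.arange(len(y_cal)), y_cal]   # calibration nonconformity
    pred_sets = []
    for s in scores_test:
        keep = []
        for k in labels:
            alpha_k = 1.0 - s[k]
            # p-value: fraction of calibration examples at least as nonconforming
            p = (np.sum(alpha_cal >= alpha_k) + 1) / (len(alpha_cal) + 1)
            if p > eps:                      # keep labels not rejected at level eps
                keep.append(k)
        pred_sets.append(keep)
    return pred_sets
```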

5.
Research Progress and Prospects of the AdaBoost Algorithm   Cited by: 21 (self-citations: 0, by others: 21)
AdaBoost is one of the best Boosting algorithms; it has a solid theoretical foundation and has been widely adopted in practice. The algorithm boosts weak classifiers that are only slightly better than random guessing into strong classifiers with high accuracy, providing new ideas and methods for the design of learning algorithms. This survey first reviews how the Boosting conjecture was posed and then proved, and from there traces the origin and original design ideas of AdaBoost. It then presents the analysis of AdaBoost's training error and generalization error, explaining why the algorithm improves learning accuracy; next, it analyzes the different theoretical models of AdaBoost and the variant algorithms derived from them; after that, it covers the extension of AdaBoost from binary to multi-class classification, together with applications of AdaBoost and its variants to practical problems. Centered on AdaBoost and its variants, the survey introduces Boosting theory, which occupies an important place in ensemble learning, discusses how research on Boosting theory has developed and where it is heading, and offers useful leads for researchers. Finally, directions for future work are outlined: deriving tighter generalization error bounds, weak-classifier conditions for multi-class problems, loss functions better suited to multi-class problems, more precise stopping criteria for the iterations, improving robustness to noise, and optimizing AdaBoost from the perspective of the diversity of its component classifiers all deserve further study.
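For reference, a minimal sketch of the classic binary AdaBoost loop that the survey discusses, with decision stumps as weak learners and labels in {-1, +1}:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def adaboost(X, y, rounds=50):
    """Classic binary AdaBoost with decision stumps (y in {-1, +1})."""
    n = len(X)
    w = np.full(n, 1.0 / n)                       # example weights
    stumps, alphas = [], []
    for _ in range(rounds):
        stump = DecisionTreeClassifier(max_depth=1).fit(X, y, sample_weight=w)
        pred = stump.predict(X)
        err = np.sum(w * (pred != y)) / np.sum(w)
        if err >= 0.5:                            # weak-learning condition violated
            break
        alpha = 0.5 * np.log((1 - err) / max(err, 1e-12))
        w *= np.exp(-alpha * y * pred)            # up-weight misclassified examples
        w /= w.sum()
        stumps.append(stump)
        alphas.append(alpha)
    return stumps, alphas

def adaboost_predict(stumps, alphas, X):
    return np.sign(sum(a * s.predict(X) for s, a in zip(stumps, alphas)))
```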

6.
Many practical problems involve multi-class classification, a technique that effectively narrows the gap in understanding between users and computers. In traditional multi-class Boosting methods, the multi-class loss function is not necessarily guess-averse, and the combination of multi-class weak learners is restricted to a linear weighted sum. To obtain a highly accurate final classifier, the multi-class loss should maximize the multi-class margin and be Bayes-consistent and guess-averse. Moreover, the weaknesses of individual weak learners may limit the performance of a linear classifier, whereas their nonlinear combination can provide stronger discriminative power. Based on these two observations, an adaptive multi-class Boosting classifier, SOHPBoost, is designed. At each iteration, SOHPBoost integrates the optimal multi-class weak learner via either vector addition or the Hadamard product. This adaptive process yields a sum of Hadamard-product vectors of multi-class weak learners and thereby uncovers hidden structure in the data set. Experimental results show that SOHPBoost achieves good multi-class performance.

7.
In recent years, ensemble learning (EL) classification methods have become a research hotspot in land-cover classification; Boosting ensembles in particular offer high accuracy and strong generalization and have seen notable use in land-cover classification. However, Boosting ensembles are very sensitive to noise: if the training samples are noisy, the Boosting algorithm may fail, which is a limitation of the approach. To address these problems in land-cover classification, effectively suppress the influence of noise, reduce the "salt-and-pepper" effect in classification results, and improve accuracy, a Boosting ensemble classification method based on dual-tree complex wavelet decomposition is proposed. The method applies a one-level dual-tree complex wavelet decomposition to the spectral bands of the image to reduce noise, and feeds the decomposed bands into the Boosting ensemble to obtain the final classification. Experiments compared three Boosting ensemble algorithms, GBDT, XGBoost, and LightGBM, on SPOT 6 and Sentinel-2A imagery. The results show that: (1) on the SPOT 6 image, all three Boosting ensembles achieve overall accuracy above 90%; DTCWT-LightGBM achieves the highest overall accuracy of 94.73% with a Kappa coefficient of 0.93, 1.1% higher than LightGBM...

8.
Strictly online forms of Boosting with the exponential loss and the 0-1 loss are derived, and both online Boosting algorithms are shown to maximize the expected sample margin while minimizing the margin variance. By incrementally estimating the mean and variance of the sample margin, Boosting can be applied to online learning problems without losing classification accuracy. Experiments on UCI data sets show that the exponential-loss online Boosting algorithm attains accuracy close to batch adaptive Boosting (AdaBoost) and far better than traditional online Boosting; the 0-1-loss online Boosting algorithm minimizes the errors on positive and negative samples separately, suits imbalanced-data problems, and classifies noisy data more stably.
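The incremental estimation of the margin mean and variance mentioned above can be illustrated with a standard Welford-style running estimator; this is a generic sketch, not the derivation in the paper.

```python
class RunningMargin:
    """Welford-style incremental estimate of the mean and variance of the
    sample margin y * F(x), as needed when boosting must run online."""

    def __init__(self):
        self.n, self.mean, self.m2 = 0, 0.0, 0.0

    def update(self, y, score):
        margin = y * score            # y in {-1, +1}, score = ensemble output F(x)
        self.n += 1
        delta = margin - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (margin - self.mean)

    @property
    def variance(self):
        return self.m2 / (self.n - 1) if self.n > 1 else 0.0
```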

9.
Building a model that both fully exploits the discriminative power of the target's appearance representation and maintains the temporal consistency of features during subsequent tracking is key to solving the tracking problem. To improve the discriminative power of the feature representation and to counter the degradation of features over time during tracking, this paper proposes a tracking method based on temporally consistent sparse deep representations. First, exploiting the fact that features from different convolutional layers have different properties, a multi-task sparse deep representation learning method is constructed to fully mine the correlations among multi-source information. Second, a temporal-consistency regularization term is built from the residuals of related frames to compensate for feature degradation during tracking and improve the temporal consistency of the tracker's features. Tracking results on a large number of test videos show that, compared with current mainstream algorithms, the proposed method tracks better and more stably under complex backgrounds, fast motion, and similar conditions.

10.
In traditional classifier ensembles, each iteration typically integrates only the single best individual classifier into the strong classifier, while other potentially helpful individual classifiers are simply discarded. To address this, MKL-Boost, a non-sparse multiple kernel learning method within the Boosting framework, is proposed. Drawing on the idea of ensemble learning, each iteration first selects a training subset from the training set and then trains the optimal individual classifier with a regularized non-sparse multiple kernel learning method. The resulting individual classifier considers the optimal non-sparse linear convex combination of M base kernels; by imposing an Lp-norm constraint on the kernel combination coefficients, good kernels are retained so that more useful feature information is preserved, poor kernels are removed, and selective kernel fusion is guaranteed. The optimal individual classifier based on the kernel combination is then integrated into the strong classifier. The proposed algorithm combines the advantages of Boosting ensemble learning with those of regularized non-sparse multiple kernel learning. Experiments show that, compared with other Boosting algorithms, MKL-Boost reaches higher classification accuracy within fewer iterations.
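A toy sketch of the non-sparse kernel-fusion step: M base kernel matrices are mixed with weights constrained to the unit Lp ball, so that useful kernels are kept and none is forced all the way to zero. The weights below are a heuristic stand-in (e.g., per-kernel validation accuracy) rather than the solution of the regularized MKL problem the paper actually optimizes.

```python
import numpy as np

def lp_kernel_combination(kernels, weights, p=2.0):
    """Combine M base kernel matrices with non-negative weights normalized on the
    Lp ball (||beta||_p = 1), a non-sparse kernel fusion sketch."""
    beta = np.maximum(np.asarray(weights, float), 0.0)
    beta /= np.linalg.norm(beta, ord=p)          # Lp normalization keeps the mix non-sparse
    return sum(b * K for b, K in zip(beta, kernels))
```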

11.
MongoDB is one of the first commercial distributed databases that support causal consistency. Its implementation of causal consistency combines several research ideas for achieving scalability, fault tolerance, and security. Given its inherent complexity, a natural question arises: "Has MongoDB correctly implemented causal consistency as it claimed?" To address this concern, the Jepsen team has conducted black-box testing of MongoDB. However, this Jepsen testing has several drawbacks in terms of specification, test case generation, implementation of causal consistency checking algorithms, and testing scenarios, which undermine the credibility of its reports. In this work, we propose a more thorough design of Jepsen testing of causal consistency of MongoDB. Specifically, we fully implement the causal consistency checking algorithms proposed by Bouajjani et al. and test MongoDB against three well-known variants of causal consistency, namely CC, CCv, and CM, under various scenarios including node failures, data movement, and network partitions. In addition, we develop formal specifications of causal consistency and their checking algorithms in TLA+, and verify them using the TLC model checker. We also explain how the TLA+ specification can be related to Jepsen testing.

12.
A Constraint Satisfaction Problem Solving Algorithm Based on Preprocessing Techniques   Cited by: 4 (self-citations: 0, by others: 4)
Consistency techniques are an effective tool for solving constraint satisfaction problems and play an extremely important role both in preprocessing before search and during search itself. This paper improves the consistency techniques of the preprocessing stage and extracts information from them, proposing two new algorithms for use during search, Pre-AC and Pre-AC*, which are embedded in the backtracking (BT) framework to form the new search algorithms BT MPAC and BT MPAC*, together with correctness proofs. Complexity analysis shows that the time complexities of Pre-AC and Pre-AC* are O(nd) and O(ed²) respectively, clearly lower than the O(ed³) of the currently most popular arc consistency technique. Experimental results show that, across different classes of instances, the new algorithms run 2 to 50 times faster than the arc consistency maintenance algorithm.
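Pre-AC and Pre-AC* themselves are not reproduced here; for comparison, the sketch below is the classical AC-3 arc consistency procedure, representative of the O(ed³)-class technique the paper improves on. Data-structure choices and names are illustrative.

```python
from collections import deque

def ac3(domains, constraints):
    """Classical AC-3 arc consistency (shown for reference; not the paper's Pre-AC).
    domains: {var: set(values)}; constraints: {(x, y): predicate(vx, vy)} over directed arcs."""
    queue = deque(constraints.keys())
    while queue:
        x, y = queue.popleft()
        pred = constraints[(x, y)]
        # remove values of x that have no support in the domain of y
        removed = {vx for vx in domains[x]
                   if not any(pred(vx, vy) for vy in domains[y])}
        if removed:
            domains[x] -= removed
            if not domains[x]:
                return False                      # domain wipe-out: inconsistent
            # re-examine arcs pointing at x
            queue.extend((z, w) for (z, w) in constraints if w == x and z != y)
    return True
```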

13.
Filtering algorithms are well accepted as a means of speeding up the solution of the consistent labeling problem (CLP). Despite the fact that path consistency does a better job of filtering than arc consistency, AC is still the preferred technique because it has a much lower time complexity. We are implementing parallel path consistency algorithms on multiprocessors and comparing their performance to the best sequential and parallel arc consistency algorithms.(1,2) (See also work by Keretho et al.(3) and Kasif.(4)) Preliminary work has shown linear performance increases for parallelized path consistency and also shown that in many cases performance is significantly better than the theoretical worst case. These two results lead us to believe that parallel path consistency may be a superior filtering technique. Finally, we have implemented path consistency as an outer product computation and have obtained good results (e.g., linear speedup on a 64K-node Connection Machine 2).

14.
Mackworth and Freuder have analyzed the time complexity of several constraint satisfaction algorithms.(1) Mohr and Henderson have given new algorithms, AC-4 and PC-3, for arc and path consistency, respectively, and have shown that the arc consistency algorithm is optimal in time complexity and of the same order space complexity as the earlier algorithms.(2) In this paper, we give parallel algorithms for solving node and arc consistency. We show that any parallel algorithm for enforcing arc consistency in the worst case must have O(na) sequential steps, where n is the number of nodes and a is the number of labels per node. We give several parallel algorithms to do arc consistency, and it is also shown that they all have optimal time complexity. The results of running the parallel algorithms on a BBN Butterfly multiprocessor are also presented. This work was partially supported by NSF Grants MCS-8221750, DCR-8506393, and DMC-8502115.

15.
International Journal of Computer Mathematics, 2012, 89(7): 1321-1333
In this study, we investigate the consistency of half supervised coefficient regularization learning with indefinite kernels. In our setting, the hypothesis space and learning algorithms are based on two different groups of input data which are drawn i.i.d. according to an unknown probability measure ρ_X. The only conditions imposed on the kernel function are continuity and boundedness, rather than requiring a Mercer kernel, and the output data are not required to be bounded uniformly. By a mild assumption of unbounded output data and a refined integral operator technique, the generalization error is decomposed into hypothesis error, sample error and approximation error. By estimating these three parts, we deduce satisfactory learning rates with a proper choice of the regularization parameter.

16.
General conditions for the strong consistency of estimates made by the method of orthogonal projections are discussed, referring to the problems of identification of regression equations with errors in the independent variables. Computational aspects of the application of different numerical algorithms and their regularization problems are studied in order to find estimates by the method of orthogonal projections.

17.
We extend algorithms for local arc consistency proposed in the literature in order to deal with (absorptive) semirings that may not be invertible. As a consequence, these consistency algorithms can be used as a pre-processing procedure in soft Constraint Satisfaction Problems (CSPs) defined over a larger class of semirings, such as those obtained from the Cartesian product of two (or more) semirings. One important instance of this class of semirings is adopted for multi-objective CSPs. First, we show how a semiring can be transformed into a novel one where the + operator is instantiated with the least common divisor (LCD) between the elements of the original semiring. The LCD value corresponds to the amount we can "safely move" from the binary constraint to the unary one in the arc consistency algorithm. We then propose a local arc consistency algorithm which takes advantage of this LCD operator.

18.
Consistency Algorithms for Multi-Source Warehouse View Maintenance   Cited by: 1 (self-citations: 0, by others: 1)
A warehouse is a data repository containing integrated information for efficient querying and analysis. Maintaining the consistency of warehouse data is challenging, especially if the data sources are autonomous and views of the data at the warehouse span multiple sources. Transactions containing multiple updates at one or more sources, e.g., batch updates, complicate the consistency problem. In this paper we identify and discuss three fundamental transaction processing scenarios for data warehousing. We define four levels of consistency for warehouse data and present a new family of algorithms, the Strobe family, that maintain consistency as the warehouse is updated, under the various warehousing scenarios. All of the algorithms are incremental and can handle a continuous and overlapping stream of updates from the sources. Our implementation shows that the algorithms are practical and realistic choices for a wide variety of update scenarios.

19.
Consistency-based feature selection is an important category of feature selection research, and its advantage over other categories is due to consistency measures used to include the effect of interaction among features into evaluation of relevance of features. Even if features individually appear irrelevant to class labels, they can collectively show strong relevance. In such cases, we say that the features interact with each other. Consistency measures, in this regard, evaluate the collective relevance of a set of features and have been intuitively understood as a metric measuring the distance of an arbitrary feature set from the state of being consistent: a set of features is said to be consistent if, and only if, they as a whole determine class labels. Historically, the binary consistency measure, which returns the value 1 if the feature set is consistent and 0 otherwise, was the first consistency measure introduced, and many advanced measures followed. The problem with the binary measure is that it always returns 1 if a data set includes no consistent feature set. The measures that followed solved this problem but sacrificed time efficiency of evaluation; feature selection leveraging these measures is therefore not fast enough to apply to large data sets. In this article, we aim to improve the time efficiency of consistency-based feature selection. To achieve the goal, we propose a new idea, which we call data set denoising: we eliminate examples viewed as noise from a data set until the data set comes to include consistent feature sets, and then apply the binary measure to find an appropriate feature set that is consistent. In our evaluation through intensive experiments, CWC, a new algorithm that implements data set denoising, outperformed the benchmark consistency-based algorithms in both time efficiency and accuracy. Specifically, CWC was about 31 times faster than LCC, previously known as the fastest in the literature. Furthermore, in a comparison including feature selection algorithms that are not consistency-based, CWC turned out to be one of the fastest and most accurate feature selection algorithms.
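A minimal sketch of the binary consistency measure described above: a feature subset is consistent if and only if no two examples agree on every selected feature yet carry different class labels. The CWC denoising step itself is not shown, and all names are illustrative.

```python
import numpy as np

def is_consistent(X, y, feature_subset):
    """Binary consistency measure: returns True iff no two examples share the same
    values on the selected features but have different class labels."""
    seen = {}
    for row, label in zip(X[:, feature_subset], y):
        key = tuple(row)
        if key in seen and seen[key] != label:
            return False            # same projected values, different labels
        seen[key] = label
    return True

# Illustrative usage on a tiny data set: feature 0 alone is inconsistent,
# features {0, 1} together determine the labels.
X = np.array([[0, 0], [0, 1], [1, 0]])
y = np.array([0, 1, 0])
print(is_consistent(X, y, [0]), is_consistent(X, y, [0, 1]))
```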
