Similar literature
 19 similar documents found
1.

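In real-world applications, data often accumulate in the form of streams, and the features of the data may evolve over time. For example, in environmental monitoring tasks, data features may dynamically vanish or emerge as old sensors reach the end of their service life and new sensors are deployed. Moreover, besides the evolvable feature space, the data labels may be noisy. When feature-space evolution and label noise occur simultaneously, designing learning algorithms with theoretical guarantees, and in particular understanding their generalization ability, is very challenging. To address this challenge, a discrepancy measure for noisily labeled data in feature-evolving environments is proposed, called the label-noise-tolerant evolving discrepancy. This discrepancy measure inspires a generalization error analysis, and a learning algorithm implemented with deep neural networks is designed according to that theoretical analysis. Empirical studies on synthetic data verify the rationality of the proposed discrepancy measure, and experiments on real-world application tasks verify the effectiveness of the proposed algorithm.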

2.
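刘艳芳  李文斌  高阳 《软件学报》 (Journal of Software) 2022,33(4):1315-1325
Compared with traditional online learning, which studies a fixed feature space, feature evolvable learning usually assumes that features do not vanish or appear in an arbitrary way; instead, old features vanish and new features emerge as the hardware devices collecting the data features are replaced. However, existing feature evolvable learning methods use only the first-order information of the data stream and ignore the second-order information, which can reveal correlations between features and significantly improve classification performance. A confidence-weighted learning algorithm for feature evolution is proposed to solve this problem: first, it introduces...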

3.
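徐海龙 《控制与决策》 (Control and Decision) 2010,25(2):282-286
To address the difficulty of obtaining large numbers of class-labeled samples during SVM training, an active SVM incremental training algorithm based on distance-ratio uncertainty sampling (DRB-ASVM) is proposed and applied to incremental SVM training. Experimental results show that, without loss of classification accuracy, an SVM using the active learning strategy selects far fewer samples for labeling than random selection does, which reduces the labeling workload or cost and speeds up training.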

4.
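Learning from evolving data   Total citations: 1, self-citations: 0, citations by others: 1
In some practical problems, the data distribution changes gradually over time; learning from such data is called learning from evolving data. This paper surveys research progress on learning from evolving data and raises several issues that need future attention, such as the mechanism of data evolution, general assumptions, and the classification of evolving data.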

5.
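An ensemble-based active learning algorithm for imbalanced data classification   Total citations: 1, self-citations: 0, citations by others: 1
At present, one of the main approaches to handling class-imbalanced data is preprocessing: the data are balanced and then trained on with conventional methods. The main preprocessing methods are over-sampling and under-sampling, each of which has its own shortcomings. A Split-Boost Active Learning (SBAL) algorithm is proposed. It splits the majority-class sample set into several subsets according to the imbalance ratio, merges each subset with the minority-class sample set, trains a sub-classifier on each merged set with AdaBoost, and then combines the sub-classifiers into an overall classifier. It then actively selects informative samples for training based on the Query-by-Committee (QBC) active learning algorithm, which largely avoids the shortcomings caused by adding or removing samples. Experiments show that the proposed algorithm achieves higher classification accuracy on imbalanced data.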
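The splitting step of SBAL described in this abstract can be illustrated with a minimal sketch. It assumes the number of subsets is the rounded majority-to-minority ratio and that majority samples are dealt out round-robin; the abstract does not specify either detail, and the function name `split_and_merge` is hypothetical. The AdaBoost training and QBC selection stages are not shown.

```python
def split_and_merge(majority, minority):
    """Split the majority class into roughly minority-sized subsets and
    merge each subset with the full minority class.

    Assumption (not from the source): the subset count is the rounded
    imbalance ratio, and samples are assigned round-robin.
    """
    if not minority:
        raise ValueError("minority class must contain at least one sample")
    n_subsets = max(1, round(len(majority) / len(minority)))
    # Slice with a stride so every majority sample lands in exactly one subset.
    subsets = [majority[i::n_subsets] for i in range(n_subsets)]
    # Each balanced training set would then be given to one sub-classifier.
    return [subset + minority for subset in subsets]
```

Each returned list is a roughly balanced training set, so a sub-classifier trained on it sees all minority samples without any majority sample being discarded or duplicated.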

6.
7.
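陈锦禾  沈洁 《微机发展》 (Microcomputer Development) 2010,(2):110-113
To address the problem that a small training set is insufficient for a learner to classify an unlabeled sample set containing many latent uncertainties, an active learning method based on information entropy is proposed. It introduces the discrete-event probability estimation theory of information entropy. By computing the entropy of unlabeled documents and combining a two-stage learning strategy, active learning uses existing knowledge together with the experimental sample environment to actively select the samples most likely to resolve the problem and label their classes, obtain new parameters, retrain the classifier, and select the samples most beneficial to classifier performance, iterating until the unlabeled sample set is empty. Experimental results show that the method achieves good classification performance.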
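As a rough illustration of entropy-driven sample selection, the sketch below ranks unlabeled samples by the Shannon entropy of their predicted class probabilities. It assumes that the highest-entropy samples are the ones worth labeling first, which is a common reading of entropy-based active learning but is not spelled out in the abstract; the function names and the dictionary layout are hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy (in bits) of a discrete class distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def select_most_uncertain(posteriors, k=1):
    """Return the ids of the k samples with the highest predictive entropy.

    posteriors maps a sample id to the classifier's class probabilities.
    Assumption (not from the source): higher entropy means more informative.
    """
    ranked = sorted(posteriors, key=lambda i: entropy(posteriors[i]), reverse=True)
    return ranked[:k]
```

In an active learning loop, the selected samples would be labeled, added to the training set, and the classifier retrained before the next round of selection.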

8.
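Research on active-learning semi-supervised classification based on information entropy   Total citations: 1, self-citations: 2, citations by others: 1
To address the problem that a small training set is insufficient for a learner to classify an unlabeled sample set containing many latent uncertainties, an active learning method based on information entropy is proposed. It introduces the discrete-event probability estimation theory of information entropy. By computing the entropy of unlabeled documents and combining a two-stage learning strategy, active learning uses existing knowledge together with the experimental sample environment to actively select the samples most likely to resolve the problem and label their classes, obtain new parameters, retrain the classifier, and select the samples most beneficial to classifier performance, iterating until the unlabeled sample set is empty. Experimental results show that the method achieves good classification performance.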

9.
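Outlier detection tasks usually lack usable labeled data, and outliers make up only a small fraction of the whole data set. Compared with other data mining tasks, outlier detection is harder, and no single algorithm suits all scenarios. Therefore, combining diverse model ensembles with the idea of active learning, an active-learning-based ensemble outlier detection method, OMAL (Outlier Mining based on Active Learning), is proposed. Under the guidance of the active learning framework, and based on a comparative analysis of various base learners, three unsupervised models (statistics-based, similarity-based, and subspace-partition-based) are chosen as base learners. The data that the base learners judge to lie on the boundary between outlier and normal are merged and presented to human experts for labeling, so as to maximize the information gained from expert feedback. Samples are then drawn from the labeled data set and from the data set produced by base-learner voting, a supervised binary classification model is trained on them with GBM (Gradient Boosting Machine), and this model is applied to the full data set to obtain the final mining result. Experiments show that the proposed method markedly improves AUC, runs efficiently, and has good practical value.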

10.
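张雁  吴保国  吕丹桔  林英 《计算机工程》 (Computer Engineering) 2014,(6):215-218,229
Semi-supervised learning and active learning are both effective methods that exploit unlabeled data to improve the recognition performance of supervised learning at the cost of only a small amount of labeled data. To this end, the active learning method is combined with the Tri-training algorithm from semi-supervised learning to propose a new classification algorithm, in which samples for active learning are selected by an entropy-priority sampling algorithm. Experiments on UCI data sets and remote sensing data under different proportions of labeled training samples show that the algorithm performs well when few labeled samples are available. Combining active learning with Tri-training is an effective way to improve classification performance and generalization.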

11.
Several recent works have studied feature evolvable learning. They usually assume that features would not vanish or appear in an arbitrary way; instead, old features vanish and new features emerge as the hardware device collecting the data features is replaced. However, the existing learning algorithms for feature evolution only utilize the first-order information of data streams and ignore the second-order information which can reveal the correlations between features and thus significantly improve the classification performance. We propose a Confidence-Weighted learning for Feature Evolution (CWFE) algorithm to solve the aforementioned problem. First, second-order confidence-weighted learning is introduced to update the prediction model. Next, to make full use of the learned model, a linear mapping is learned in the overlapping period to recover the old features. Then, the existing model is updated with the recovered old features and, at the same time, a new prediction model is learned with the new features. Furthermore, two ensemble methods are introduced to utilize the two models. Finally, experimental studies show that the proposed algorithms outperform existing feature evolvable learning algorithms.
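The idea of learning a linear mapping during the overlapping period, when both old and new features are observed, can be sketched as follows. This is only an illustration: it fits the mapping with plain online least-squares gradient steps, which is an assumption and not necessarily how CWFE estimates it, and the names `fit_recovery_map` and `recover` are hypothetical.

```python
def fit_recovery_map(pairs, n_old, n_new, lr=0.1, epochs=200):
    """Learn a matrix M (n_old x n_new) so that M applied to the new
    features approximates the old features.

    pairs holds (x_old, x_new) tuples seen while both feature sets exist.
    Assumption (not from the source): squared-error gradient updates.
    """
    M = [[0.0] * n_new for _ in range(n_old)]
    for _ in range(epochs):
        for x_old, x_new in pairs:
            pred = [sum(M[i][j] * x_new[j] for j in range(n_new)) for i in range(n_old)]
            for i in range(n_old):
                err = pred[i] - x_old[i]
                for j in range(n_new):
                    # Gradient step on 0.5 * err**2 with respect to M[i][j].
                    M[i][j] -= lr * err * x_new[j]
    return M


def recover(M, x_new):
    """Reconstruct old features from new ones once the old sensors are gone."""
    return [sum(m * x for m, x in zip(row, x_new)) for row in M]
```

After the overlapping period ends, `recover` supplies approximate old features so that the model trained on the old feature space can keep being updated alongside a model trained on the new features.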

12.
In multi-instance learning, the training set comprises labeled bags that are composed of unlabeled instances, and the task is to predict the labels of unseen bags. This paper studies multi-instance learning from the view of supervised learning. First, by analyzing some representative learning algorithms, this paper shows that multi-instance learners can be derived from supervised learners by shifting their focuses from the discrimination on the instances to the discrimination on the bags. Second, considering that ensemble learning paradigms can effectively enhance supervised learners, this paper proposes to build multi-instance ensembles to solve multi-instance problems. Experiments on a real-world benchmark test show that ensemble learning paradigms can significantly enhance multi-instance learners.

13.
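Online semi-supervised learning method based on multi-kernel ensemble   Total citations: 2, self-citations: 1, citations by others: 1
In many real-time prediction tasks, the learner must learn online from data collected in real time. Because the data are collected in real time, it is often hard to provide labels for all of them. However, current online learning methods cannot learn from unlabeled data, so the learned model cannot promptly reflect the dynamic changes of the data, which reduces its real-time responsiveness. An online semi-supervised learning method based on multi-kernel ensemble is proposed, enabling an online learner to learn online even when it receives unlabeled data. The method uses the degree of agreement among the predictions on unlabeled data of multiple functions defined in different RKHSs as a regularization term, derives from this the instantaneous risk function for multi-kernel ensemble online semi-supervised learning, and then solves it with online convex programming techniques. Experimental results on UCI data sets and an application to network intrusion detection show that the method can effectively use unlabeled data in the data stream to improve online learning performance.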

14.
Evolvable Hardware in Evolutionary Robotics   Total citations: 1, self-citations: 0, citations by others: 1
In recent decades the research on Evolutionary Robotics (ER) has developed rapidly. This direction is primarily concerned with the use of evolutionary computing techniques in the design of intelligent and adaptive controllers for robots. Meanwhile, much attention has been paid to a new set of integrated circuits named Evolvable Hardware (EHW), which is capable of reconfiguring its architecture an unlimited number of times based on artificial evolution techniques. This paper surveys the application of evolvable hardware in evolutionary robotics. Evolvable hardware is an emerging research field concerning the development of evolvable robot controllers at the hardware level to adapt to dynamic changes in environments. The context of evolvable hardware and evolutionary robotics is reviewed, and a few representative experiments in the field of robotic hardware evolution are presented. As an alternative to conventional robotic controller designs, the potentialities and limitations of the EHW-based robotic system are discussed and summarized.

15.
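周胜  刘三民 《计算机工程》 (Computer Engineering) 2020,46(5):139-143,149
To address concept drift and noise in data stream classification, a multi-source transfer learning method based on sample certainty is proposed. The method stores the classifiers trained on multiple source domains and computes, for each sample in a data chunk of the target domain, each source-domain classifier's class posterior probabilities and sample certainty value. On this basis, the source-domain classifiers whose sample certainty values satisfy the current threshold are ensembled online with the target-domain classifier, thereby transferring knowledge from multiple source domains to the target domain. Experimental results show that the method effectively eliminates the adverse effect of noisy data streams on uncertain classifiers and, compared with a multi-source transfer learning method that selects ensemble members by accuracy, achieves higher classification accuracy and better noise robustness.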
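A minimal sketch of certainty-gated ensembling for a single target-domain sample is given below. The abstract does not define the sample certainty value or the combination rule, so the sketch assumes certainty is the largest class posterior and that selected members are combined by unweighted averaging; both choices and all function names are hypothetical.

```python
def certainty(posterior):
    """Assumed certainty measure: the largest class posterior probability."""
    return max(posterior)


def ensemble_predict(target_posterior, source_posteriors, threshold):
    """Combine the target classifier with sufficiently certain source classifiers.

    Returns (predicted class index, number of source classifiers used).
    Assumption (not from the source): unweighted averaging of posteriors.
    """
    selected = [p for p in source_posteriors if certainty(p) >= threshold]
    members = [target_posterior] + selected
    n_classes = len(target_posterior)
    avg = [sum(m[c] for m in members) / len(members) for c in range(n_classes)]
    label = max(range(n_classes), key=avg.__getitem__)
    return label, len(selected)
```

Source classifiers that are unsure about a sample are simply left out for that sample, so a noisy or drifted source domain cannot pull the ensemble's decision away from the target classifier.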

16.
Evolvable Hardware (EHW) has been proposed as a new method for designing systems for complex real-world applications. However, so far, only relatively simple systems have been shown to be evolvable. In this paper, it is proposed that concepts from biology should be applied to EHW techniques to make EHW more applicable to solving complex problems. One such concept has led to the increased complexity scheme presented, where a system is evolved by evolving smaller sub-systems. Experiments with two different tasks illustrate that inclusion of this scheme substantially reduces the number of generations required for evolution. Further, for the prosthesis control task, the best performance is obtained by the novel approach. The best circuit evolved performs better than the best trained neural network.

17.
Fern  Alan  Givan  Robert 《Machine Learning》2003,53(1-2):71-109
We study resource-limited online learning, motivated by the problem of conditional-branch outcome prediction in computer architecture. In particular, we consider (parallel) time and space-efficient ensemble learners for online settings, empirically demonstrating benefits similar to those shown previously for offline ensembles. Our learning algorithms are inspired by the previously published boosting by filtering framework as well as the offline Arc-x4 boosting-style algorithm. We train ensembles of online decision trees using a novel variant of the ID4 online decision-tree algorithm as the base learner, and show empirical results for both boosting and bagging-style online ensemble methods. Our results evaluate these methods on both our branch prediction domain and online variants of three familiar machine-learning benchmarks. Our data justifies three key claims. First, we show empirically that our extensions to ID4 significantly improve performance for single trees and additionally are critical to achieving performance gains in tree ensembles. Second, our results indicate significant improvements in predictive accuracy with ensemble size for the boosting-style algorithm. The bagging algorithms we tried showed poor performance relative to the boosting-style algorithm (but still improve upon individual base learners). Third, we show that ensembles of small trees are often able to outperform large single trees with the same number of nodes (and similarly outperform smaller ensembles of larger trees that use the same total number of nodes). This makes online boosting particularly useful in domains such as branch prediction with tight space restrictions (i.e., the available real-estate on a microprocessor chip).

18.
Queries and Concept Learning   Total citations: 14, self-citations: 2, citations by others: 12
Angluin  Dana 《Machine Learning》1988,2(4):319-342
We consider the problem of using queries to learn an unknown concept. Several types of queries are described and studied: membership, equivalence, subset, superset, disjointness, and exhaustiveness queries. Examples are given of efficient learning methods using various subsets of these queries for formal domains, including the regular languages, restricted classes of context-free languages, the pattern languages, and restricted types of propositional formulas. Some general lower bound techniques are given. Equivalence queries are compared with Valiant's criterion of probably approximately correct identification under random sampling.

19.
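Cryptosystem identification scheme combining feature selection and ensemble learning   Total citations: 1, self-citations: 0, citations by others: 1
王旭  陈永乐  王庆生  陈俊杰 《计算机工程》 (Computer Engineering) 2021,47(1):139-145,153
In ciphertext identification, knowing the encryption algorithm is a necessary prerequisite for further analysis of the ciphertext. However, existing ciphertext identification schemes are limited to a single form and struggle to cope with the differences among cryptosystems when identifying multiple cryptosystems. The mechanism by which ciphertext features influence identification performance is analyzed, and, by combining the Relief feature selection algorithm with a heterogeneous ensemble learning algorithm, a dynamic feature identification scheme adaptable to multiple cryptosystem identification scenarios is proposed. Experiments on a ciphertext data set produced by 36 encryption algorithms show that, compared with a random-forest-based hierarchical cryptosystem identification scheme, the proposed scheme improves identification accuracy by 6.41%, 10.03%, and 11.40% in three different cryptosystem identification scenarios, respectively.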
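For readers unfamiliar with Relief, the sketch below shows the core weight update for a two-class problem with numeric features. It is a deterministic variant that visits every instance once instead of sampling instances at random, and it says nothing about how the scheme above extracts ciphertext features or builds its ensemble; the function name `relief_weights` is hypothetical.

```python
def relief_weights(X, y):
    """Relief-style feature weights for two-class numeric data.

    Each feature is rewarded when it differs on the nearest sample of the
    other class (miss) and penalized when it differs on the nearest sample
    of the same class (hit). Every class needs at least two samples.
    """
    n, d = len(X), len(X[0])
    # Value range per feature, used to scale differences into [0, 1].
    spans = [(max(r[j] for r in X) - min(r[j] for r in X)) or 1.0 for j in range(d)]

    def diff(a, b, j):
        return abs(a[j] - b[j]) / spans[j]

    def dist(a, b):
        return sum(diff(a, b, j) for j in range(d))

    w = [0.0] * d
    for i in range(n):
        hits = [k for k in range(n) if k != i and y[k] == y[i]]
        misses = [k for k in range(n) if y[k] != y[i]]
        hit = min(hits, key=lambda k: dist(X[i], X[k]))
        miss = min(misses, key=lambda k: dist(X[i], X[k]))
        for j in range(d):
            w[j] += (diff(X[i], X[miss], j) - diff(X[i], X[hit], j)) / n
    return w
```

Features with larger weights separate the classes better, so a feature selection step would keep the top-weighted features and pass them to the downstream classifiers.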
