Similar Literature
20 similar documents found (search time: 359 ms)
1.
As a variant of supervised learning, multi-instance learning (MIL) attempts to learn a classifier from the instances contained in bags. In MIL, labels are associated with bags rather than with individual instances: bag labels are known while instance labels are unknown. MIL can handle label ambiguity, but problems with weak labels are not easy to solve. In the weak-label setting, both bag labels and instance labels are unknown latent variables. Given multiple labels and instances, the labels of bags and instances can be approximately estimated by weighting the different labels. This paper proposes a new transfer-learning-based multi-instance learning framework for the weak-label problem. First, a transfer learning model based on the multi-instance approach is constructed, which transfers knowledge from a source task to a target task and thereby converts the weak-label problem into a multi-instance learning problem. On this basis, an iterative framework for solving the multi-instance transfer learning model is proposed. Experimental results show that the method outperforms existing multi-instance learning methods.

2.
Objective: Traditional multi-instance learning (MIL) tracking relies on a self-learning process, so the classifier easily degrades once tracking fails. To address this problem, a multi-instance learning tracking method based on online feature selection (MILOFS) is proposed. Method: First, a sparse random matrix is used to simplify the construction of image features in video tracking, projecting the high-dimensional image information through random projection. Then, a Fisher linear discriminant model is used to build the loss function of the bag model, constructing the classifier's discriminant model directly at the instance level according to instance responses. Finally, the online boosting model is viewed from a gradient-descent perspective, and gradient boosting is used to build the classifier's selection model. Results: In comparative experiments on image sequences from different scenes, the average tracking errors of online adaptive boosting (OAB), online multi-instance learning tracking (MILTrack), weighted multi-instance learning tracking (WMIL), and the proposed MILOFS were 36, 23, 24, and 13 pixels, respectively. The proposed algorithm tracks targets accurately under illumination change, occlusion, and deformation, and runs in real time. Conclusion: By using gradient boosting and building the bag model's discriminant model directly at the instance level, the proposed method effectively overcomes the classifier-degradation problem of traditional multi-instance learning.

3.
An Image Retrieval Method Based on Multi-Instance Learning (cited 1 time: 0 self-citations, 1 by others)
Because multi-instance learning can effectively handle the ambiguity of images, it has been applied to content-based image retrieval (CBIR). This paper proposes a multi-instance-learning-based CBIR method that treats each image as a multi-instance bag. Images are segmented fully automatically using a Gaussian mixture model and an improved EM algorithm, and regional information such as color, texture, shape, and invariant moments is extracted as instance vectors to generate test image bags. Positive and negative bags are generated from the example images selected by the user, several multi-instance learning algorithms are applied for learning, and image retrieval with relevance feedback is implemented, achieving good results.

4.
To address the rather monolithic design of Chinese dialect models and improve the efficiency of dialect identification, a Chinese dialect identification method based on combined diverse density is proposed. Diverse density is a classic algorithm in multi-instance learning, and combined diverse density is an improved application of it. The method first pre-classifies the dialects into several subclasses, then generates multi-instance bags for each subclass and performs multi-instance learning with the expectation-maximization diverse density (EM-DD) algorithm; the resulting diverse-density points serve as the dialect's multi-instance model. Finally, an average nearest-distance algorithm is proposed for pattern classification. The dialect models obtained during training are more comprehensive and complete, and classification takes the influence of every instance in an unknown bag into account, improving the efficiency of the identification system.

5.
In multi-instance learning, bag-space features tend to ignore local information within bags, while instance-space features tend to ignore the overall structure of bags. To address this, a multi-instance learning method that fuses bag-space and instance-space features is proposed. First, a graph model is built to express the relations between instances in a bag, and the graph is converted into an affinity matrix to construct the bag-space features. Next, instances in positive bags that are strongly correlated with the positive class, and instances in negative bags that are weakly correlated with the positive class, are selected; their features serve as the instance-space features of positive and negative bags, respectively. Finally, a Gaussian RBF kernel maps the bag-space and instance-space features into the same feature space, where a weight-based feature fusion method combines them. Experiments on multi-instance benchmark datasets, public image datasets, and text datasets show that the method improves classification performance.
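The final fusion step described in this abstract can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the fusion weights `w_bag`/`w_inst` and the kernel width `gamma` are assumed hyperparameters not specified in the abstract.

```python
import math

def rbf_kernel(u, v, gamma=1.0):
    """Gaussian RBF kernel between two feature vectors."""
    d2 = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-gamma * d2)

def fused_kernel(bag_feats, inst_feats, other_bag_feats, other_inst_feats,
                 w_bag=0.5, w_inst=0.5, gamma=1.0):
    """Weight-based fusion of a bag-space kernel and an instance-space kernel:
    both feature views are mapped through the same RBF kernel and combined
    with fixed weights."""
    k_bag = rbf_kernel(bag_feats, other_bag_feats, gamma)
    k_inst = rbf_kernel(inst_feats, other_inst_feats, gamma)
    return w_bag * k_bag + w_inst * k_inst
```

A fused kernel of this form remains a valid positive semi-definite kernel (a weighted sum of kernels), so it can be plugged directly into any kernel classifier such as an SVM.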

6.
Object tracking is a hot topic in computer vision, and tracking algorithms based on multi-instance learning have recently attracted considerable attention. Building on a study of multi-instance learning algorithms, and addressing the shortcomings of the motion model used in the original multi-instance learning tracker, an improved online-learning-based tracking method is proposed. The method first describes the target with histogram-of-oriented-gradients (HOG) features, then predicts the target position with a particle filter, builds the target model and classifier with a boosting-based online multi-instance learning method, and finally uses this classifier to track the target in the next frame while updating it online. Experiments show that the improved method effectively increases tracking accuracy and robustness.

7.
Objective: In traditional diabetic retinopathy (DR) diagnosis systems, the accuracy of microaneurysm and hemorrhage detection determines the final diagnostic performance. Current detection methods produce many false-positive samples in order to maintain high sensitivity, and because the datasets lack lesion-level annotations, supervised classification models cannot be built to remove them. To solve this problem, a DR diagnosis method based on multi-kernel multi-instance learning is proposed. Method: First, suspected microaneurysm and hemorrhage regions are detected and treated as instances, with the whole image treated as a bag, converting DR diagnosis into a multi-instance learning problem. Second, features of the lesion regions are extracted to describe the instances, and an extreme learning machine (ELM) classifier filters out irrelevant instances to improve the subsequent multi-instance classification. Finally, a multi-kernel-graph multi-instance learning model classifies healthy and DR images to achieve diagnosis. Results: In a DR diagnosis evaluation on the public MESSIDOR dataset, the method achieved 90.1% accuracy, 92.4% sensitivity, 91.4% specificity, and an area of 0.932 under the ROC (receiver operating characteristic) curve, a clear performance advantage over other algorithms. Conclusion: The multi-kernel multi-instance method diagnoses DR efficiently and automatically without lesion annotations, avoiding both the labor of annotating lesions in medical images and the false-positive-removal problem in classification.

8.
Most existing multi-instance multi-label (MIML) algorithms do not consider how to better represent object features. Combining the probabilistic latent semantic analysis (PLSA) model with neural networks (NN), a topic-model-based MIML method is proposed. The algorithm learns the latent topic distributions of all training samples via PLSA; this feature-learning step yields better feature representations, and the learned per-sample topic distributions are used as input to train a neural network. Given a test sample, its latent topic distribution is learned and fed into the trained network to obtain its label set. Compared with two classic decomposition-based MIML algorithms, experimental results show that the new method performs better on two real-world MIML learning tasks.

9.
Introducing a mechanism for exploiting unlabeled examples into multi-instance learning can reduce training cost and improve the learner's generalization. Most current semi-supervised multi-instance learning algorithms label every instance in a bag, converting multi-instance learning into a single-instance semi-supervised problem. Considering that a bag's label is determined by its instances and its structure, a multi-instance learning algorithm that performs semi-supervised learning directly at the bag level is proposed. By defining a multi-instance kernel, a bag-level graph Laplacian matrix is computed over all bags (labeled and unlabeled) and used as the smoothness penalty in the optimization objective. Finding the optimal solution in the RKHS spanned by the multi-instance kernel reduces to determining a multi-instance kernel function modified by the unlabeled data, which can be used directly with classical kernel methods. The algorithm was tested on experimental datasets and compared with existing algorithms. Results show that the semi-supervised multi-instance kernel achieves the same accuracy as supervised algorithms with less training data, and that with the same labeled data, the unlabeled data effectively improves the learner's generalization.
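The smoothness-penalty construction in this abstract starts from a kernel (affinity) matrix over all bags and forms its graph Laplacian. A minimal sketch of that step, assuming the bag-level kernel matrix `K` has already been computed by some multi-instance kernel:

```python
def graph_laplacian(K):
    """Unnormalized graph Laplacian L = D - W from a symmetric affinity
    matrix K over all bags (labeled and unlabeled), where D is the diagonal
    degree matrix of row sums. Used as the smoothness penalty term f' L f
    in semi-supervised objectives."""
    n = len(K)
    degrees = [sum(row) for row in K]  # D_ii = sum_j K_ij
    return [[(degrees[i] if i == j else 0.0) - K[i][j] for j in range(n)]
            for i in range(n)]
```

A defining property of the Laplacian is that each of its rows sums to zero (L applied to the constant vector is zero), which makes the penalty depend only on differences between connected bags.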

10.
黎铭, 周志华. 《计算机科学》, 2004, 31(Z2): 152-155
1 Introduction. In the mid-1990s, the concept of multi-instance learning was first proposed in Dietterich et al.'s [1] study of the drug activity prediction problem. Owing to its distinctive properties and broad application prospects, multi-instance learning has been regarded as a learning framework parallel to supervised, unsupervised, and reinforcement learning [2]. Unlike supervised learning, the training set in multi-instance learning no longer consists of individual examples; instead it is a set of concept-labeled bags, each of which is a collection of unlabeled instances. A bag is labeled positive if it contains at least one positive instance, and negative if it contains no positive instance. The learning system builds a model from the labeled bags, aiming to predict the concept labels of previously unseen bags as accurately as possible.
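The standard multi-instance assumption stated above (a bag is positive iff it contains at least one positive instance) can be written down directly. A minimal sketch; `instance_is_positive` is a hypothetical per-instance predicate standing in for the unknown underlying concept:

```python
def bag_label(bag, instance_is_positive):
    """Return 1 if any instance in the bag is positive, else 0 --
    the standard multi-instance labeling rule."""
    return 1 if any(instance_is_positive(x) for x in bag) else 0

# Toy example: instances are numbers, "positive" means value > 0.5.
positive_bag = [0.1, 0.9, 0.3]   # contains one positive instance
negative_bag = [0.1, 0.2, 0.3]   # contains no positive instance
```

Note the asymmetry this rule creates: every instance in a negative bag is known to be negative, but a positive bag only guarantees that *some* instance is positive, which is the source of the label ambiguity MIL algorithms must handle.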

11.
Using concepts from information theory such as channel capacity and the maximum-likelihood decoding criterion, this paper proposes a new instance learning method, IBLE. The method does not depend on class prior probabilities, allows strong correlation between features, and has an intuitive knowledge representation. Applied to mass-spectrum interpretation, it achieved excellent results: an average correct prediction rate of 93.96% over eight classes of compounds, exceeding expert-level performance.

12.
A Comparative Study of the Instance Learning Algorithms IBLE and ID3 (cited 3 times: 0 self-citations, 3 by others)
To compare the learning performance of the IBLE and ID3 algorithms, learning experiments were conducted on a large amount of mass-spectrum data. After learning, IBLE's average prediction rate was 93.96%, versus 81.76% for ID3, and the knowledge acquired by IBLE was highly consistent with expert knowledge in both representation and content. The paper gives a theoretical analysis of the reasons for these differences between the two algorithms.

13.
An Incremental Bayesian Classification Model (cited 40 times: 0 self-citations, 40 by others)
Classification has long been a core problem in machine learning, pattern recognition, and data mining. When learning classification knowledge from massive data, especially when obtaining large numbers of class-labeled samples is expensive, incremental learning is an effective approach. This paper applies the naive Bayes method to incremental classification, proposing an incremental Bayesian learning model and giving the incremental Bayesian inference process, including incrementally revising the classifier parameters and incrementally classifying test samples. Experimental results show that the algorithm is feasible and effective.

14.
Most machine learning tasks in data classification and information retrieval require manually labeled data examples in the training stage. The goal of active learning is to select the most informative examples for manual labeling in these learning tasks. Most of the previous studies in active learning have focused on selecting a single unlabeled example in each iteration. This could be inefficient, since the classification model has to be retrained for every acquired labeled example. It is also inappropriate for the setup of information retrieval tasks where the user's relevance feedback is often provided for the top K retrieved items. In this paper, we present a framework for batch mode active learning, which selects a number of informative examples for manual labeling in each iteration. The key feature of batch mode active learning is to reduce the redundancy among the selected examples such that each example provides unique information for model updating. To this end, we employ the Fisher information matrix as the measurement of model uncertainty, and choose the set of unlabeled examples that can efficiently reduce the Fisher information of the classification model. We apply our batch mode active learning framework to both text categorization and image retrieval. Promising results show that our algorithms are significantly more effective than the active learning approaches that select unlabeled examples based only on their informativeness for the classification model.
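The batch-selection idea in this abstract, informativeness minus redundancy, can be sketched greedily. Note this simplified stand-in scores informativeness by predictive entropy rather than the Fisher information matrix the paper uses; `probs` and `sims` are assumed inputs (model confidences and pairwise similarities in [0, 1]):

```python
import math

def batch_select(probs, sims, k):
    """Greedily pick k unlabeled examples that are individually uncertain
    (high binary entropy of probs[i] = P(y=1|x_i)) while penalizing
    similarity to examples already chosen, so each selected example adds
    unique information."""
    def entropy(p):
        if p in (0.0, 1.0):
            return 0.0
        return -(p * math.log(p) + (1 - p) * math.log(1 - p))

    chosen = []
    candidates = set(range(len(probs)))
    while len(chosen) < k and candidates:
        best = max(candidates,
                   key=lambda i: entropy(probs[i])
                   - max((sims[i][j] for j in chosen), default=0.0))
        chosen.append(best)
        candidates.remove(best)
    return chosen
```

With a batch of size k per iteration, the model is retrained once per k acquired labels instead of once per label, which is the efficiency gain the abstract argues for.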

15.
Information and Computation, 2007, 205(11): 1671-1684
The present study aims at insights into the nature of incremental learning in the context of Gold’s model of identification in the limit. With a focus on natural requirements such as consistency and conservativeness, incremental learning is analysed both for learning from positive examples and for learning from positive and negative examples. The results obtained illustrate in which way different consistency and conservativeness demands can affect the capabilities of incremental learners. These results may serve as a first step towards characterising the structure of typical classes learnable incrementally and thus towards elaborating uniform incremental learning methods.

16.
When sensing its environment, an agent often receives information that only partially describes the current state of affairs. The agent then attempts to predict what it has not sensed, by using other pieces of information available through its sensors. Machine learning techniques can naturally aid this task, by providing the agent with the rules to be used for making these predictions. For this to happen, however, learning algorithms need to be developed that can deal with missing information in the learning examples in a principled manner, and without the need for external supervision. We investigate this problem herein.

We show how the Probably Approximately Correct semantics can be extended to deal with missing information during both the learning and the evaluation phase. Learning examples are drawn from some underlying probability distribution, but parts of them are hidden before being passed to the learner. The goal is to learn rules that can accurately recover information hidden in these learning examples. We show that for this to be done, one should first dispense with the requirement that rules should always make definite predictions; “don't know” is sometimes necessitated. On the other hand, such abstentions should not be done freely, but only when sufficient information is not present for definite predictions to be made. Under this premise, we show that to accurately recover missing information, it suffices to learn rules that are highly consistent, i.e., rules that simply do not contradict the agent's sensory inputs. It is established that high consistency implies a somewhat discounted accuracy, and that this discount is, in some defined sense, unavoidable, and depends on how adversarially information is hidden in the learning examples.

Within our proposed learning model we prove that any PAC learnable class of monotone or read-once formulas is also learnable from incomplete learning examples. By contrast, we prove that parities and monotone-term 1-decision lists, which are properly PAC learnable, are not properly learnable under the new learning model. In the process of establishing our positive and negative results, we re-derive some basic PAC learnability machinery, such as Occam's Razor, and reductions between learning tasks. We finally consider a special case of learning from partial learning examples, where some prior bias exists on the manner in which information is hidden, and show how this provides a unified view of many previous learning models that deal with missing information.

We suggest that the proposed learning model goes beyond a simple extension of supervised learning to the case of incomplete learning examples. The principled and general treatment of missing information during learning, we argue, allows an agent to employ learning entirely autonomously, without relying on the presence of an external teacher, as is the case in supervised learning. We call our learning model autodidactic to emphasize the explicit disassociation of this model from any form of external supervision.

17.
In recent years, deep learning has shown excellent performance in computer vision, yet researchers have found that deep learning systems lack robustness: adding small, human-imperceptible perturbations to the input can cause a deep learning model to fail. Such failure-inducing samples are called adversarial examples. We propose the iterative autoencoder, a new defense against adversarial examples, whose principle is to push adversarial examples that lie far from the data manifold back toward it. The input is first passed through the iterative autoencoder, and the reconstructed output is then fed to the classifier. On clean samples, classification accuracy after the iterative autoencoder is similar to that on the original samples, so the deep learning model's performance is not significantly degraded. For adversarial examples, our experiments show that even against state-of-the-art attacks, the defense retains high classification accuracy and a low attack success rate.

18.
Hau, David T.; Coiera, Enrico W. Machine Learning, 1997, 26(2-3): 177-211
The automated construction of dynamic system models is an important application area for ILP. We describe a method that learns qualitative models from time-varying physiological signals. The goal is to understand the complexity of the learning task when faced with numerical data, what signal processing techniques are required, and how this affects learning. The qualitative representation is based on Kuipers' QSIM. The learning algorithm for model construction is based on Coiera's GENMODEL. We show that QSIM models are efficiently PAC learnable from positive examples only, and that GENMODEL is an ILP algorithm for efficiently constructing a QSIM model. We describe both GENMODEL, which performs RLGG on qualitative states to learn a QSIM model, and the front-end processing and segmenting stages that transform a signal into a set of qualitative states. Next we describe results of experiments on data from six cardiac bypass patients. Useful models were obtained, representing both normal and abnormal physiological states. Model variation across time and across different levels of temporal abstraction and fault tolerance is explored. The assumption made by many previous workers that the abstraction of examples from data can be separated from the learning task is not supported by this study. Firstly, the effects of noise in the numerical data manifest themselves in the qualitative examples. Secondly, the models learned are directly dependent on the initial qualitative abstraction chosen.

19.
In classification tasks, active learning is often used to select a set of informative examples from a big unlabeled dataset. The objective is to learn a classification pattern that can accurately predict labels of new examples by using the selection result, which is expected to contain as few examples as possible. The selection of informative examples also reduces the manual effort for labeling, data complexity, and data redundancy, thus improving learning efficiency. In this paper, a new active learning strategy with pool-based settings, called inconsistency-based active learning, is proposed. This strategy is built up under the guidance of two classical works: (1) the learning philosophy of the query-by-committee (QBC) algorithm; and (2) the structure of the traditional concept learning model: from-general-to-specific (GS) ordering. By constructing two extreme hypotheses of the current version space, the strategy evaluates unlabeled examples by a new sample selection criterion, the inconsistency value, and the whole learning process can be implemented without any additional knowledge. Besides, since active learning is favorably applied to the support vector machine (SVM) and its related applications, the strategy is further restricted to a specific algorithm called inconsistency-based active learning for SVM (I-ALSVM). By building up a GS structure, the sample selection process in our strategy is formed by searching through the initial version space. We compare the proposed I-ALSVM with several other pool-based methods for SVM on selected datasets. The experimental result shows that, in terms of generalization capability, our model exhibits good feasibility and competitiveness.
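The inconsistency criterion described above can be sketched in a few lines: an unlabeled example is informative when the two extreme hypotheses of the current version space (the most general and the most specific consistent hypotheses) disagree on it. A minimal illustration under that assumption, with the hypotheses represented as arbitrary predicates:

```python
def inconsistency(h_general, h_specific, x):
    """Inconsistency value of an unlabeled example: 1 if the most-general
    and most-specific hypotheses disagree on x (x is informative), else 0."""
    return int(h_general(x) != h_specific(x))

def select_most_inconsistent(pool, h_general, h_specific):
    """Pool-based selection: query the example the two extreme hypotheses
    disagree on most."""
    return max(pool, key=lambda x: inconsistency(h_general, h_specific, x))
```

Examples where the two hypotheses agree already have a determined label in the version space, so labeling them cannot shrink it; only the disagreement region is worth querying, which is the same intuition as QBC with a two-member committee.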

20.
Imitation is an important learning mechanism of widespread utility and common occurrence. This article presents a theory and working computational model of the detailed mechanisms of imitation. The model is in the restricted domain of the learning of pencil and paper procedures. The task that is modelled is of a teacher demonstrating the steps of a procedure, such as long division to a student by means of one or more examples. Such a task can be learned by an imitation-learning mechanism, but the mechanism has a much wider range of application. Imitation is treated as a four-stage process: the events performed by the teacher are segmented by the learner; the events are encoded and explained in terms of spatial relations between objects; repeated patterns in the events are recognized; and finally, different examples are merged together. This model is implemented as a computer program learning algorithms from worked examples (LAWE).


Copyright © Beijing Qinyun Technology Development Co., Ltd. (北京勤云科技发展有限公司)  京ICP备09084417号