Similar literature
1.
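Active learning has proven to be an important technique for improving the performance of content-based image retrieval, and relevance feedback can effectively reduce the amount of user labeling. This paper proposes an active learning algorithm, weighted Co-ASVM, to improve sample selection in relevance feedback. Color and texture can be regarded as two sufficient and uncorrelated views of an image. A weight is computed for each of the color and texture feature spaces, an SVM is trained in each feature space, and the unlabeled samples are classified. To reduce redundancy among the feedback samples, an active feedback strategy based on K-means clustering is proposed for returning unlabeled samples to the user for labeling. Experiments show that the image retrieval method achieves high accuracy and good retrieval performance.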

2.
In multi-label learning, it is rather expensive to label instances since they are simultaneously associated with multiple labels. Therefore, active learning, which reduces the labeling cost by actively querying the labels of the most valuable data, becomes particularly important for multi-label learning. A good multi-label active learning algorithm usually consists of two crucial elements: a reasonable criterion to evaluate the gain of querying the label for an instance, and an effective classification model, based on whose prediction the criterion can be accurately computed. In this paper, we first introduce an effective multi-label classification model by combining label ranking with threshold learning, which is incrementally trained to avoid retraining from scratch after every query. Based on this model, we then propose to exploit both uncertainty and diversity in the instance space as well as the label space, and actively query the instance-label pairs which can improve the classification model most. Extensive experiments on 20 datasets demonstrate the superiority of the proposed approach to state-of-the-art methods.

3.
Confidence-based active learning   Cited by: 1 (self-citations: 0, citations by others: 1)
This paper proposes a new active learning approach, confidence-based active learning, for training a wide range of classifiers. This approach is based on identifying and annotating uncertain samples. The uncertainty value of each sample is measured by its conditional error. The approach takes advantage of current classifiers' probability preserving and ordering properties. It calibrates the output scores of classifiers to conditional error. Thus, it can estimate the uncertainty value for each input sample according to its output score from a classifier and select only samples with uncertainty value above a user-defined threshold. Even though we cannot guarantee the optimality of the proposed approach, we find it to provide good performance. Compared with existing methods, this approach is robust without additional computational effort. A new active learning method for support vector machines (SVMs) is implemented following this approach. A dynamic bin width allocation method is proposed to accurately estimate sample conditional error and this method adapts to the underlying probabilities. The effectiveness of the proposed approach is demonstrated using synthetic and real data sets and its performance is compared with the widely used least certain active learning method.
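
The threshold-based selection step described in this abstract can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes a binary problem in which the scores have already been calibrated to positive-class probabilities, so the conditional error of the predicted label is taken as min(p, 1 - p). The dynamic bin width calibration is not shown, and the function name and sample values are hypothetical.

```python
def select_uncertain(calibrated_scores, threshold):
    """Return indices of samples whose estimated conditional error exceeds `threshold`.

    Assumption: each score is a calibrated probability p of the positive class,
    so the conditional error of the predicted label is min(p, 1 - p).
    """
    selected = []
    for index, p in enumerate(calibrated_scores):
        conditional_error = min(p, 1.0 - p)
        if conditional_error > threshold:
            selected.append(index)
    return selected


# Hypothetical pool of calibrated scores for five unlabeled samples.
pool_scores = [0.95, 0.52, 0.10, 0.40, 0.75]
to_label = select_uncertain(pool_scores, threshold=0.3)
print(to_label)
```

Only the samples whose conditional error is above the user-defined threshold would be sent for annotation; raising the threshold shrinks the query set.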

4.
Xu Hailong. Control and Decision (控制与决策), 2010, 25(2): 282-286
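To address the difficulty of obtaining large numbers of class-labeled samples when training SVMs, an active SVM incremental training algorithm based on distance-ratio uncertainty sampling (DRB-ASVM) is proposed and applied to incremental SVM training. Experimental results show that, without degrading classification accuracy, an SVM using the active learning strategy selects far fewer samples for labeling than random selection does, which reduces the labeling workload or cost and speeds up training.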

5.
In classification tasks, active learning is often used to select a set of informative examples from a big unlabeled dataset. The objective is to learn a classification pattern that can accurately predict labels of new examples by using the selection result, which is expected to contain as few examples as possible. The selection of informative examples also reduces the manual effort for labeling, data complexity, and data redundancy, and thus improves learning efficiency. In this paper, a new active learning strategy with pool-based settings, called inconsistency-based active learning, is proposed. This strategy is built up under the guidance of two classical works: (1) the learning philosophy of the query-by-committee (QBC) algorithm; and (2) the structure of the traditional concept learning model: from-general-to-specific (GS) ordering. By constructing two extreme hypotheses of the current version space, the strategy evaluates unlabeled examples by a new sample selection criterion called the inconsistency value, and the whole learning process can be implemented without any additional knowledge. Besides, since active learning is favorably applied to support vector machine (SVM) and its related applications, the strategy is further restricted to a specific algorithm called inconsistency-based active learning for SVM (I-ALSVM). By building up a GS structure, the sample selection process in our strategy is formed by searching through the initial version space. We compare the proposed I-ALSVM with several other pool-based methods for SVM on selected datasets. The experimental results show that, in terms of generalization capability, our model exhibits good feasibility and competitiveness.

6.
Feature selection via sensitivity analysis of SVM probabilistic outputs   Cited by: 1 (self-citations: 0, citations by others: 1)
Feature selection is an important aspect of solving data-mining and machine-learning problems. This paper proposes a feature-selection method for Support Vector Machine (SVM) learning. Like most feature-selection methods, the proposed method ranks all features in decreasing order of importance so that more relevant features can be identified. It uses a novel criterion based on the probabilistic outputs of SVM. This criterion, termed Feature-based Sensitivity of Posterior Probabilities (FSPP), evaluates the importance of a specific feature by computing the aggregate value, over the feature space, of the absolute difference of the probabilistic outputs of SVM with and without the feature. The exact form of this criterion is not easily computable and approximation is needed. Four approximations, FSPP1-FSPP4, are proposed for this purpose. The first two approximations evaluate the criterion by randomly permuting the values of the feature among samples of the training data. They differ in their choices of the mapping function from standard SVM output to its probabilistic output: FSPP1 uses a simple threshold function while FSPP2 uses a sigmoid function. The latter two directly approximate the criterion but differ in the smoothness assumptions of the criterion with respect to the features. The performance of these approximations, used in an overall feature-selection scheme, is then evaluated on various artificial problems and real-world problems, including datasets from the recent Neural Information Processing Systems (NIPS) feature selection competition. FSPP1-3 show good performance consistently, with FSPP2 being the best overall by a slight margin. The performance of FSPP2 is competitive with some of the best performing feature-selection methods in the literature on the datasets that we have tested. Its associated computations are modest and hence it is suitable as a feature-selection method for SVM applications. Editor: Risto Miikkulainen.
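
The permutation idea behind the first two approximations can be sketched roughly as follows. This is an illustrative reading of the abstract in the spirit of FSPP2 (sigmoid mapping), not the authors' code: `toy_decision` is a hypothetical stand-in for a trained SVM decision function, and the sigmoid coefficients `a` and `b` are assumed values rather than fitted ones.

```python
import math
import random


def sigmoid_probability(decision_value, a=-1.0, b=0.0):
    """Map a decision value f(x) to a probability with a sigmoid 1 / (1 + exp(a*f + b))."""
    return 1.0 / (1.0 + math.exp(a * decision_value + b))


def permutation_sensitivity(decision_fn, samples, feature, rng):
    """Mean absolute change in probabilistic output when one feature is shuffled across samples."""
    column = [x[feature] for x in samples]
    rng.shuffle(column)
    total = 0.0
    for x, shuffled_value in zip(samples, column):
        perturbed = list(x)
        perturbed[feature] = shuffled_value
        p_original = sigmoid_probability(decision_fn(x))
        p_perturbed = sigmoid_probability(decision_fn(perturbed))
        total += abs(p_original - p_perturbed)
    return total / len(samples)


def toy_decision(x):
    """Hypothetical decision function in which only feature 0 matters."""
    return 2.0 * x[0]


rng = random.Random(0)
data = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(200)]
scores = [permutation_sensitivity(toy_decision, data, j, rng) for j in range(2)]
# Rank features in decreasing order of sensitivity.
ranking = sorted(range(2), key=lambda j: scores[j], reverse=True)
print(ranking)
```

A feature that the decision function ignores yields a sensitivity of zero under this sketch, so it falls to the bottom of the ranking.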

7.
In classification problems, many different active learning techniques are often adopted to find the most informative samples for labeling in order to save human labor. Among them, the active learning support vector machine (SVM) is one of the most representative approaches, in which the model parameter is usually set to a fixed default value during the whole learning process. Note that the model parameter is closely related to the training set. Hence a dynamic parameter is desirable to achieve satisfactory learning performance. To target this issue, we proposed a novel algorithm, called active learning SVM with regularization path, which can fit the entire solution path of SVM for every value of the model parameter. In this algorithm, we first traced the entire solution path of the current classifier to find a series of candidate model parameters, and then used unlabeled samples to select the best model parameter. Besides, in the initial phase of training, we constructed a training sample set by using an improved K-medoids clustering algorithm. Experimental results on real-world data sets showed the effectiveness of the proposed algorithm for image classification problems.

8.
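Incremental support vector machine learning is an important online learning method. The traditional single-incremental SVM learning algorithm updates the SVM model with one data sample at a time. When many data samples are added or removed, this update mode is very time-consuming, because each inserted or removed sample requires its own check of whether the model parameters must be updated. This paper proposes a multiple incremental SVM training algorithm based on parametric programming, which greatly reduces the training time of the multiple SVM. Experimental results on synthetic and real test data sets show that the proposed method greatly reduces the computational complexity of multiple SVM training and improves classifier accuracy.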

9.
A new SVM active learning algorithm and its application to obstacle detection   Cited by: 3 (self-citations: 0, citations by others: 3)
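Obstacle detection is one of the typical problems that intelligent robots must solve when perceiving unstructured, complex environments. In practice, large numbers of unlabeled samples are relatively easy to obtain, whereas labeling them is extremely tedious and time-consuming, and current research rarely addresses this problem. SVM active learning is introduced into obstacle detection. To address the problems and limitations that conventional SVM active learning encounters in applications, it is improved with two strategies: a dynamic clustering process that selects the most representative samples, and an adjustment of the SVM hyperplane position according to the difference between the expert labels and the current SVM classification results. The resulting new active learning algorithm, KSVMactiv, is evaluated on an image database of real outdoor field environments. The experimental results show that KSVMactiv achieves high detection performance with only 81 samples, which indicates that it significantly reduces the data labeling workload and converges faster than existing active learning algorithms.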

10.
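To address the redundancy of training samples in dialect identification systems, an active identification method for Chinese dialects that incorporates a diversity measure is proposed. An SVM classifier is used to select uncertain samples. A measure of how the samples are distributed relative to one another is then used to pick training samples that are also diverse, and over several iterations these most discriminative samples are assembled into a training set. This training set is fed back into the SVM for classification and identification. Experimental results show that the method effectively overcomes redundancy in the selected samples and, compared with traditional active learning methods, reduces the number of manually labeled samples by 50% at the same recognition rate.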
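
One plausible way to combine uncertainty with diversity is sketched below. The abstract does not specify the diversity measure, so this example substitutes a simple greedy farthest-point rule over Euclidean distance; the margin value, the function name, and the toy data are assumptions made for illustration only.

```python
import math


def select_diverse_uncertain(points, decision_values, margin, batch_size):
    """Pick up to `batch_size` samples that lie inside the margin and are mutually far apart.

    Assumption: uncertainty is |f(x)| < margin for an SVM decision value f(x);
    diversity is a greedy farthest-point rule, swapped in for the paper's measure.
    """
    candidates = [i for i, f in enumerate(decision_values) if abs(f) < margin]
    if not candidates:
        return []
    # Start from the most uncertain candidate (smallest |f(x)|).
    candidates.sort(key=lambda i: abs(decision_values[i]))
    selected = [candidates[0]]
    while len(selected) < min(batch_size, len(candidates)):
        remaining = [i for i in candidates if i not in selected]
        # Choose the candidate whose nearest already-selected sample is farthest away.
        best = max(
            remaining,
            key=lambda i: min(math.dist(points[i], points[j]) for j in selected),
        )
        selected.append(best)
    return selected


points = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (0.0, 0.1), (9.0, 9.0)]
decision_values = [0.05, 0.1, 0.3, 0.2, 2.0]
batch = select_diverse_uncertain(points, decision_values, margin=1.0, batch_size=2)
print(batch)
```

In this toy pool, the second pick skips the near-duplicates of the first sample, which is the kind of redundancy the abstract aims to avoid.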

11.
Multi-label learning deals with data associated with a set of labels simultaneously. Dimensionality reduction is an important but challenging task in multi-label learning. Feature selection is an efficient technique for dimensionality reduction that searches for an optimal feature subset preserving the most relevant information. In this paper, we propose an effective feature evaluation criterion for multi-label feature selection, called the neighborhood relationship preserving score. This criterion is inspired by similarity preservation, which is widely used in single-label feature selection. It evaluates each feature subset by measuring its capability to preserve the neighborhood relationship among samples. Unlike similarity preservation, we address the order of sample similarities, which can well express the neighborhood relationship among samples, not just the pairwise sample similarity. With this criterion, we also design one ranking algorithm and one greedy algorithm for the feature selection problem. The proposed algorithms are validated on six publicly available data sets from a machine learning repository. Experimental results demonstrate their superiority over the compared state-of-the-art methods.

12.
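To address the low accuracy and slow speed of existing active learning algorithms in multi-class applications, an active learning algorithm based on affinity propagation (AP) clustering is introduced into multi-class support vector machines. In each iteration, the N new sample points most likely to improve the multi-class SVM classifier are actively selected and added to the training samples, so that high classification performance is obtained at a small labeling cost. Experimental results on several data sets show that the new method effectively reduces the number of manually labeled sample points needed to train the classifier while achieving high accuracy and good robustness.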

13.
Hyperspectral image classification based on deep Bayesian active learning   Cited by: 1 (self-citations: 0, citations by others: 1)
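Labeled samples for hyperspectral image classification are time-consuming and laborious to obtain, unlabeled data are hard to exploit effectively, and active learning is difficult to combine with deep learning. Drawing on recent advances in Bayesian deep learning and active learning, a deep Bayesian active learning algorithm for hyperspectral image classification is proposed. A convolutional neural network is trained on a small number of labeled samples. An active learning sampling strategy combined with Bayesian methods then selects, from the unlabeled samples, those about which the model is most uncertain. The selected samples are labeled manually and added to the training set, and the model is retrained, which reduces model uncertainty and improves classification accuracy. Experimental results on PaviaU hyperspectral image classification show that, with few labeled samples, the proposed method classifies better than traditional methods.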

14.
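In supervised learning, large numbers of class-labeled samples are hard to obtain and labeling is costly. To address this, active learning and semi-supervised learning are combined in an SVM active learning algorithm based on Tri-training semi-supervised learning and convex hull vectors. The hull vectors of the sample set are computed, and those most likely to become support vectors are selected for labeling. Earlier active learning algorithms make no further use of the unlabeled samples once the most informative samples have been selected and labeled. To address this, Tri-training is introduced into the SVM active learning process: unlabeled samples whose class labels have high confidence are added to the training set, so that information in the unlabeled set that benefits the learner is exploited. Experiments on UCI data sets show that, when few labeled samples are available, the algorithm obtains an SVM classifier with high classification accuracy and good generalization performance, and lowers the sample labeling cost of SVM training.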

15.
Yang Ju, Li Qingwen, Yu Hualong. Journal of Computer Applications (计算机应用), 2015, 35(12): 3472-3476
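The existing selected accuracy stopping criterion for active learning applies only to batch-mode labeling. To address this, an improved selected accuracy stopping criterion is proposed for the scenario in which a single example is labeled per round. The criterion monitors how well the predicted labels match the true labels over a fixed number of learning rounds counted back from the current round, and uses this to approximate the selected accuracy: the higher the match rate, the higher the selected accuracy. A sliding time window tracks changes in the selected accuracy in real time, and the active learning algorithm is stopped once it exceeds a preset threshold. Taking SVM-based active learning as an example, the effectiveness and feasibility of the criterion were verified on six benchmark data sets. The results show that, with a suitable threshold, the criterion finds a reasonable time to stop active learning. The method widens the applicability of the selected accuracy stopping criterion and improves its practicality.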
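
The sliding-window check described above might look like the following sketch. The class name, window size, threshold, and the simulated stream of labels are all hypothetical, and the rule "stop once the windowed match rate exceeds the threshold" is a direct reading of the abstract rather than the authors' code.

```python
from collections import deque


class SelectedAccuracyStopper:
    """Track prediction/label matches over the last `window` rounds of single-example labeling."""

    def __init__(self, window, threshold):
        self.matches = deque(maxlen=window)  # old rounds fall out automatically
        self.threshold = threshold

    def update(self, predicted_label, true_label):
        """Record one round and return True if active learning should stop."""
        self.matches.append(predicted_label == true_label)
        if len(self.matches) < self.matches.maxlen:
            return False  # not enough rounds yet to fill the window
        selected_accuracy = sum(self.matches) / len(self.matches)
        return selected_accuracy > self.threshold


# Simulated (predicted, true) label pairs, one queried example per round.
rounds = [(0, 1), (1, 1), (1, 0), (1, 1), (0, 0), (1, 1), (0, 0)]
stopper = SelectedAccuracyStopper(window=4, threshold=0.9)
stop_round = None
for round_number, (predicted, true) in enumerate(rounds, start=1):
    if stopper.update(predicted, true):
        stop_round = round_number
        break
print(stop_round)
```

A larger window smooths the estimate but delays stopping; the abstract notes that the threshold has to be chosen appropriately for the criterion to work well.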

16.
The least squares support vector machine (LSSVM), like the standard support vector machine (SVM), which is based on structural risk minimization, can be obtained by solving a simpler optimization problem than that in SVM. However, local structure information of data samples, especially intrinsic manifold structure, is not fully considered in LSSVM. To address this problem and inspired by manifold learning techniques, we propose a novel iterative least squares classifier, coined optimal locality preserving least squares support vector machine (OLP-LSSVM). The idea is to combine structural risk minimization and the locality preserving criterion in a unified framework to take advantage of the manifold structure of data samples to enhance LSSVM. Furthermore, inspired by the recent development of simultaneous optimization techniques, the adjacency graph of the locality preserving criterion is optimized simultaneously to give rise to improved discriminative performance. The resulting model can be solved by an alternating optimization method. The experimental results on several publicly available benchmark data sets show the feasibility and effectiveness of the proposed method.

17.
Domain adaptation learning (DAL) methods have shown promising results by utilizing labeled samples from the source (or auxiliary) domain(s) to learn a robust classifier for the target domain, which has few or even no labeled samples. However, several key issues need to be addressed in state-of-the-art DAL methods, such as sufficient and effective distribution discrepancy metric learning, effective kernel space learning, and multiple source domains transfer learning. Aiming at the above-mentioned issues, in this paper we propose a unified kernel learning framework for domain adaptation learning and its effective extension based on the multiple kernel learning (MKL) schema, regularized by the proposed new minimum distribution distance metric criterion, which minimizes both the distribution mean discrepancy and the distribution scatter discrepancy between source and target domains, and into which many existing kernel methods (like support vector machine (SVM), ν-SVM, and least-square SVM) can be readily incorporated. Our framework, referred to as kernel learning for domain adaptation learning (KLDAL), simultaneously learns an optimal kernel space and a robust classifier by minimizing both the structural risk functional and the distribution discrepancy between different domains. Moreover, we extend the framework KLDAL to a multiple kernel learning framework referred to as MKLDAL. Under the KLDAL or MKLDAL framework, we also propose three effective formulations: KLDAL-SVM or MKLDAL-SVM with respect to SVM, its variant μ-KLDALSVM or μ-MKLDALSVM with respect to ν-SVM, and KLDAL-LSSVM or MKLDAL-LSSVM with respect to the least-square SVM. Comprehensive experiments on real-world data sets verify the superior or comparable effectiveness of the proposed frameworks.

18.
Research on blast furnace fault diagnosis based on DAGSVM   Cited by: 2 (self-citations: 0, citations by others: 2)
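Blast furnace fault diagnosis suffers from a low degree of intelligent automation and places high demands on operators' technical skill. To address these shortcomings, a multi-class fault diagnosis method based on support vector machines is proposed. Following statistical principles, a kernel function maps the samples into a high-dimensional space for training. Comparing the test accuracy of various kernel functions yields the best kernel for this problem. After comparing different multi-class classification algorithms, a diagnosis model based on DAGSVM is proposed. Experimental results show that the algorithm achieves high recognition accuracy.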

19.
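A single deep learning model cannot effectively handle uncertain predictions. Starting from three-way decisions, this paper introduces shadowed set theory into image classification and constructs a two-stage image classification method. First, a convolutional neural network classifies the samples to obtain a membership matrix. Then, a sample partition algorithm based on shadowed sets processes the membership matrix to identify the uncertain part of the classification results, the uncertain region, for which the decision is deferred. Finally, using feature fusion, an SVM serves as the classifier for a second round of classification, which reduces the uncertainty of the results and improves classification accuracy. Experiments on the CIFAR-10 and Caltech 101 data sets verify the effectiveness of the method.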
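
The partition of the membership matrix into accepted and deferred samples can be sketched as below. The paper derives its partition from shadowed set theory; this example simplifies that to a single fixed threshold `alpha`, which is an assumption, as are the function name and the toy membership values.

```python
def partition_by_membership(membership_rows, alpha):
    """Split samples into accepted predictions and a deferred (uncertain) region.

    Assumption: a sample is accepted when its largest class membership is at least
    1 - alpha; otherwise the decision is deferred to a second-stage classifier.
    """
    accepted = []   # (sample index, predicted class) pairs kept from the first stage
    deferred = []   # sample indices passed on to the second stage
    for index, row in enumerate(membership_rows):
        top_membership = max(row)
        if top_membership >= 1.0 - alpha:
            accepted.append((index, row.index(top_membership)))
        else:
            deferred.append(index)
    return accepted, deferred


# Hypothetical membership matrix (one row of class memberships per sample).
memberships = [
    [0.90, 0.05, 0.05],
    [0.40, 0.35, 0.25],
    [0.10, 0.85, 0.05],
]
accepted, deferred = partition_by_membership(memberships, alpha=0.2)
print(accepted, deferred)
```

Only the deferred indices would be passed to the second-stage SVM, so the cost of the second pass grows with the size of the uncertain region.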

20.
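Support vector machines (SVMs) are applied to hyperspectral remote sensing image classification. The decision boundary feature extraction (DBFE) algorithm reduces the dimensionality of the hyperspectral image, and a radial basis function (RBF) serves as the SVM kernel. Chaotic optimization search is introduced into the PSO algorithm: with basic PSO as the main flow, a chaotic optimization search of a given number of steps is run on the best particle in the population, which remedies basic PSO's slow convergence in late evolution and its tendency to become trapped in local minima. The improved hybrid particle swarm optimization (PSO) algorithm automatically selects the SVM model parameters, yielding a parameter-optimized PSO-SVM multi-class classification model. Classification experiments were carried out on a 220-band AVIRIS hyperspectral remote sensing image. The results show that, compared with a traditional SVM using a leave-one-out (LOO) grid search strategy, the improved PSO-SVM algorithm raises classification accuracy by about 8.8%. The method is very effective for classifying remote sensing image data under small-sample, class-imbalanced conditions.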
