Similar literature
 20 similar documents found; search time 109 ms
1.
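Multi-class image classification based on active learning and semi-supervised learning   Total citations: 5, self-citations: 0, citations by others: 5
陈荣, 曹永锋, 孙洪. 《自动化学报》 (Acta Automatica Sinica), 2011, 37(8): 954-962
Most image classification algorithms need a large number of training samples to train the classifier model. In practice, labeling large numbers of samples is tedious and time-consuming. For some special images, such as synthetic aperture radar (SAR) images, interpreting the content is very difficult, so the number of labeled samples that can be obtained is very limited. This paper introduces best vs second-best (BvSB) active learning and constrained self-training (CST) into an image classification algorithm based on a support vector machine (SVM) classifier, and proposes a new image classification method. BvSB active learning mines the samples that are most valuable to the current classifier model for manual labeling, and CST semi-supervised learning further exploits the large number of unlabeled samples in the sample set, so that good classification performance is obtained at a small labeling cost. The new method is compared with random sample selection, entropy-based uncertainty sampling active learning, and BvSB active learning. Experimental results on three optical image sets and one SAR image set show that the new method effectively reduces the number of manually labeled samples needed to train the classifier, and achieves high accuracy and good robustness.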
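The BvSB selection step mentioned in this abstract can be illustrated with a minimal sketch. It assumes that per-class probability estimates are already available for each unlabeled sample (for example from an SVM with probability outputs); the function names and the toy probabilities are hypothetical and are not taken from the paper.

```python
def bvsb_margin(probs):
    """Difference between the largest and second-largest class probability."""
    best, second = sorted(probs, reverse=True)[:2]
    return best - second


def select_queries(prob_rows, k):
    """Indices of the k samples with the smallest BvSB margin (most ambiguous)."""
    order = sorted(range(len(prob_rows)), key=lambda i: bvsb_margin(prob_rows[i]))
    return order[:k]


# Hypothetical class-probability estimates for three unlabeled samples.
probs = [
    [0.90, 0.05, 0.05],  # confident prediction, large margin
    [0.40, 0.35, 0.25],  # top two classes nearly tied, small margin
    [0.50, 0.30, 0.20],  # moderate margin
]
queries = select_queries(probs, 2)  # samples to send for manual labeling
```

Samples with a small margin are the ones the current classifier can barely separate between its two most likely classes, which is why they are queried first.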

2.
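Research on semi-supervised sentiment classification based on ensemble learning   Total citations: 1, self-citations: 0, citations by others: 1
Sentiment classification is the task of classifying the sentiment polarity expressed in a text. This paper studies sentiment classification based on semi-supervised learning, that is, using unlabeled samples to improve sentiment classification performance when only a small set of labeled samples is available. To improve semi-supervised learning, the paper proposes an ensemble method based on consistent labels that fuses two mainstream semi-supervised sentiment classification methods: co-training based on random feature subspaces, and label propagation. First, the classifiers trained by these two semi-supervised methods label the unlabeled samples; second, the unlabeled samples on which the labels agree are selected; finally, the selected samples are used to update the training model. Experimental results show that the method effectively reduces the mislabeling rate on unlabeled samples and thus achieves better classification than either semi-supervised method alone.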
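The agreement-based selection step in this abstract can be sketched as follows. The sketch assumes that the two semi-supervised classifiers have already produced one label per unlabeled text; the function name, the texts, and the label lists are hypothetical.

```python
def select_consistent(samples, labels_a, labels_b):
    """Keep (sample, label) pairs on which both classifiers assign the same label."""
    return [(s, a) for s, a, b in zip(samples, labels_a, labels_b) if a == b]


texts = ["great film", "dull plot", "fine acting", "not bad"]
# Hypothetical outputs of the two semi-supervised classifiers.
co_training_labels = ["pos", "neg", "pos", "neg"]
label_propagation_labels = ["pos", "neg", "neg", "neg"]

# Only the agreed samples would be added to the training set.
agreed = select_consistent(texts, co_training_labels, label_propagation_labels)
```

The third text is dropped because the two classifiers disagree on it, which is the mechanism the abstract credits with lowering the mislabeling rate.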

3.
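This paper proposes an algorithm based on semi-supervised active learning to address the difficulty of obtaining large class-labeled sample data sets when building dynamic Bayesian network (DBN) classification models. Semi-supervised learning can make effective use of unlabeled sample data to learn a DBN classification model, but it tends to introduce incorrect sample classification information during iteration, which affects the accuracy of the model. Borrowing active learning within semi-supervised learning makes it possible to autonomously select useful unlabeled samples and ask the user to label them. After these samples are added to the training set, the accuracy of semi-supervised learning in classifying unlabeled samples is improved as much as possible. Experimental results show that the algorithm significantly improves the efficiency and performance of the DBN learner and converges quickly to the preset classification accuracy.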

4.
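To address the difficulty of obtaining class-labeled sample data sets when building dynamic Bayesian network (DBN) classification models, a semi-supervised active DBN learning algorithm based on EM and classification loss is proposed. The EM algorithm in semi-supervised learning can make effective use of unlabeled sample data to learn a DBN classification model, but incorrect sample classification information is easily introduced during iteration and affects the accuracy of the model. Bringing classification-loss-based active learning into EM learning makes it possible to autonomously select useful unlabeled samples and ask the user to label them; once these samples are added to the training set, the model's uncertainty in classifying unlabeled samples is reduced as much as possible. Experiments show that the algorithm significantly improves the efficiency and performance of the DBN learner and converges quickly to the preset classification accuracy.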

5.
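In semi-supervised learning, when unlabeled samples and labeled samples follow different distributions, the classifier drifts away from the topic of the target data and its accuracy drops. This paper uses transfer learning to propose a classification model called TranCo-Training. In each iteration, the transfer ability of each unlabeled sample is computed from the classification consistency between that sample and its neighboring labeled samples, and instances are transferred from the auxiliary data set to the target data set according to transfer ability. Theoretical analysis shows that the transfer ability of an auxiliary sample is inversely proportional to its training error loss, and that the method minimizes training error loss and avoids negative transfer, thereby solving the topic drift problem in semi-supervised learning. Experiments show that TranCo-Training outperforms the RdCo-Training algorithm, which selects unlabeled samples at random, especially when given a small number of labeled target samples and a large number of auxiliary unlabeled samples.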
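The abstract does not give the formula for transfer ability, so the following is only one plausible reading, stated as an assumption: the score of an unlabeled sample is the fraction of its k nearest labeled neighbours whose label matches the label predicted for that sample. The function name, the Euclidean distance, and the toy data are all hypothetical.

```python
import math


def transfer_ability(point, predicted_label, labeled, k=3):
    """Fraction of the k nearest labeled neighbours that share predicted_label.

    This is an assumed stand-in for neighbour consistency, not the paper's formula.
    """
    neighbours = sorted(labeled, key=lambda item: math.dist(point, item[0]))[:k]
    matches = sum(1 for _, label in neighbours if label == predicted_label)
    return matches / len(neighbours)


# Hypothetical labeled target samples: (feature vector, class label).
labeled = [((0.0, 0.0), "a"), ((0.1, 0.2), "a"), ((1.0, 1.0), "b"), ((0.9, 1.1), "b")]

# Two auxiliary samples, both predicted as class "a".
score_near_a = transfer_ability((0.05, 0.1), "a", labeled, k=3)
score_near_b = transfer_ability((0.95, 1.0), "a", labeled, k=3)
```

Under this reading, the sample that sits among class "a" neighbours receives the higher score and would be transferred first, while the sample surrounded by class "b" neighbours is held back.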

6.
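To overcome the dependence of traditional fully supervised machine learning models on large numbers of labeled samples, an algorithm combining semi-supervised learning and active learning is presented. The active learning selection strategy chooses the most valuable sentences to label, and semi-supervised learning is used to make full use of the unlabeled sentences. The selection strategy is improved to suit the characteristics of Chinese corpora. Experimental results show that, compared with random selection of samples to label, the algorithm raises the learner's F-score by 10.2% with the same number of training samples, and reduces the manual labeling effort by 32% when the classifier reaches the same performance, effectively lowering the learner's demand for labeled samples.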

7.
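Objective: In the multi-label supervised learning framework, building a classifier with strong generalization performance requires a large number of labeled training samples, yet in practice labeled samples are scarce and very expensive to obtain. To address the shortage of labeled samples and the low efficiency of classifier re-learning in multi-label image classification, an online multi-label image classification algorithm combined with active learning is proposed. Method: Based on min-max theory, a selection strategy that queries the most representative and most informative samples actively chooses the samples to be labeled, and the multi-label image classifier is updated online based on the Karush-Kuhn-Tucker (KKT) conditions. Results: The algorithm is evaluated on four public data sets using four multi-label classification metrics. The experimental results show that the proposed sample selection method has a clear advantage over both random sample selection and margin-based sampling; when the classifier reaches the same or similar classification accuracy, the proposed strategy selects clearly fewer samples for labeling than random selection and margin-based sampling need to query. Conclusion: The algorithm reduces the manual labeling cost of obtaining labeled samples, and it also avoids the low learning efficiency of traditional classifier retraining on all data, so that the classifier can be updated in real time when new data arrive.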

8.
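A semi-supervised support vector machine (SVM) classification algorithm based on a Gaussian mixture model kernel is proposed. Constructing a Gaussian mixture model kernel SVM classifier supplies information from unlabeled samples, so that the SVM algorithm, while learning from labeled samples, also takes into account the cluster assumption over the whole training sample set. In the experiments, the algorithm is compared in classification performance with the traditional SVM, the transductive support vector machine (TSVM), and the random walk (RW) semi-supervised algorithm. The results show that the algorithm improves classification performance even when trained with few labeled samples, and that it is highly robust.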

9.
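梁爽, 孙正兴. 《软件学报》 (Journal of Software), 2009, 20(5): 1301-1312
To address three difficulties in relevance feedback for sketch retrieval, namely small-sample training, data asymmetry, and real-time requirements, a small-sample incremental biased learning algorithm is proposed. The algorithm combines active learning, biased classification, and incremental learning to model the small-sample biased learning problem in relevance feedback. Active learning uses uncertainty sampling to select the best samples for the user to label, maximizing the classifier's generalization ability under limited training samples; biased classification constructs a hypersphere to treat positive and negative examples differently and accurately identify the user's target category; the samples newly added in each feedback loop are used for incremental learning of the classifier, which reduces training time while accumulating sample information, further alleviating the small-sample problem. Experimental results show that the algorithm effectively improves sketch retrieval performance and is also applicable to image retrieval, 3D model retrieval, and other application areas.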

10.
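Supervised learning algorithms in machine learning need labeled samples to train a classification model, and collecting and classifying training samples costs a great deal of manpower, resources, and time. How to perform image classification efficiently has therefore long been a research focus in the field. An image classification algorithm based on Hough forests and semi-supervised learning is proposed; it can train a classifier with few samples and continually acquire new training samples during classification. Part of the training results are manually labeled, and the method effectively improves labeling efficiency. The algorithm is validated experimentally on COREL data; the results show that it can use a small number of training samples to obtain satisfactory labeling precision and improve the efficiency of manual work.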

11.
In this paper, a novel spectral-spatial hyperspectral image classification method is proposed, built on a hierarchical subspace switch ensemble learning algorithm. First, the hyperspectral images are processed by fast bilateral filtering to obtain spatial features. The spectral and spatial features are combined to form the initial feature set. Second, hierarchical instance learning based on an iterative means clustering method is designed to obtain a hierarchical instance space. Third, the random subspace method (RSM) is used to sample the features and samples, forming multiple sub-sample sets. After that, semi-supervised learning (S2L) is applied to choose test samples for improving classification performance without touching the class labels. Then, micro noise linear dimension reduction (mNLDR) is used for dimension reduction. Afterwards, ensemble multiple kernels SVM (EMK_SVM) is used to obtain stable classification results. Finally, the final classification results are obtained by combining the individual classification results with a voting strategy. Experimental results on real hyperspectral scenes demonstrate that the proposed method effectively improves classification performance.

12.
Software defect prediction can help us better understand and control software quality. Current defect prediction techniques are mainly based on a sufficient amount of historical project data. However, historical data is often not available for new projects and for many organizations. In this case, effective defect prediction is difficult to achieve. To address this problem, we propose sample-based methods for software defect prediction. For a large software system, we can select and test a small percentage of modules, and then build a defect prediction model to predict the defect-proneness of the rest of the modules. In this paper, we describe three methods for selecting a sample: random sampling with conventional machine learners, random sampling with a semi-supervised learner, and active sampling with an active semi-supervised learner. To facilitate active sampling, we propose a novel active semi-supervised learning method, ACoForest, which is able to sample the modules that are most helpful for learning a good prediction model. Our experiments on PROMISE datasets show that the proposed methods are effective and have the potential to be applied in industrial practice.

13.

In this paper, we propose a novel method for face recognition, called the tensor-based random subspace method (Tensor-RS). The traditional random subspace method (RSM) treats each pixel (or feature) of the face image as a sampling unit and thus ignores the spatial information within the face image. In contrast, the proposed Tensor-RS regards each small image region as a sampling unit and obtains the spatial information within small image regions by reshaping the image and applying a tensor-based feature extraction method. More specifically, an original whole face image is first partitioned into sub-images to improve robustness to facial variations, and then each sub-image is reshaped into a new matrix in which each row corresponds to a vectorized small sub-image region. After that, based on these rearranged matrices, an incomplete random sampling by row vectors rather than by features (or feature projections) is applied. Finally, a tensor subspace method, which can effectively extract the spatial information within the same row (or column) vector, is used to extract useful features. Extensive experiments on four standard face databases (AR, Yale, Extended Yale B and CMU PIE) demonstrate that the proposed Tensor-RS method significantly outperforms state-of-the-art methods.
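The reshaping and row-sampling steps described in this abstract can be sketched in plain Python. The sketch assumes non-overlapping rectangular regions and a uniform random choice of rows; the function names, region size, and toy image are hypothetical, and the tensor subspace feature extraction that follows in the paper is not shown.

```python
import random


def regions_to_rows(image, region_h, region_w):
    """Reshape a 2-D image (list of lists) into one row per vectorized region."""
    height, width = len(image), len(image[0])
    rows = []
    for top in range(0, height - region_h + 1, region_h):
        for left in range(0, width - region_w + 1, region_w):
            rows.append([image[r][c]
                         for r in range(top, top + region_h)
                         for c in range(left, left + region_w)])
    return rows


def sample_rows(matrix, n_rows, seed=0):
    """Randomly keep n_rows whole rows, so each region is sampled as a unit."""
    rng = random.Random(seed)
    picked = sorted(rng.sample(range(len(matrix)), n_rows))
    return [matrix[i] for i in picked]


# Toy 4x4 "sub-image" with pixel values 0..15, split into 2x2 regions.
image = [[4 * r + c for c in range(4)] for r in range(4)]
rows = regions_to_rows(image, 2, 2)
subset = sample_rows(rows, 2)
```

Because whole rows are drawn, the pixels of a region always stay together, which is the difference from pixel-level random subspace sampling that the abstract emphasizes.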


14.
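Relation extraction is an important task in information extraction. When processing text in question-answer pair form, in addition to extracting relations between entities in the text, the question pattern, which links the question sentence to the answer sentence, also needs to be extracted. Combining a supervised labeling algorithm (conditional random fields) with a semi-supervised algorithm based on bootstrapping of template tuples performs well when extracting relations between entities. However, the way traditional semi-supervised methods discover sentence templates is hard to transfer to question pattern extraction. This paper therefore proposes a model that combines sentence2vec with a semi-supervised algorithm. For the final experiment, random sampling is used for validation. The experimental results show that, compared with the traditional semi-supervised algorithm, the proposed method achieves higher precision and recall.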

15.
Image classification is one of the important techniques in computer vision. Because labeled samples are scarce in hyperspectral images, semi-supervised learning (SSL) methods have been widely applied to hyperspectral image classification. Graph-based semi-supervised learning provides an effective way to model data in classification problems, and graph construction is its critical step. In this paper we employ graphs constructed with a typical manifold learning method, locally linear embedding (LLE), and then conduct semi-supervised classification on them. To exploit the valuable spatial information contained in hyperspectral images, discriminative spatial information (DSI) is then extracted. The proposed classification method is evaluated on three real hyperspectral data sets and shows state-of-the-art performance when compared with different classification methods.

16.
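Objective: For small-sample classification of polarimetric synthetic aperture radar (PolSAR) data, and with the aim of fully exploiting the polarimetric and spatial features of limited samples, a multi-branch classification network model guided by a higher-order conditional random field (CRF) is proposed. Method: The Yamaguchi incoherent target decomposition is used to build a polarimetric feature vector for each pixel. A multi-branch convolutional feature extraction network guided by the higher-order CRF energy function is designed; it takes the pixel polarimetric feature vectors as input and extracts pixel features, neighborhood features, and position features for each pixel. These features are fused by summation and fed into a softmax classifier to obtain a preliminary classification. A superpixel method then further corrects and refines the preliminary classification map, smoothing the specificity and similarity between adjacent pixels. Results: Two sets of real polarimetric SAR data are tested at a sampling rate of 1%. To better simulate the uneven spatial distribution of training samples in practical applications, spatially disjoint sampling is also considered as a comparison experiment. The results under both sampling strategies show that, compared with methods that use only pixel-level features or use spatial features in a simple way, the proposed method improves overall classification accuracy by 7% to 10% on average, the classification accuracy for every land-cover class is above 90%, and the running speed improves by a factor of more than 2.5 relative to a support vector machine (SVM). Conclusion: By building a convolutional neural network guided by a higher-order CRF, pixel feature information, homogeneous region features, and geographic position information are fused. This effectively establishes a scale association between pixel-level and object-level data, further extends the spatial dependence between pixels, extracts stronger and more accurate representational features, significantly improves the classification performance of the convolutional network model when few labeled samples are available, and further ensures that the representation of the scattering mechanisms of land-cover targets is comprehensive and reliable.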

17.
This paper presents a system for weed mapping, using imagery provided by unmanned aerial vehicles (UAVs). Weed control in precision agriculture is based on the design of site-specific control treatments according to weed coverage. A key component is precise and timely weed maps, and one of the crucial steps is weed monitoring, by ground sampling or remote detection. Traditional remote platforms, such as piloted planes and satellites, are not suitable for early weed mapping, given their low spatial and temporal resolutions. Nonetheless, the ultra-high spatial resolution provided by UAVs can be an efficient alternative. The proposed method for weed mapping partitions the image and complements the spectral information with other sources of information. Apart from the well-known vegetation indexes, which are commonly used in precision agriculture, a method for crop row detection is proposed. Given that crops are always organised in rows, this kind of information simplifies the separation between weeds and crops. Finally, the system incorporates classification techniques for the characterisation of pixels as crop, soil and weed. Different machine learning paradigms are compared to identify the best performing strategies, including unsupervised, semi-supervised and supervised techniques. The experiments study the effect of the flight altitude and the sensor used. Our results show that excellent performance is obtained using very few labelled data complemented with unlabelled data (semi-supervised approach), which motivates the use of weed maps to design site-specific weed control strategies just when farmers implement the early post-emergence weed control.

18.
Block compressed sensing (BCS) has great potential in image compression applications because of its low storage requirement and low computational complexity. However, the sampling efficiency of traditional BCS is very poor because some blocks are not sparse enough for compressed sensing (CS) to apply. To improve the sampling efficiency, a novel BCS with random permutation and reweighted sampling (BCS-RP-RS) for image compression applications is proposed. In the proposed method, two effective strategies, random permutation and reweighted sampling, are used simultaneously to guarantee that all blocks of the image signal are sparse enough for CS to apply. As a result, better sampling efficiency can be achieved. Simulation results show that the proposed approach significantly improves the peak signal-to-noise ratio (PSNR) of the reconstructed images compared with the conventional BCS with random permutation (BCS-RP) approach.
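The random-permutation part of the sampling stage can be sketched as below. This is a rough illustration under stated assumptions, not the BCS-RP-RS method itself: the signal is treated as a flat list of coefficients, each block is measured with an independent Gaussian matrix, and the reweighted sampling and the reconstruction steps are omitted. All names and sizes are hypothetical.

```python
import random


def permute_and_block(signal, block_size, seed=0):
    """Shuffle the coefficients, then cut the shuffled signal into equal blocks."""
    rng = random.Random(seed)
    perm = list(range(len(signal)))
    rng.shuffle(perm)
    shuffled = [signal[i] for i in perm]
    blocks = [shuffled[i:i + block_size] for i in range(0, len(shuffled), block_size)]
    return perm, blocks


def measure(block, n_measurements, rng):
    """Project one block onto n_measurements random Gaussian vectors."""
    phi = [[rng.gauss(0.0, 1.0) for _ in block] for _ in range(n_measurements)]
    return [sum(p * x for p, x in zip(row, block)) for row in phi]


# Toy signal whose non-zero coefficients are all concentrated at the start.
signal = [5.0, 3.0, 4.0, 2.0] + [0.0] * 12
perm, blocks = permute_and_block(signal, block_size=4)

rng = random.Random(1)
measurements = [measure(block, 2, rng) for block in blocks]
```

The permutation `perm` has to be kept (or regenerated from the seed) so that a decoder can undo the shuffle after reconstructing each block.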

19.
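王朔琛, 汪西莉. 《计算机应用》 (Journal of Computer Applications), 2015, 35(10): 2974-2979
When constructing the cluster kernel, semi-supervised composite-kernel support vector machines commonly suffer from high complexity and are unsuitable for large-scale image classification; in addition, the parameters of K-means image clustering are hard to estimate. To address these problems, a semi-supervised composite-kernel SVM image classification method based on mean shift (Mean-Shift) with adaptive parameters is proposed. Mean-Shift is used to cluster the pixels, avoiding the limitations of K-means image clustering; the algorithm parameters are adapted using the structural features of the image, avoiding fluctuations in the algorithm; and a Mean Map cluster kernel is built from the Mean-Shift result to raise the likelihood that samples in the same cluster belong to the same class, so that the composite kernel better guides the SVM in classifying the image. Experiments confirm that the improved clustering algorithm and parameter selection method capture the clustering information of the image better; in general they raise classification accuracy on ordinary and noisy images by 1 to 7 percentage points over the compared semi-supervised algorithms, and the method is also applicable to some extent to larger images, classifying images more efficiently and more stably.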

20.
Locality preserving projection (LPP) is a popular unsupervised feature extraction (FE) method. In this paper, the spatial-spectral LPP (SSLPP) method is proposed, which uses both the spectral and spatial information of a hyperspectral image (HSI) for FE. The proposed method consists of two parts. In the first part, unlabelled samples are selected in a spatially homogeneous neighbourhood from the filtered HSI. In the second part, the transformation matrix is calculated by an LPP-based method using the spectral and spatial information of the selected unlabelled samples. Experimental results on the Indian Pines (IP), Kennedy Space Center (KSC), and Pavia University (PU) datasets show that the performance of SSLPP is superior to spectral unsupervised, supervised, and semi-supervised FE methods in both small and large sample size situations. Moreover, the proposed method outperforms other spatial-spectral semi-supervised FE methods on the PU dataset, which has high spatial resolution. For the IP and KSC datasets, spectral regularized local discriminant embedding (SSRLDE) has the best performance by using the spectral and spatial information of labelled and unlabelled samples, and SSLPP is ranked just behind it. Experiments show that SSLPP is an efficient unsupervised FE method that does not use training samples, which are difficult, costly, and sometimes impractical to prepare. SSLPP results are much better than those of LPP. It also decreases storage and calculation costs by using fewer unlabelled samples.
