Similar literature
 20 similar documents found (search time: 22 ms)
1.
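This paper proposes an algorithm based on semi-supervised active learning to address the difficulty of obtaining large class-labeled sample data sets when building dynamic Bayesian network (DBN) classification models. Semi-supervised learning can make effective use of unlabeled samples to learn a DBN classification model, but incorrectly classified samples are easily added during the iterations, which degrades the accuracy of the model. Borrowing active learning into semi-supervised learning makes it possible to autonomously select useful unlabeled samples and ask the user to label them. Once these samples are added to the training set, they maximally improve the accuracy with which semi-supervised learning classifies unlabeled samples. Experimental results show that the algorithm significantly improves the efficiency and performance of the DBN learner and converges quickly to the predetermined classification accuracy.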
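
The sample-selection step described above can be sketched as follows. The abstract does not state which selection criterion is used, so this sketch assumes plain entropy-based uncertainty sampling; the names `entropy` and `select_queries` and the example posteriors are hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_queries(posteriors, k):
    """Return the indices of the k unlabeled samples with the highest entropy.

    posteriors[i] is the class distribution that the current model predicts
    for unlabeled sample i (assumed to be given, e.g. by the classifier).
    """
    ranked = sorted(
        range(len(posteriors)),
        key=lambda i: entropy(posteriors[i]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical predictions for three unlabeled samples.
posteriors = [
    [0.98, 0.02],  # confident prediction
    [0.55, 0.45],  # most uncertain
    [0.70, 0.30],
]
queries = select_queries(posteriors, k=1)
```

The selected indices would be sent to the user for labeling and then moved into the training set before the next semi-supervised iteration.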

2.
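To address the difficulty of obtaining class-labeled sample data sets when building dynamic Bayesian network (DBN) classification models, a semi-supervised active DBN learning algorithm based on EM and classification loss is proposed. The EM algorithm in semi-supervised learning can make effective use of unlabeled samples to learn a DBN classification model, but incorrectly classified samples are easily added during the iterations, which degrades the accuracy of the model. Incorporating active learning based on classification loss into EM learning makes it possible to autonomously select useful unlabeled samples and ask the user to label them; once these samples are added to the training set, they maximally reduce the model's uncertainty in classifying unlabeled samples. Experiments show that the algorithm significantly improves the efficiency and performance of the DBN learner and converges quickly to the predetermined classification accuracy.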

3.
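Convolutional neural networks (CNNs) have achieved good results in semi-supervised learning, where the training phase uses both labeled and unlabeled samples to help regularize the learned model. To further strengthen the feature learning ability of semi-supervised models and improve their performance in image classification, this paper proposes an end-to-end semi-supervised learning method that combines a deep semi-supervised convolutional neural network with dictionary learning, called Semi-supervised Learning based on Sparse Coding and Convolution (SSSConv). The framework aims to learn more discriminative image feature representations. SSSConv first extracts features with a CNN and applies an orthogonal projection to the extracted features; it then learns a low-dimensional embedding of their sparse codes to obtain the image feature representation, and finally classifies on that basis. The whole framework can be trained end to end in a semi-supervised manner, and the CNN feature extraction part and the sparse coding dictionary learning part share a unified loss function and a consistent objective. The parameters of the objective function are optimized using conjugate gradient descent, the chain rule, and backpropagation. The sparse coding parameters are constrained to a manifold, while the CNN parameters can be defined either in Euclidean space or, further, in an orthogonal space. Experimental results on semi-supervised classification tasks verify the effectiveness of the proposed SSSConv framework, which is highly competitive with existing methods.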

4.
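To address network intrusion detection on massive, multi-source, heterogeneous data with imbalanced distributions, and the inability of traditional deep learning algorithms to update their output weights online according to real-time intrusion conditions, an intrusion detection algorithm based on deep sequential weighted kernel extreme learning (the DBN-WOS-KELM algorithm) is proposed. The algorithm first uses a deep belief network (DBN) to learn from historical data, performing feature extraction and dimensionality reduction on the raw data, and then uses a weighted sequential kernel extreme learning machine for supervised learning to identify intrusions. It thus combines the ability of the deep belief network to extract abstract features with the fast learning ability of the kernel extreme learning machine. Simulation experiments on part of the KDD99 data set show that the DBN-WOS-KELM algorithm improves the recognition rate for attacks with few samples, can update the output weights online according to actual conditions, and trains more efficiently.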

5.
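Unknown Android malware can currently be detected with machine learning algorithms, but traditional machine learning algorithms have fewer than three layers of computational units and cannot fully mine the deep representations of Android application features. This paper is the first to propose a deep learning based algorithm, DDBN (Data-flow Deep Belief Network), which analyzes the data-flow features of Android applications in order to detect unknown Android malware. First, the analysis tools FlowDroid and SUSI are used to extract static data-flow features that reflect the malicious behavior of Android applications. Then, the data-flow deep learning algorithm DDBN is designed for these features: by building a deep model structure and performing layer-by-layer feature transformation, it maps the representation of the data flows from the original space into a new feature space, which makes classification more accurate. Finally, the Android malware detection tool Flowdect is implemented on top of DDBN and used to examine a large number of real-world benign and malicious applications. Experimental results show that Flowdect can fully learn the data-flow features of Android applications and can be used to detect unknown Android malware. Compared with detection schemes based on traditional machine learning algorithms, the DDBN algorithm achieves better detection results.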

6.
In this paper we study the statistical properties of semi-supervised learning, which is considered an important problem in the field of machine learning. In standard supervised learning only labeled data is observed, and classification and regression problems are formalized as supervised learning. In semi-supervised learning, on the other hand, unlabeled data is obtained in addition to labeled data. Hence, the ability to exploit unlabeled data is important for improving prediction accuracy in semi-supervised learning. This problem is regarded as a semiparametric estimation problem with missing data. Under discriminative probabilistic models, unlabeled data was long considered useless for improving estimation accuracy. Recently, a weighted estimator using unlabeled data has been shown to achieve better prediction accuracy than learning methods using only labeled data, especially when the discriminative probabilistic model is misspecified. That is, improvement under the semiparametric model with missing data is possible when the semiparametric model is misspecified. In this paper, we apply the density-ratio estimator to obtain the weight function in semi-supervised learning. Our approach is advantageous because the proposed estimator does not require well-specified probabilistic models for the probability of the unlabeled data. Based on statistical asymptotic theory, we prove that the estimation accuracy of our method outperforms supervised learning using only labeled data. Numerical experiments demonstrate the usefulness of our methods.

7.
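Research on Semi-Supervised Sentiment Classification Based on Ensemble Learning   Total citations: 1, self-citations: 0, citations by others: 1
Sentiment classification is the task of classifying the sentiment category expressed by a text. This paper studies sentiment classification based on semi-supervised learning, that is, using unlabeled samples to improve sentiment classification performance when only a small number of labeled samples is available. To improve semi-supervised learning ability, the paper proposes an ensemble method based on consistent labels that fuses two mainstream semi-supervised sentiment classification methods: co-training based on random feature subspaces, and label propagation. First, the classifiers trained by these two semi-supervised methods are used to label the unlabeled samples; second, the unlabeled samples on which the two labelings agree are selected; finally, the selected samples are used to update the training model. Experimental results show that the method effectively reduces the mislabeling rate on unlabeled samples and thereby achieves better classification than either semi-supervised method alone.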
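
The agreement step of the three-stage procedure above can be sketched as follows. This is a minimal illustration that assumes the two classifiers' predictions are already available as lists; the function name `select_consistent` and the example labels are hypothetical, and the co-training and label propagation models themselves are not shown.

```python
def select_consistent(labels_a, labels_b):
    """Return (index, label) pairs for samples on which both labelings agree."""
    return [
        (i, a)
        for i, (a, b) in enumerate(zip(labels_a, labels_b))
        if a == b
    ]


# Hypothetical predictions of the two semi-supervised classifiers
# on four unlabeled reviews.
co_training_labels = ["pos", "neg", "pos", "neg"]
label_propagation_labels = ["pos", "pos", "pos", "neg"]

agreed = select_consistent(co_training_labels, label_propagation_labels)
```

Only the agreed samples (here indices 0, 2, and 3) would be added to the training set; sample 1, on which the classifiers disagree, stays unlabeled.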

8.
Recently, deep learning methodologies have become popular for analysing physiological signals in multiple modalities via hierarchical architectures for human emotion recognition. Most state-of-the-art approaches to human emotion recognition use deep learning for emotion classification. However, deep learning is mostly effective for deep feature extraction. Therefore, in this research, we applied an unsupervised deep belief network (DBN) for deep feature extraction from fused observations of Electro-Dermal Activity (EDA), Photoplethysmogram (PPG) and Zygomaticus Electromyography (zEMG) sensor signals. Afterwards, the DBN-produced features are combined with statistical features of EDA, PPG and zEMG to prepare a feature-fusion vector. The prepared feature vector is then used to classify five basic emotions, namely Happy, Relaxed, Disgust, Sad and Neutral. As the emotion classes are not linearly separable from the feature-fusion vector, the Fine Gaussian Support Vector Machine (FGSVM) is used with a radial basis function kernel for non-linear classification of human emotions. Our experiments on a public multimodal physiological signal dataset show that the DBN- and FGSVM-based model significantly increases emotion recognition accuracy compared with existing state-of-the-art emotion classification techniques.

9.
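A semi-supervised clustering algorithm based on an active learning strategy that selects and labels the most informative data is proposed. First, the traditional K-means clustering algorithm is used to coarsely cluster the data set. Second, the degree of membership of each data point in each cluster is computed from the coarse clustering result; candidate points are those for which the difference between the largest and second-largest membership is below a threshold, and among them the points with the smaller differences are selected as the most informative data and labeled. Finally, the unlabeled data in the candidate set are assigned to the cluster whose labeled data have the smallest average distance to them. Experiments show that the proposed active learning strategy learns the most informative data well, and that the semi-supervised clustering algorithm based on it achieves high accuracy on different data sets.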
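
The candidate-filtering step above can be sketched as follows. The abstract does not say how membership is computed, so this sketch assumes a normalized inverse-distance membership to each K-means centroid; `memberships`, `margin`, `most_informative`, and the example distances are all hypothetical.

```python
def memberships(distances):
    """Assumed membership: normalized inverse distance to each centroid."""
    inverse = [1.0 / (d + 1e-12) for d in distances]
    total = sum(inverse)
    return [v / total for v in inverse]


def margin(membership):
    """Difference between the largest and second-largest membership."""
    top, second = sorted(membership, reverse=True)[:2]
    return top - second


def most_informative(distance_rows, threshold, k):
    """Indices of up to k points whose margin is below the threshold,
    ordered from the smallest margin (most ambiguous) upward."""
    scored = [(margin(memberships(row)), i) for i, row in enumerate(distance_rows)]
    candidates = sorted((m, i) for m, i in scored if m < threshold)
    return [i for _, i in candidates[:k]]


# Hypothetical distances of three points to two coarse-cluster centroids.
distance_rows = [
    [1.0, 9.0],  # clearly in cluster 0
    [2.0, 2.5],  # ambiguous
    [3.0, 3.0],  # equidistant, most ambiguous
]
to_label = most_informative(distance_rows, threshold=0.2, k=2)
```

Points near a cluster boundary have a small margin and are queried first, while points that sit clearly inside one cluster are never sent for labeling.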

10.
This paper presents a method for designing semi-supervised classifiers trained on labeled and unlabeled samples. We focus on probabilistic semi-supervised classifier design for multi-class and single-labeled classification problems, and propose a hybrid approach that takes advantage of generative and discriminative approaches. In our approach, we first consider a generative model trained by using labeled samples and introduce a bias correction model, where these models belong to the same model family, but have different parameters. Then, we construct a hybrid classifier by combining these models based on the maximum entropy principle. To enable us to apply our hybrid approach to text classification problems, we employed naive Bayes models as the generative and bias correction models. Our experimental results for four text data sets confirmed that the generalization ability of our hybrid classifier was much improved by using a large number of unlabeled samples for training when there were too few labeled samples to obtain good performance. We also confirmed that our hybrid approach significantly outperformed generative and discriminative approaches when the performance of the generative and discriminative approaches was comparable. Moreover, we examined the performance of our hybrid classifier when the labeled and unlabeled data distributions were different.

11.
Image classification is one of the important techniques in computer vision. Due to the limited availability of labeled samples in hyperspectral images, semi-supervised learning (SSL) methods have been widely applied in hyperspectral image classification. Graph-based semi-supervised learning provides an effective solution to model data in classification problems, of which graph construction is the critical step. In this paper we employ graphs constructed with a typical manifold learning method, locally linear embedding (LLE), based on which semi-supervised classification is then conducted. To exploit the valuable spatial information contained in hyperspectral images, discriminative spatial information (DSI) is then extracted. The proposed classification method is evaluated using three real hyperspectral data sets, revealing state-of-the-art performance when compared with different classification methods.

12.
To make effective use of large numbers of unlabeled images for classification, an image classification method based on semi-supervised learning is proposed. The method bridges a small number of labeled images and a large number of unlabeled images through shared latent topics, and uses the must-link and cannot-link constraints of the labeled images to improve the classification accuracy on unlabeled images. Experimental results show that the method improves classification accuracy by about 10% on the Caltech-101 data set and a 7-class image set. In addition, because most current semi-supervised image classification methods lack incremental learning ability, an incremental learning model of the method is proposed. Experimental results show that the incremental learning model improves computational efficiency by nearly 90% compared with the non-incremental learning model.
Keywords: semi-supervised learning, image classification, incremental learning
CLC number: TP391.41
Incremental Image Classification Method Based on Semi-Supervised Learning
LIANG Peng 1,2, LI Shao-Fa 2, QIN Jiang-Wei 2, LUO Jian-Gao 3
1 (School of Computer Science and Engineering, Guangdong Polytechnic Normal University, Guangzhou 510665)
2 (School of Computer Science and Engineering, South China University of Technology, Guangzhou 510006)
3 (Department of Computer, Guangdong AIB Polytechnic College, Guangzhou 510507)

13.
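Consistency-based semi-supervised learning methods usually rely on simple data augmentation to enforce consistent predictions on the original input and the perturbed input. When the proportion of labeled data is low, the effectiveness of this approach is hard to guarantee. Extending some of the advanced data augmentation methods from supervised learning to the semi-supervised setting is one way to address this problem. Building on the consistency-based semi-supervised learning method MixMatch, a semi-supervised learn... based on mixed-sample automatic data augmentation is proposed...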

14.
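The state of research on urban logistics efficiency analysis is studied in depth. Drawing on deep learning theory, a continuous deep belief network (DBN) with three hidden layers is constructed for the specific problem, the network knowledge set is defined, an adaptive DBN algorithm is proposed, and the convergence of the algorithm is analyzed. The pattern classification ability of the network and algorithm is verified on the Iris and Wine data sets, where the classification accuracy is higher than that of a deep belief network with two hidden layers and of a deep error backpropagation network. Based on the logistics characteristics of cities along the New Silk Road Economic Belt, and with logistics efficiency as the evaluation target, 13 indicators in 4 dimensions are selected to build an evaluation index system. Taking 20 core node cities as the research objects, cluster analysis is carried out with the adaptive DBN algorithm and with social network analysis (SNA); the results show that the adaptive DBN algorithm is comparatively more reasonable and effective. The results lay a research foundation for determining logistics development strategies for cities along the New Silk Road Economic Belt and for promoting the future cooperation and development of the domestic logistics industry.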

15.
Developing methods for designing good classifiers from labeled samples whose distribution is different from that of test samples is an important and challenging research issue in the fields of machine learning and its application. This paper focuses on designing semi-supervised classifiers with a high generalization ability by using unlabeled samples drawn from the same distribution as the test samples and presents a semi-supervised learning method based on a hybrid discriminative and generative model. Although JESS-CM is one of the most successful semi-supervised classifier design frameworks based on a hybrid approach, it has an overfitting problem in the task setting that we consider in this paper. We propose an objective function that utilizes both labeled and unlabeled samples for the discriminative training of hybrid classifiers and expect this objective function to mitigate the overfitting problem. We show the effect of the objective function by theoretical analysis and empirical evaluation. Our experimental results for text classification using four typical benchmark test collections confirmed that, with our task setting, the proposed method outperformed the JESS-CM framework in most cases. We also confirmed experimentally that the proposed method was useful for obtaining better performance when classifying data samples into either known or unknown classes, which were included in given labeled samples or not, respectively.

16.
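Supervision information is introduced into clustering. Building on the previously proposed robust online clustering algorithm (ROC), a semi-supervised robust online clustering algorithm (Semi-ROC) is proposed by introducing supervision information given in the form of sample class labels. Semi-ROC outperforms ROC and AddC in both clustering accuracy and robustness. Experimental results on artificial data sets and UCI benchmark data sets show that Semi-ROC can effectively use a small amount of supervision information to improve clustering performance and obtain better results. In addition, when noise is added, Semi-ROC is more robust than the original online clustering algorithms AddC and ROC.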

17.
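朱常宝, 程勇, 高强. 《计算机科学》 (Computer Science), 2016, 43(Z6): 46-50
In recent years, deep learning has been applied successfully to unstructured data such as images, speech, and video, and has become a research hotspot in machine learning and data mining. As a supervised learning model, a successful deep learning application usually requires a large, high-quality training set. For this reason, a deep belief network composed of multiple restricted Boltzmann machines is studied and combined with the idea of semi-supervised learning, so that a smaller training set can be used to improve the classification accuracy of the deep network model. Three methods, Knn, SVM, and pHash, are used to learn from the unlabeled data set. Experimental results show that the semi-supervised deep belief network improves image classification accuracy by about 3% over traditional multi-layer restricted Boltzmann machines.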

18.
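A Multiple Visual Phrase Learning Method for Image Classification   Total citations: 2, self-citations: 0, citations by others: 2
The bag-of-words image representation has limited semantic discriminability and descriptive power, and the performance of traditional classification methods based on it is easily affected by factors such as background and occlusion in the image. To address these problems, this paper proposes a multiple visual phrase learning method for image classification. Visual phrases with semantic discriminability and spatial correlation are constructed to replace visual words, which improves the accuracy of the bag-of-words representation of an image. On this basis, and drawing on the idea of multiple-instance learning, a multiple visual phrase learning method is proposed so that the final classification model reflects the regional characteristics of the image categories. Experimental results on standard test collections such as Caltech-101[1] and Scene-15[2] verify the effectiveness of the proposed method, with relative improvements in classification performance of about 9% and 7%, respectively.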

19.
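Remote Sensing Image Classification Based on the DBN Model   Total citations: 4, self-citations: 0, citations by others: 4
Remote sensing image classification is a key technology of geographic information systems (GIS) and plays a very important role in urban planning and management. In recent years, deep learning has become an emerging research direction in machine learning. Deep learning imitates the multi-layer structure of the human brain and extracts features from data progressively from low level to high level, thereby discovering patterns in the data over time and space and improving classification accuracy. The deep belief network (DBN) is a widely studied and applied deep learning model that combines the advantages of unsupervised and supervised learning and classifies high-dimensional data well. A remote sensing image classification method based on the DBN model is proposed and validated using 6d polarimetric synthetic aperture radar (SAR) images from the RADARSAT-2 satellite. Experiments show that, compared with the support vector machine (SVM) and traditional neural network (NN) methods, the DBN-based method achieves better classification results.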

20.
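Because of the complexity of human language, most text sentiment classification algorithms suffer from an excessively large vocabulary caused by redundancy. The deep belief network (DBN) addresses this problem by learning the useful information in the input corpus through its several hidden layers. For large applications, however, DBN is a time-consuming and computationally expensive algorithm. To address this, a semi-supervised sentiment classification algorithm is proposed: a text sentiment classification algorithm based on feature selection and deep belief networks (FSDBN). First, feature selection methods (document frequency (DF), information gain (IG), chi-square statistic (CHI), and mutual information (MI)) are used to filter out irrelevant features and reduce the complexity of the vocabulary. The result of the feature selection is then fed into the DBN, which makes the learning phase of the DBN more efficient. The proposed algorithm is applied to Chinese and Uyghur. Experimental results on a hotel review data set show that FSDBN improves accuracy by 1.6% over DBN and halves the training time.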
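
One of the feature selection criteria named above, the chi-square statistic (CHI), can be sketched as follows. This is a minimal illustration that uses the standard 2x2 contingency form of the statistic, not the paper's implementation; the function names `chi_square` and `top_terms` and the toy hotel-review data are hypothetical.

```python
def chi_square(term, docs, labels, target):
    """Chi-square score of a term for one class from a 2x2 contingency table.

    a: documents of the target class that contain the term
    b: documents of other classes that contain the term
    c: documents of the target class that lack the term
    d: documents of other classes that lack the term
    """
    a = b = c = d = 0
    for tokens, label in zip(docs, labels):
        has_term = term in tokens
        in_class = label == target
        if has_term and in_class:
            a += 1
        elif has_term:
            b += 1
        elif in_class:
            c += 1
        else:
            d += 1
    n = a + b + c + d
    denominator = (a + c) * (b + d) * (a + b) * (c + d)
    if denominator == 0:
        return 0.0
    return n * (a * d - c * b) ** 2 / denominator


def top_terms(docs, labels, target, k):
    """Keep the k terms with the highest chi-square score for the target class."""
    vocabulary = sorted({t for tokens in docs for t in tokens})
    ranked = sorted(
        vocabulary,
        key=lambda t: chi_square(t, docs, labels, target),
        reverse=True,
    )
    return ranked[:k]


# Toy corpus: each document is a set of tokens with a sentiment label.
docs = [{"clean", "room"}, {"dirty", "room"}, {"clean", "staff"}, {"dirty", "staff"}]
labels = ["pos", "neg", "pos", "neg"]
selected = top_terms(docs, labels, target="pos", k=2)
```

In this toy corpus "clean" and "dirty" separate the classes perfectly and are kept, while "room" and "staff" occur equally in both classes, score zero, and would be filtered out before the reduced vocabulary is passed to the classifier.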
