Similar Documents
 Found 20 similar documents (search time: 921 ms)
1.
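Industrial control systems are closely coupled to their physical environment, so an attack can directly cause economic losses and casualties; intrusion detection provides effective protection for such systems. Treating intrusion detection in industrial control systems as an anomaly detection problem, this paper studies intrusion detection based on PU learning (positive-unlabeled learning). First, to address the high dimensionality of industrial control system data, a feature importance measure is proposed that scores each feature by the difference between its distributions in the positive set and the unlabeled set, and is used for feature selection in PU learning. Second, a class prior estimation algorithm based on OCSVM (One-Class SVM) is proposed; it estimates the class prior probability stably and accurately, supplying the prior knowledge that PU learning requires. Finally, experiments on three public datasets, with labels available for only one class, use PU learning to find anomalous samples in the data under test; comparison with several existing models confirms the effectiveness of PU learning.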
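
The abstract above does not say how the distribution difference between the positive and unlabeled sets is measured. As a minimal sketch only, the Python below assumes one plausible choice: the total variation distance between equal-width histograms of each feature. The function names, the `bins` parameter, and the distance itself are assumptions made for illustration, not the paper's method.

```python
from collections import Counter

def histogram(values, lo, hi, bins):
    """Normalized equal-width histogram of values over [lo, hi]."""
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 when the feature is constant
    counts = Counter(min(int((v - lo) / width), bins - 1) for v in values)
    return [counts.get(b, 0) / len(values) for b in range(bins)]

def feature_importance(positive, unlabeled, bins=10):
    """Score each feature by the total variation distance between the
    positive-set and unlabeled-set histograms (assumed measure)."""
    scores = []
    for j in range(len(positive[0])):
        p_col = [row[j] for row in positive]
        u_col = [row[j] for row in unlabeled]
        lo, hi = min(p_col + u_col), max(p_col + u_col)
        hp = histogram(p_col, lo, hi, bins)
        hu = histogram(u_col, lo, hi, bins)
        scores.append(0.5 * sum(abs(a - b) for a, b in zip(hp, hu)))
    return scores
```

Under this assumption, a feature whose values look the same in both sets scores near 0, while a feature that separates part of the unlabeled set from the positives scores higher and would be kept by feature selection.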

2.
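Li Tingting  Lü Jia  Fan Weiya 《计算机应用》(Journal of Computer Applications) 2019,39(10):2822-2828
The spy technique in positive-unlabeled (PU) learning is highly sensitive to noise and outliers, which makes the reliable positive examples it identifies impure, and its mechanism of randomly selecting spy examples from the initial positives makes the identification of reliable negatives inefficient. To address these problems, a PU learning framework combining a new spy technique with semi-supervised self-training is proposed. First, the framework clusters the initial labeled examples and selects those close to the cluster centers to replace the spy examples; these examples effectively reflect the distribution structure of the unlabeled examples and thus better support the selection of reliable negatives. Then, the reliable positives produced by the spy technique are purified by self-training, with a second round of training used to recover reliable negatives that were misclassified as positives. The framework effectively addresses the sensitivity of the traditional spy technique's classification performance to the data distribution and to randomly chosen spy examples. Simulation experiments on 9 standard datasets show that the proposed framework achieves higher average classification accuracy and F-measure than the basic PU learning algorithm (Basic_PU), the spy-based PU learning algorithm (SPY), the naive Bayes self-training PU learning algorithm (NBST), and the iterative-pruning PU learning algorithm (Pruning).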

3.
Traditional supervised classifiers use only labeled data (feature/label pairs) as the training set, while the unlabeled data is used as the testing set. In practice, labeled data is often hard to obtain, and the unlabeled data contains instances that belong to the predefined class but not to the labeled data categories. This problem has been widely studied in recent years, and semi-supervised PU learning is an efficient way to learn from positive and unlabeled examples. Among the semi-supervised PU learning methods, it is hard to choose a single approach that fits every unlabeled data distribution. In this paper, a new framework is designed to integrate different semi-supervised PU learning algorithms in order to take advantage of existing methods. In essence, we propose an automatic KL-divergence learning method that uses knowledge of the unlabeled data distribution. The experimental results show that (1) data distribution information is very helpful for semi-supervised PU learning; (2) the proposed framework achieves higher precision than the state-of-the-art method.

4.
Liu  Bo  Liu  Qian  Xiao  Yanshan 《Applied Intelligence》2022,52(3):2465-2479

Positive and unlabeled learning (PU learning) has been studied to address the situation in which only positive and unlabeled examples are available. Most previous work has been devoted to identifying negative examples from the unlabeled data, so that supervised learning approaches can be applied to build a classifier. However, these methods either exclude the remaining unlabeled data from the learning phase or force it into one of the classes, which limits the performance of PU learning. In addition, previous PU methods assume that the training data and the testing data have the same feature representations. However, it is often possible to collect features that the training data have but the test data do not; such features are called privileged information. In this paper, we propose a new similarity-based method for the problem of positive and unlabeled learning with privileged information (SPUPIL), which consists of two steps. SPUPIL first applies the KNN method to generate similarity weights; the similarity weights and privileged information are then incorporated into a learning model based on Ranking SVM to build a more accurate classifier. We also use the Lagrangian method to transform the original model into its dual problem and solve it to obtain the classifier. Extensive experiments on real data sets show that SPUPIL performs better than state-of-the-art PU learning methods.


5.
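Time series classification for the PU problem currently tends to use semi-supervised learning to label data in the unlabeled set U automatically and then build a classifier, but with this approach the automatic labels of samples near the class boundary are hard to get right, which degrades the resulting classifier. To address this, a method called OAL (Only Active Learning) is proposed, which builds the classifier by using active learning to have data in the unlabeled set U labeled manually. Based on query by committee (QBC), multiple classifiers are built on the labeled data and vote, giving the class disagreement for each unlabeled sample; this is combined with the sample's distribution density to compute its informativeness, which serves as the data selection strategy for active learning. Because the amount of manually labeled data is limited, active learning is further combined with semi-supervised learning on top of OAL: during the active learning iterations, some samples with high class agreement are labeled automatically, increasing the amount of labeled data and ensuring enough training data to build the classifier. Experiments show that, with partial manual labeling, the method builds more accurate classifiers for PU datasets than semi-supervised learning.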
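
The committee disagreement described above can be illustrated with vote entropy, a common QBC disagreement measure. This is a hedged sketch: the abstract does not state which disagreement formula OAL uses, the function names are hypothetical, and the density term that the paper combines with disagreement is omitted here.

```python
import math
from collections import Counter

def vote_entropy(votes):
    """Entropy of the committee's label votes for one sample.
    0 means full agreement; larger values mean more disagreement."""
    total = len(votes)
    return -sum((c / total) * math.log(c / total) for c in Counter(votes).values())

def most_informative(committee_votes):
    """Index of the sample whose committee votes disagree the most.
    committee_votes[i] is the list of labels the committee gave sample i."""
    return max(range(len(committee_votes)),
               key=lambda i: vote_entropy(committee_votes[i]))
```

In an active learning loop, the sample returned by `most_informative` would be sent for manual labeling, while samples with entropy near 0 are candidates for automatic labeling.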

6.
Podcasting has been used widely to support individuals' learning activities. However, most of the research focuses on its use in formal educational contexts. Little attention has been paid to the use of podcasting in organizational settings to support employees' learning activities. To address this gap, this research investigates employees' perceived usefulness (PU) of podcasting to facilitate their learning activities within organizational settings. Using a global company as the case study, the data collected through semi-structured interviews were analyzed using qualitative techniques. The study finds that the characteristics of the information delivered by the podcasts (i.e., information overload, information privacy, and information relevance) play an important role in shaping employees' PU to adopt podcasting for learning. Excitement toward the technology and tenure are also found to have an impact. In fact, contrary to prior findings, which showed the importance to ultimate adoption of emotions occurring during the use of technologies, this study finds that emotions (excitement in our case) in anticipation of podcasting implementation play a significant role in an individual's PU towards adoption. Further, we develop a set of propositions to discuss the relationships between these factors and the PU of podcasting in organizational settings. Practical and theoretical implications are discussed.

7.
Wang  Yijin  Peng  Yali  Liu  Shigang  Ge  Bao  Li  Jun 《Pattern Analysis & Applications》2023,26(3):1253-1263
Pattern Analysis and Applications - With the recent surge of interest in machine learning, Positive and Unlabeled learning (PU learning) has also attracted much attention of scholars. A key...

8.
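Positive-unlabeled learning trains a binary classifier using only unlabeled and positive samples, while generative adversarial networks (GANs) obtain an image generator through adversarial training. To transfer the adversarial training of GANs to positive-unlabeled learning and improve its performance, the generator in a GAN can be replaced with a classifier C that selects samples from the unlabeled dataset to fool the discriminator D, and C and D are optimized iteratively. This paper proposes the JS-PAN model, which uses the Jensen-Shannon divergence (JS divergence) as its objective function. Finally, considering the characteristics of the data distribution and current needs, the paper argues that the PAN model is well suited to, and performs strongly in, binary classification of medical diagnostic images. Experiments on the MNIST and CIFAR-10 datasets show that the KL-PAN model achieves higher accuracy (ACC) and F1-score than comparable positive-unlabeled learning models, and that the symmetrized JS-PAN model improves on both metrics, which supports the proposal of JS-PAN. Experiments on 3 image subsets of Med-MNIST show that KL-PAN achieves almost the same ACC as 4 supervised benchmark models, and JS-PAN performs better still. Therefore, given the strong classification results of the PAN model and the distributional characteristics of medical diagnostic data, PAN, as a semi-supervised learning method, can obtain faster and better results and performs better on binary classification of medical images.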
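
The "symmetrized" improvement mentioned above refers to a general property of the two divergences: KL divergence depends on the order of its arguments, while JS divergence does not. The sketch below shows only that property on small discrete distributions; it is not the PAN training objective, and the function names are illustrative.

```python
import math

def kl(p, q):
    """Kullback-Leibler divergence KL(p || q) for discrete distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def js(p, q):
    """Jensen-Shannon divergence: average KL of p and q to their mixture m."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)
```

With natural logarithms, `js` is bounded above by `math.log(2)`, whereas `kl` is unbounded and is undefined when `q` assigns zero probability where `p` does not.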

9.
Bekker  Jessa  Davis  Jesse 《Machine Learning》2020,109(4):719-760

Learning from positive and unlabeled data or PU learning is the setting where a learner only has access to positive examples and unlabeled data. The assumption is that the unlabeled data can contain both positive and negative examples. This setting has attracted increasing interest within the machine learning literature as this type of data naturally arises in applications such as medical diagnosis and knowledge base completion. This article provides a survey of the current state of the art in PU learning. It proposes seven key research questions that commonly arise in this field and provides a broad overview of how the field has tried to address them.


10.
Learning from positive and unlabeled examples (PU learning) is a partially supervised classification that is frequently used in Web and text retrieval systems. The merit of PU learning is that it can get good performance with less manual work. Motivated by transfer learning, this paper presents a novel method that transfers the ‘outdated data’ into the process of PU learning. We first propose a way to measure the strength of the features and select the strong features and the weak features according to that strength. Then, we extract the reliable negative examples and the candidate negative examples using the strong and the weak features (Transfer-1DNF). Finally, we construct a classifier called weighted voting iterative support vector machine (SVM) that is made up of several subclassifiers by applying SVM iteratively, and each subclassifier is assigned a weight in each iteration. We conduct the experiments on two datasets, 20 Newsgroups and Reuters-21578, and compare our method with three baseline algorithms: positive example-based learning, weighted voting classifier and SVM. The results show that our proposed method Transfer-1DNF can extract more reliable negative examples with lower error rates, and our classifier outperforms the baseline algorithms. Copyright © 2016 John Wiley & Sons, Ltd.
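
The weighted voting step can be sketched as a weighted sum over subclassifier outputs. The abstract does not say how the weights are computed, so this sketch simply takes them as given; the function name, the {-1, +1} output convention, and the tie-breaking rule are assumptions.

```python
def weighted_vote(classifiers, weights, x):
    """Combine subclassifier outputs in {-1, +1} by a weighted sum.
    Returns +1 when the weighted score is non-negative, else -1
    (ties resolved toward +1, an arbitrary choice for this sketch)."""
    score = sum(w * clf(x) for clf, w in zip(classifiers, weights))
    return 1 if score >= 0 else -1
```

In the paper's setting each callable would be an SVM from one iteration; here any function mapping an input to -1 or +1 works.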

11.
Online configuration tool kits present attractive opportunities for creating customized offers. The purpose of this study is to analyze the relevance of both the TAM (Technology Acceptance Model) and the experiential learning theory to understand how configuration contributes to provide value for end users. The qualitative survey shows that the TAM needs to be adapted to the case of configurators by measuring both perceived usefulness (PU) of the configured product and PU and ease of use of the configurator itself. The study shows the relevance of experiential learning theory as an antecedent of the TAM. A model summarizing our observations is proposed and discussed.

12.
A learning machine, called a clustering interpreting probabilistic associative memory (CIPAM), is proposed. CIPAM consists of a clusterer and an interpreter. The clusterer is a recurrent hierarchical neural network of unsupervised processing units (UPUs). The interpreter is a number of supervised processing units (SPUs) that branch out from the clusterer. Each processing unit (PU), UPU or SPU, comprises “dendritic encoders” for encoding inputs to the PU, “synapses” for storing resultant codes, a “nonspiking neuron” for generating inhibitory graded signals to modulate neighboring spiking neurons, “spiking neurons” for computing the subjective probability distribution (SPD) or the membership function, in the sense of fuzzy logic, of the label of said inputs to the PU and generating spike trains with the SPD or membership function as the firing rates, and a masking matrix for maximizing generalization. While UPUs employ unsupervised covariance learning mechanisms, SPUs employ supervised ones. They both also have unsupervised accumulation learning mechanisms. The clusterer of CIPAM clusters temporal and spatial data. The interpreter interprets the resultant clusters, effecting detection and recognition of temporal and hierarchical causes.

13.
Yoo  Jaemin  Kim  Junghun  Yoon  Hoyoung  Kim  Geonsoo  Jang  Changwon  Kang  U 《Knowledge and Information Systems》2022,64(8):2141-2169
Knowledge and Information Systems - How can we classify graph-structured data only with positive labels? Graph-based positive-unlabeled (PU) learning is to train a binary classifier given only the...

14.
In machine learning, positive-unlabelled (PU) learning is a special case within semi-supervised learning. In positive-unlabelled learning, the training set contains some positive examples and a set of unlabelled examples from both the positive and negative classes. Positive-unlabelled learning has gained attention in many domains, especially for time-series data, for which labelled data is challenging to obtain. Examples that originate from the negative class are especially difficult to acquire. Self-learning is a semi-supervised method capable of PU learning in time-series data. In the self-learning approach, observations are individually added from the unlabelled data into the positive class until a stopping criterion is reached. The model is retrained after each addition with the existing labels. The main problem in self-learning is knowing when to stop the learning. There are multiple stopping criteria in the literature, but they tend to be inaccurate or challenging to apply. This publication proposes a novel stopping criterion, called Peak evaluation using perceptually important points, to address this problem for time-series data. Peak evaluation using perceptually important points is exceptional in that it has no tunable hyperparameters, which makes it easily applicable to an unsupervised setting. At the same time, it is flexible, as it makes no assumptions about the balance of the dataset between the positive and the negative class.
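
The self-learning loop described above can be sketched in a few lines. This is an illustration under stated assumptions: series are equal-length tuples compared with Euclidean distance, "retraining" is reduced to a nearest-neighbor lookup against the growing positive set, and the stopping rule is a fixed budget `max_additions`, which is a placeholder and not the criterion the paper proposes.

```python
import math

def self_learning(positives, unlabeled, max_additions):
    """Move the unlabeled series nearest to the positive set into it,
    one at a time, and return the final positive set plus the order
    in which series were added."""
    positives, unlabeled = list(positives), list(unlabeled)
    order = []
    for _ in range(min(max_additions, len(unlabeled))):
        # Distance from an unlabeled series to its nearest positive series.
        best = min(unlabeled,
                   key=lambda u: min(math.dist(u, p) for p in positives))
        unlabeled.remove(best)
        positives.append(best)
        order.append(best)
    return positives, order
```

A real stopping criterion would inspect how the minimum distances evolve over `order` and decide where the additions switch from positive to negative examples; choosing that point is exactly the problem the abstract addresses.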

15.
Learning management systems (LMS) are playing a major role in higher academic institutions worldwide. Even though full e-learning is becoming a feasible strategy for a number of institutions in the world, some institutions, especially those in developing countries, are resisting a full e-learning environment. Consequently, these academic institutions initially adopt LMS for blended learning to assess their readiness for full e-learning transformation. There are a number of studies that investigate the determinants of full e-learning, but very few studies investigate the link between learners’ perception of blended learning and full e-learning. The objective of this study was to link learners’ adoption (perceived ease of use, perceived usefulness (PU) and satisfaction) of LMS in blended learning and their personal characteristics (self-efficacy, technology experience and personal innovativeness) to their intention to use full e-learning. Data were collected through a questionnaire from 512 learners in Oman. The study found that personal innovativeness, PU and satisfaction with LMS in blended learning are significant to learners’ intention to engage in full e-learning. Thus, learners’ adoption of LMS in blended learning boosts their intention to use full e-learning. The results provide useful insights for practitioners and researchers on full e-learning planning and strategy.

16.
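To perform machine learning on training sets that contain only positive and unlabeled examples (PU learning, positive-unlabeled learning), an averaged n-dependence decision tree (P-AnDT) classification algorithm for PU learning is proposed. First, when constructing a decision tree, n attributes of the samples are chosen as dependence attributes, and at each splitting attribute the joint influence of the dependence attributes and the class attribute is computed. Then, different input attributes are chosen in turn as dependence attributes to build multiple diverse classifiers, whose results are averaged to form an ensemble classification algorithm. Finally, by estimating the parameter p, the proportion of positive examples in the dataset, the algorithm is made able to classify in the PU learning setting. Experiments on several UCI datasets show that P-AnDT achieves better and more stable classification accuracy than PU learning algorithms based on the Bayesian assumption (such as PNB and PTAN).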

17.
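Chen Wen  Yan Li  Zhou Liang 《计算机工程》(Computer Engineering) 2011,37(4):214-215
In incremental learning from positive and unlabeled examples, the initial positive examples are few and negative examples for the different positive classes are hard to obtain, so the classifier's classification and generalization ability is weak. To solve this problem, a PU active learning algorithm with incremental learning capability is proposed. It uses three support vector machines for collaborative semi-supervised learning while applying a grid-based clustering method for unsupervised learning; when the classification and clustering results disagree, active learning is invoked to label the unlabeled examples. Experimental results show that applying the algorithm to online identification and classification of Deep Web entry points effectively improves the accuracy of both entry identification and classification.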

18.
P.A.  C.  M.  J.C.   《Neurocomputing》2009,72(13-15):2731
This paper proposes a hybrid neural network model using a possible combination of different transfer projection functions (sigmoidal unit, SU, product unit, PU) and kernel functions (radial basis function, RBF) in the hidden layer of a feed-forward neural network. An evolutionary algorithm is adapted to this model and applied for learning the architecture, weights and node typology. Three different combined basis function models are proposed with all the different pairs that can be obtained with SU, PU and RBF nodes: product–sigmoidal unit (PSU) neural networks, product–radial basis function (PRBF) neural networks, and sigmoidal–radial basis function (SRBF) neural networks; and these are compared to the corresponding pure models: product unit neural network (PUNN), multilayer perceptron (MLP) and the RBF neural network. The proposals are tested using ten benchmark classification problems from well known machine learning problems. Combined functions using projection and kernel functions are found to be better than pure basis functions for the task of classification in several datasets.

19.
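Nearest neighbor algorithm for positive and unlabeled learning with uncertainty (in English)
This paper studies the classification of uncertain examples in the positive and unlabeled setting and proposes a new algorithm, NNPU (nearest neighbor algorithm for positive and unlabeled learning). NNPU has two implementations: NNPUa and NNPUu. Experiments on UCI benchmark datasets show that NNPUu, which takes full account of the uncertainty information in the data, classifies better than NNPUa, which considers only the mean of the uncertain information in each example; moreover, when classifying precise data, NNPU outperforms NN-d, OCC, and aPUNB.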

20.
We propose a novel method for smoothing partition of unity (PU) implicit surfaces consisting of sets of non-conforming linear functions with spherical supports. We derive new discrete differential operators and Laplacian smoothing using a spherical covering of PU as a grid-like data structure. These new differential operators are applied to the smoothing of PU implicit surfaces. First, Laplacian smoothing is performed for the vector field defined by the gradient of the PU implicit surface, which is then updated to reflect the smoothing of the gradient field. This process achieves a method for noise robust surface reconstruction from scattered points.
