Similar literature
1.
Possibilistic networks, which are compact representations of possibility distributions, are powerful tools for representing and reasoning with uncertain and incomplete information in the framework of possibility theory. They resemble Bayesian networks but rely on possibility theory to deal with uncertainty, imprecision and incompleteness. Although classification is a very useful task in many real-world applications, possibilistic network-based classification has not been well investigated in general, and possibilistic classification inference with uncertain observations in particular. In this paper, we address on the one hand the theoretical foundations of inference in possibilistic classifiers under uncertain inputs, and propose on the other hand a novel, efficient algorithm for inference in possibilistic network-based classification under uncertain observations. We start by studying and analyzing the counterpart of Jeffrey's rule in the framework of possibility theory. After that, we address the validity of the Markov-blanket criterion for possibilistic networks used for classification with uncertain inputs. Finally, we propose a novel algorithm suitable for possibilistic classifiers with uncertain observations that does not assume any independence relations between observations. This algorithm guarantees the same results as if classification were performed using the possibilistic counterpart of Jeffrey's rule. Classification is achieved in polynomial time if the target variable is binary. The basic idea of our algorithm is to search only for totally plausible class instances through a series of equivalent, polynomial transformations applied to the possibilistic classifier, taking into account the uncertain observations.
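As a rough illustration of the possibilistic counterpart of Jeffrey's rule mentioned in this abstract, the sketch below uses the product-based (quantitative) form commonly given in the possibility-theory literature: for every state w in a partition event E_i with new possibility degree alpha_i, the revised degree is alpha_i * pi(w) / Pi(E_i). The function name `jeffrey_product`, the dictionary representation and the toy numbers are assumptions made for illustration only; this is not the classification algorithm the paper proposes.

```python
def jeffrey_product(pi, partition, alphas):
    """Revise a possibility distribution with uncertain inputs (product-based).

    pi        -- dict mapping each state to its possibility degree in [0, 1]
    partition -- list of disjoint sets of states covering all keys of pi
    alphas    -- new possibility degree imposed on each partition event
    """
    if max(alphas) != 1.0:
        raise ValueError("uncertain input must be normalized (max alpha = 1)")
    revised = {}
    for event, alpha in zip(partition, alphas):
        # The possibility of an event is the maximum over its states.
        event_poss = max(pi[w] for w in event)
        if event_poss == 0:
            raise ValueError("cannot condition on an impossible event")
        for w in event:
            # Product-based conditioning on the event, rescaled to alpha.
            revised[w] = alpha * pi[w] / event_poss
    return revised


# Hypothetical prior over four states and an uncertain observation on
# the partition {a, b} / {c, d}.
prior = {"a": 1.0, "b": 0.5, "c": 0.4, "d": 0.2}
posterior = jeffrey_product(prior, [{"a", "b"}, {"c", "d"}], [0.5, 1.0])
```

After revision, the possibility of each partition event equals the degree imposed on it, while the relative order of states inside each event is preserved.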

2.
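Using copula theory, a copula-based Bayesian classification algorithm is proposed. It overcomes the attribute independence assumption required by the ordinary naive Bayes classifier and thereby extends that classifier. Experimental results show that the copula-based Bayesian algorithm achieves good classification performance.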

3.
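The naive Bayes classifier is simple and efficient, but its attribute independence assumption limits its application to real data. A new algorithm is proposed. Noise in the training set and the size of the data can make attribute reduction during preprocessing less effective, which in turn harms classification. To avoid this, the algorithm generates several attribute subsets on the training set by random attribute selection, builds a Bayesian classifier on each subset, and then uses a genetic algorithm to select the best ones. Experiments show that the method achieves better classification accuracy than the traditional naive Bayes method.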

4.
Possibilistic networks are graphical models particularly suitable for representing and reasoning with uncertain and incomplete information. According to the underlying interpretation of possibilistic scales, possibilistic networks are either quantitative (using product-based conditioning) or qualitative (using min-based conditioning). Among the many tasks possibilistic models can be used for, classification is a very important one. In this paper, we address the problem of handling uncertain inputs in binary possibilistic classification. More precisely, we propose an efficient algorithm for revising possibility distributions encoded by a naive possibilistic network. This algorithm is suitable for binary classification with uncertain inputs since it allows classification in polynomial time using several efficient transformations of the initial naive possibilistic networks. © 2009 Wiley Periodicals, Inc.

5.
Bayesian networks are important knowledge representation tools for handling uncertain pieces of information. The success of these models is strongly related to their capacity to represent and handle dependence relations. Some forms of Bayesian networks have been successfully applied in many classification tasks. In particular, naive Bayes classifiers have been used for intrusion detection and alert correlation. This paper analyses the advantage of adding expert knowledge to probabilistic classifiers in the context of intrusion detection and alert correlation. As examples of probabilistic classifiers, we consider the well-known Naive Bayes, Tree Augmented Naive Bayes (TAN), Hidden Naive Bayes (HNB) and decision tree classifiers. Our approach can be applied to any classifier whose outcome is a probability distribution over a set of classes (or decisions). In particular, we study how additional expert knowledge such as "it is expected that 80% of traffic will be normal" can be integrated into classification tasks. Our aim is to revise the outputs of probabilistic classifiers so that they fit expert knowledge. Experimental results show that our approach improves existing results on different benchmarks from the intrusion detection and alert correlation areas.

6.
The Bayesian classifier is a fundamental classification technique. In this work, we focus on programming Bayesian classifiers in SQL. We introduce two classifiers: Naive Bayes and a classifier based on class decomposition using K-means clustering. We consider two complementary tasks: model computation and scoring a data set. We study several layouts for tables and several indexing alternatives. We analyze how to transform equations into efficient SQL queries and introduce several query optimizations. We conduct experiments with real and synthetic data sets to evaluate classification accuracy, query optimizations, and scalability. Our Bayesian classifier is more accurate than Naive Bayes and decision trees. Distance computation is significantly accelerated with horizontal layout for tables, denormalization, and pivoting. We also compare Naive Bayes implementations in SQL and C++: SQL is about four times slower. Our Bayesian classifier in SQL achieves high classification accuracy, can efficiently analyze large data sets, and has linear scalability.
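To make the two tasks named in this abstract (model computation and scoring) more concrete, here is a minimal sketch of Naive Bayes expressed as SQL queries, run through Python's standard `sqlite3` module. The table names (`train`, `query`, `class_n`, `cond_n`), the one-row-per-attribute-value layout, the add-one smoothing and the toy data are all assumptions chosen for brevity; they are not the schemas or optimizations studied in the paper.

```python
import math
import sqlite3

conn = sqlite3.connect(":memory:")
# Built-in SQL math functions are optional in SQLite, so register ln() here.
conn.create_function("ln", 1, math.log)

conn.executescript("""
CREATE TABLE train (id INTEGER, attr TEXT, val TEXT, cls TEXT);
CREATE TABLE query (attr TEXT, val TEXT);
""")
rows = [
    (1, "outlook", "sunny", "no"), (1, "wind", "strong", "no"),
    (2, "outlook", "sunny", "no"), (2, "wind", "weak", "no"),
    (3, "outlook", "rainy", "yes"), (3, "wind", "weak", "yes"),
    (4, "outlook", "rainy", "yes"), (4, "wind", "strong", "yes"),
]
conn.executemany("INSERT INTO train VALUES (?, ?, ?, ?)", rows)
conn.executemany("INSERT INTO query VALUES (?, ?)",
                 [("outlook", "rainy"), ("wind", "weak")])

# Model computation: class sizes and per-class value counts via GROUP BY.
conn.executescript("""
CREATE TABLE class_n AS
  SELECT cls, COUNT(DISTINCT id) AS n FROM train GROUP BY cls;
CREATE TABLE cond_n AS
  SELECT cls, attr, val, COUNT(*) AS n FROM train GROUP BY cls, attr, val;
""")

# Scoring: sum log-probabilities per class. The "+ 1.0" and "+ 2.0" terms are
# add-one smoothing for attributes with two possible values (toy-data assumption).
scores = conn.execute("""
SELECT c.cls,
       ln(1.0 * c.n / (SELECT SUM(n) FROM class_n))
       + SUM(ln((COALESCE(k.n, 0) + 1.0) / (c.n + 2.0))) AS score
FROM class_n c
CROSS JOIN query q
LEFT JOIN cond_n k
  ON k.cls = c.cls AND k.attr = q.attr AND k.val = q.val
GROUP BY c.cls, c.n
ORDER BY score DESC
""").fetchall()
predicted = scores[0][0]
```

The `LEFT JOIN` keeps attribute values that never occurred with a class, so the smoothing term can handle them instead of the row silently disappearing from the sum.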

7.
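Attribute-weighted naive Bayes ensemble classifier   Total citations: 2, self-citations: 1, citations by others: 1
To improve the classification accuracy and generalization ability of the naive Bayes classifier, a weighted Bayesian ensemble method based on attribute correlation (WEBNC) is proposed. Each condition attribute is assigned a weight according to its degree of correlation with the decision attribute, and the attribute-weighted BNC is then trained with AdaBoost. The method was tested on 16 standard UCI data sets and compared with BNC, Bayesian networks, and BNC trained with AdaBoost. Experimental results show that the classifier has higher classification accuracy and better generalization ability.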

8.
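Constructing Bayesian classifiers in Matlab   Total citations: 2, self-citations: 1, citations by others: 2
Text classification is the foundation and core of text mining, and building the classifier is the key to text classification. Bayesian networks can be used to construct classifiers with good classification performance. In this paper, two classifiers are constructed in Matlab: the naive Bayes classifier (NBC), and a TANC built with mutual information and conditional mutual information measures. The classifiers are validated on standard data sets downloaded from UCI. Experimental results show that their performance is, overall, somewhat higher than that reported in the literature, which indicates that the classifiers are effective and correct. The authors optimize the constructed classifiers and apply them to text classification.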

9.
To learn a Bayesian network classifier, continuous attributes usually need to be discretized. However, discretizing continuous attributes may cause information loss, introduce noise, and reduce sensitivity to how changes in the attributes affect the class variable. In this paper, we use the Gaussian kernel function with a smoothing parameter to estimate the density of attributes. A Bayesian network classifier with continuous attributes is established by the dependency extension of Naive Bayes classifiers. We also analyze the information that each attribute provides about the class as a basis for the dependency extension of Naive Bayes classifiers. Experimental studies on UCI data sets show that Bayesian network classifiers using the Gaussian kernel function provide good classification accuracy compared with other approaches when dealing with continuous attributes.
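The idea of estimating attribute densities with a Gaussian kernel and a smoothing parameter can be sketched as follows. This is a plain naive Bayes variant only: it omits the dependency extension the abstract describes, and the function names, the bandwidth `h=0.5` and the toy training data are assumptions made for illustration.

```python
import math


def kernel_density(x, samples, h):
    """Gaussian kernel estimate of a density at x with smoothing parameter h."""
    norm = len(samples) * h * math.sqrt(2.0 * math.pi)
    return sum(math.exp(-0.5 * ((x - s) / h) ** 2) for s in samples) / norm


def classify(x, training, h=0.5):
    """Pick the class maximizing prior times per-attribute kernel densities.

    training -- dict: class label -> list of attribute vectors (lists of floats)
    """
    total = sum(len(rows) for rows in training.values())
    best_label, best_score = None, -math.inf
    for label, rows in training.items():
        score = math.log(len(rows) / total)
        for j, value in enumerate(x):
            column = [row[j] for row in rows]
            # A tiny floor avoids log(0) when x is far from every sample.
            score += math.log(max(kernel_density(value, column, h), 1e-300))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical two-class training set with two continuous attributes.
training = {
    "low": [[1.0, 2.0], [1.2, 1.8], [0.8, 2.1]],
    "high": [[4.0, 5.0], [4.2, 5.1], [3.9, 4.8]],
}
label = classify([1.1, 2.0], training)
```

A smaller `h` follows the training samples more closely, while a larger `h` gives a smoother density; the right value depends on the data.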

10.
Generative mixture models (MMs) provide one of the most popular methodologies for unsupervised data clustering. MMs are formulated on the assumption that each observation derives from (belongs to) a single cluster. However, in many applications, data may intuitively belong to multiple classes, which makes the single-cluster assignment assumption of MMs inappropriate. Furthermore, even in applications where a single-cluster data assignment is required, the multinomial allocation of data points to clusters induced by an MM imposes the constraint that the membership probabilities of a data point across clusters sum to one. This makes MMs very vulnerable to outliers in the clustered data sets and renders them ineffective in discriminating between cases of equal evidence and cases of ignorance. To resolve these issues, in this paper we introduce a possibilistic formulation of MMs. Possibilistic clustering is a methodology that yields possibilistic data partitions, with the obtained membership values interpreted as degrees of possibility (compatibilities) of the data points with respect to the various clusters. We provide an efficient maximum-likelihood fitting algorithm for the proposed model, and we conduct an objective evaluation of its efficacy using benchmark data.

11.
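Bayesian networks combine probability theory with graphical models and are widely used for solving problems under uncertainty, but they cannot handle imprecise information. Possibilistic networks combine possibility theory, probability theory, and graphical models, and can make up for this shortcoming of Bayesian networks. This paper first introduces some concepts of possibilistic networks and compares them with Bayesian networks, and then presents a structure learning method for possibilistic networks based on dependency analysis.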

12.
Artificial Intelligence, 2001, 125(1-2): 209-226
Naive Bayes classifiers provide an efficient and scalable approach to supervised classification problems. When some entries in the training set are missing, methods exist to learn these classifiers under some assumptions about the pattern of missing data. Unfortunately, reliable information about the pattern of missing data may not be readily available, and recent experimental results show that enforcing an incorrect assumption about the pattern of missing data produces a dramatic decrease in the accuracy of the classifier. This paper introduces a Robust Bayes Classifier (RBC) able to handle incomplete databases with no assumption about the pattern of missing data. In order to avoid assumptions, the RBC bounds all the possible probability estimates within intervals using a specialized estimation method. These intervals are then used to classify new cases by computing intervals on the posterior probability distributions over the classes given a new case and by ranking the intervals according to some criteria. We provide two scoring methods to rank intervals and a decision-theoretic approach to trade off the risk of an erroneous classification against the choice of not classifying a case unequivocally. This decision-theoretic approach can also be used to assess the opportunity of adopting assumptions about the pattern of missing data. The proposed approach is evaluated on twenty publicly available databases.

13.
E-mail foldering, or e-mail classification into user-predefined folders, can be viewed as a text classification/categorization problem. However, it has some intrinsic properties that make it more difficult to deal with, mainly the large cardinality of the class variable (i.e. the number of folders), the different number of e-mails per class state and the fact that this is a dynamic problem, in the sense that e-mails arrive in our mail folders following a timeline. Perhaps because of these problems, standard text-oriented classifiers such as Naive Bayes Multinomial do not obtain good accuracy when applied to e-mail corpora. In this paper, we identify the imbalance among classes/folders as the main problem, and propose a new method based on learning and sampling probability distributions. Our experiments over a standard corpus (ENRON) with seven datasets (e-mail users) show that the results obtained by Naive Bayes Multinomial significantly improve when the balancing algorithm is applied first. For the sake of completeness in our experimental study, we also compare this with another standard balancing method (SMOTE) and with other classifiers.

14.
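Naive Bayes classification is built on Bayesian theory and is very widely used; it approximates Bayesian theory with a class-conditional independence assumption. Bayesian networks build on this by using a graphical model to compensate for the independence assumption, but they also reveal that the classification process can give rise to NP problems. This paper adopts a compromise: it constructs a Bayesian classifier by combining association rules with the ABN classification technique. The approach compensates for the independence assumption while avoiding the need to solve an NP problem. Finally, experimental results show that it far outperforms the Naive Bayes classifier in several domains.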

15.
Within the framework of Bayesian networks (BNs), most classifiers assume that the variables involved are of a discrete nature, but this assumption rarely holds in real problems. Despite the loss of information that discretization entails, it is a direct, easy-to-use mechanism that can offer some benefits: sometimes discretization improves the run time of certain algorithms; it reduces the value set and hence the noise that might be present in the data; and some Bayesian methods can only deal with discrete variables. Hence, even though there are many ways to deal with continuous variables other than discretization, it is still commonly used. This paper presents a study of the impact of using different discretization strategies on a set of representative BN classifiers, with a significant sample consisting of 26 datasets. For this comparison, we have chosen Naive Bayes (NB) together with several other semi-Naive Bayes classifiers: Tree-Augmented Naive Bayes (TAN), k-Dependence Bayesian (KDB), Aggregating One-Dependence Estimators (AODE) and Hybrid AODE (HAODE). We have also included an augmented Bayesian network created by using a hill climbing algorithm (BNHC). With this comparison we analyse to what extent the type of discretization method affects classifier performance in terms of accuracy and bias-variance decomposition. Our main conclusion is that even if a discretization method produces different results for a particular dataset, it does not really have an effect when classifiers are being compared. That is, given a set of datasets, accuracy values might vary but the classifier ranking is generally maintained. This is a very useful outcome: assuming that the type of discretization applied is not decisive, future experiments can be d times faster, d being the number of discretization methods considered.
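The abstract does not name the discretization strategies it compares. As a sketch of what such strategies look like, the two functions below implement equal-width and equal-frequency binning, two widely used unsupervised methods; they are chosen here as an assumption, and the function names and sample data are hypothetical.

```python
def equal_width_bins(values, k):
    """Assign each value to one of k bins of equal width."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    if width == 0:
        return [0 for _ in values]
    # min(...) places the maximum value in the last bin instead of bin k.
    return [min(int((v - lo) / width), k - 1) for v in values]


def equal_frequency_bins(values, k):
    """Assign each value to one of k bins holding roughly equal counts."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, i in enumerate(order):
        # Ranks are split evenly, so tied values may land in different bins.
        bins[i] = rank * k // len(values)
    return bins


data = [1.0, 2.0, 3.0, 4.0, 50.0, 100.0]
by_width = equal_width_bins(data, 3)
by_frequency = equal_frequency_bins(data, 3)
```

On this skewed sample, equal-width binning puts the four small values into a single bin, while equal-frequency binning spreads the values two per bin. Differences of this kind are what a comparison of discretization methods has to account for.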

16.
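Possibilistic logic, proposed by D. Dubois and H. Prade, is a non-classical logic based on possibility theory and is mainly used for reasoning with uncertain evidence. Possibilistic logic differs from fuzzy logic: fuzzy logic handles non-Boolean formulas whose propositions contain fuzzy predicates, whereas possibilistic logic handles Boolean formulas that contain only classical propositions and predicates. This paper attempts to maintain inconsistent knowledge bases and solve problems within the framework of possibility theory, with knowledge represented in possibilistic logic. To this end, we propose two different methods. The first method, when computing the credibility of a proposition, must take into account …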

17.
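A naive Bayes classifier based on feature weighting   Total citations: 13, self-citations: 0, citations by others: 13
程克非 (Cheng Kefei), 张聪 (Zhang Cong). 《计算机仿真》 (Computer Simulation), 2006, 23(10): 92-94, 150
The naive Bayes classifier is a widely used classification algorithm whose computational efficiency and classification performance are both very good. However, because its underlying "naive Bayes assumption" differs somewhat from reality, it can give poor classification results on some data. Several existing methods try to improve the Bayesian classifier by relaxing the naive Bayes assumption, but they usually increase the computational cost substantially. This paper uses feature weighting to enhance the naive Bayes classifier. The feature weighting parameters are derived directly from the data and can be regarded as the degree to which an attribute influences the computation of the posterior probability of a class. Numerical experiments show that the feature-weighted naive Bayes classifier (FWNB) performs on a par with other commonly used classification algorithms such as tree-augmented naive Bayes (TAN) and the naive Bayes tree (NBTree), with average error rates all around 17%. In computation speed, FWNB is close to NB and at least an order of magnitude faster than TAN and NBTree.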
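A minimal sketch of feature weighting in a naive Bayes classifier follows. It assumes one common form, in which each attribute's log-likelihood is multiplied by a weight, so the score of class c is log P(c) + sum_j w_j * log P(x_j | c). The abstract says the weights are derived from the data but does not say how, so here they are simply passed in by hand. The function names, the Laplace smoothing and the toy data are likewise assumptions, not the FWNB method itself.

```python
import math
from collections import Counter, defaultdict


def train(rows, labels):
    """Count class frequencies and per-attribute value frequencies per class."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (attribute index, class) -> value counts
    domains = defaultdict(set)           # attribute index -> observed values
    for row, label in zip(rows, labels):
        for j, value in enumerate(row):
            value_counts[(j, label)][value] += 1
            domains[j].add(value)
    return class_counts, value_counts, domains


def predict(row, model, weights):
    """Return the class maximizing log P(c) + sum_j w_j * log P(x_j | c)."""
    class_counts, value_counts, domains = model
    total = sum(class_counts.values())
    scores = {}
    for label, n_c in class_counts.items():
        score = math.log(n_c / total)
        for j, value in enumerate(row):
            # Laplace smoothing (an assumption) keeps probabilities non-zero.
            p = (value_counts[(j, label)][value] + 1) / (n_c + len(domains[j]))
            score += weights[j] * math.log(p)
        scores[label] = score
    return max(scores, key=scores.get)


# Hypothetical data: the first attribute determines the class, the second
# carries no information, so it is given a small hand-picked weight.
rows = [("sunny", "hot"), ("sunny", "mild"), ("rainy", "hot"), ("rainy", "mild")]
labels = ["no", "no", "yes", "yes"]
model = train(rows, labels)
prediction = predict(("sunny", "hot"), model, [1.0, 0.2])
```

Setting every weight to 1.0 recovers ordinary naive Bayes, which is why this kind of weighting adds little to the prediction cost.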

18.
Modeling uncertainty reasoning with possibilistic Petri nets   Total citations: 3, self-citations: 0, citations by others: 3
Manipulation of perceptions is a remarkable human capability in a wide variety of physical and mental tasks under fuzzy or uncertain surroundings. Possibilistic reasoning can be treated as a mechanism that mimics human inference with uncertain information. Petri nets are a graphical and mathematical modeling tool with powerful modeling and analytical ability. The focus of this paper is on the integration of Petri nets with possibilistic reasoning to reap the benefits of both formalisms. This integration leads to a possibilistic Petri net (PPN) model with the following features. A possibilistic token carries information to describe an object and its corresponding possibility and necessity measures. Possibilistic transitions are classified into four types: inference transitions, duplication transitions, aggregation transitions, and aggregation-duplication transitions. A reasoning algorithm based on possibilistic Petri nets is also presented to improve the efficiency of possibilistic reasoning, and an example related to the diagnosis of cracks in reinforced concrete structures is used to illustrate the proposed approach.

19.
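A restricted double-level Bayesian classification model   Total citations: 29, self-citations: 1, citations by others: 28
The naive Bayes classification model is a simple and effective classification method, but its attribute independence assumption prevents it from expressing the dependencies that exist among attribute variables, which harms its classification performance. By analyzing the classification principle of the Bayesian classification model and a variant form of Bayes' theorem, a new classification model based on Bayes' theorem, DLBAN (double-level Bayesian network augmented naive Bayes), is proposed. The model establishes dependencies among attributes by selecting key attributes. The method is compared experimentally with the naive Bayes classifier and the TAN (tree augmented naive Bayes) classifier. Experimental results show that DLBAN achieves higher classification accuracy on most data sets.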

20.
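First, the syntax and semantics are given for three logics based on generalized possibility measures: GPoCTL*, an extension of computation tree logic; GPoCTL-, a reduced fragment of computation tree logic; and GPoRCTL, computation tree logic with rewards. Generalized possibilistic bisimulation and its related properties are then discussed on the basis of classical bisimulation and generalized possibility measures. Finally, the equivalence between GPoCTL, GPoCTL* and GPoCTL- formulas and bisimilar states is proved.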
