Similar literature
 Found 20 similar documents; search time: 15 ms
1.
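Traditional multi-dimensional Bayesian network classifiers (MBNC) restrict the model structure to be bipartite. Removing this restriction yields the general MBNC (GMBNC), which models the joint distribution more accurately. Based on an iterative search over local Markov blankets, an algorithm called IPC-GMBNC is proposed that can accurately learn a GMBNC. Because it does not need to learn the global Bayesian network (BN), the algorithm scales well. Experiments on data randomly generated from known Bayesian network models show that IPC-GMBNC effectively recovers the target structure and, compared with the traditional global structure learning algorithm PC, saves a large amount of computation.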

2.
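To improve the classification performance of Bayesian classifiers, and in view of the structural characteristics of Bayesian network classifiers, a discriminative parameter learning algorithm based on parameter ensembles, PEBNC, is proposed. The algorithm treats parameter learning for a Bayesian classifier as a regression problem and applies an additive regression model to the parameter learning of Bayesian network classifiers, thereby achieving discriminative parameter learning. Experimental results show that PEBNC markedly improves the classification accuracy of Bayesian classifiers on most of the experimental data sets. In addition, compared with ordinary Bayesian ensemble classifiers, PEBNC does not need to store the parameters of member classifiers, which greatly reduces space complexity.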

3.
Boosted Bayesian network classifiers   Total citations: 2, self-citations: 0, citations by others: 2
The use of Bayesian networks for classification problems has received a significant amount of recent attention. Although computationally efficient, the standard maximum likelihood learning method tends to be suboptimal due to the mismatch between its optimization criterion (data likelihood) and the actual goal of classification (label prediction accuracy). Recent approaches to optimizing classification performance during parameter or structure learning show promise, but lack the favorable computational properties of maximum likelihood learning. In this paper we present boosted Bayesian network classifiers, a framework to combine discriminative data-weighting with generative training of intermediate models. We show that boosted Bayesian network classifiers encompass the basic generative models in isolation, but improve their classification performance when the model structure is suboptimal. We also demonstrate that structure learning is beneficial in the construction of boosted Bayesian network classifiers. On a large suite of benchmark data-sets, this approach outperforms generative graphical models such as naive Bayes and TAN in classification accuracy. Boosted Bayesian network classifiers perform comparably to or better than other discriminatively trained graphical models, including ELR and BNC. Furthermore, boosted Bayesian networks require significantly less training time than the ELR and BNC algorithms.

4.
We present a maximum margin parameter learning algorithm for Bayesian network classifiers using a conjugate gradient (CG) method for optimization. In contrast to previous approaches, we maintain the normalization constraints on the parameters of the Bayesian network during optimization, i.e., the probabilistic interpretation of the model is not lost. This enables us to handle missing features in discriminatively optimized Bayesian networks. In experiments, we compare the classification performance of maximum margin parameter learning to conditional likelihood and maximum likelihood learning approaches. Discriminative parameter learning significantly outperforms generative maximum likelihood estimation for naive Bayes and tree augmented naive Bayes structures on all considered data sets. Furthermore, maximizing the margin dominates the conditional likelihood approach in terms of classification performance in most cases. We provide results for a recently proposed maximum margin optimization approach based on convex relaxation. While the classification results are highly similar, our CG-based optimization is computationally up to orders of magnitude faster. Margin-optimized Bayesian network classifiers achieve classification performance comparable to support vector machines (SVMs) using fewer parameters. Moreover, we show that unanticipated missing feature values during classification can be easily processed by discriminatively optimized Bayesian network classifiers, a case where discriminative classifiers usually require mechanisms to complete unknown feature values in the data first.

5.
In the information retrieval framework, there are problems where the goal is to recover objects of a particular class from large sets of unlabelled objects. In some of these problems, only examples from the class we want to recover are available. For such problems, the machine learning community has developed algorithms that can learn binary classifiers in the absence of negative examples. Among them are the positive Bayesian network classifiers, algorithms that induce Bayesian network classifiers from positive and unlabelled examples. The main drawback of these algorithms is that they require some previous knowledge about the a priori probability distribution of the class. In this paper, we propose a wrapper approach to tackle learning when no such information is available, setting this probability at the optimal value in terms of the recovery of positive examples. The evaluation of classifiers in positive unlabelled learning problems is a non-trivial question. We also address this problem and propose a new guiding metric, called the pseudo F, to be used in the search for the optimal a priori probability of the positive class. We have empirically tested the proposed metric and the wrapper classifiers on both synthetic and real-life datasets. The results of this empirical comparison show that the wrapper Bayesian network classifiers provide competitive results, particularly when the actual a priori probability of the positive class is high.

6.
In this paper, we propose a more efficient Bayesian network structure learning algorithm under the framework of score-based local learning (SLL). Our algorithm significantly improves computational efficiency by restricting the neighbors of each variable to a small subset of candidates and storing the information needed to uncover the spouses, while still guaranteeing to find the optimal neighbor set in the same sense as SLL. The algorithm is theoretically sound in the sense that it is optimal in the limit of large sample size. Empirical results confirm its improved speed without loss of quality in the learned structures.

7.
A large number of distance metrics have been proposed to measure the difference between two instances. Among these metrics, the Short and Fukunaga metric (SFM) and the minimum risk metric (MRM) are two probability-based metrics that are widely used to find a reasonable distance between each pair of instances with nominal attributes only. For simplicity, existing works use naive Bayesian (NB) classifiers to estimate class membership probabilities in SFM and MRM. However, it has been shown that NB classifiers are poor at class probability estimation. To improve the classification performance of NB classifiers, many augmented NB classifiers have been proposed. In this paper, we study the class probability estimation performance of these augmented NB classifiers and then use them to estimate the class membership probabilities in SFM and MRM. The experimental results on a large number of University of California, Irvine (UCI) data-sets show that using these augmented NB classifiers to estimate the class membership probabilities in SFM and MRM can significantly enhance their generalisation ability.

8.
Learning Bayesian networks from scarce data is a major challenge in real-world applications where data are hard to acquire. Transfer learning techniques attempt to address this by leveraging data from different but related problems. For example, it may be possible to exploit medical diagnosis data from a different country. A challenge with this approach is heterogeneous relatedness to the target, both within and across source networks. In this paper we introduce the Bayesian network parameter transfer learning (BNPTL) algorithm to reason about both network and fragment (sub-graph) relatedness. BNPTL addresses (i) how to find the most relevant source network and network fragments to transfer, and (ii) how to fuse source and target parameters in a robust way. In addition to improving target task performance, explicit reasoning allows us to diagnose network and fragment relatedness across Bayesian networks, even if latent variables are present, or if their state space is heterogeneous. This is important in some applications where relatedness itself is an output of interest. Experimental results demonstrate the superiority of BNPTL at various scarcities and source relevance levels compared to single task learning and other state-of-the-art parameter transfer methods. Moreover, we demonstrate successful application to real-world medical case studies.

9.
In this paper, we describe three Bayesian classifiers for mineral potential mapping: (a) a naive Bayesian classifier that assumes complete conditional independence of input predictor patterns, (b) an augmented naive Bayesian classifier that recognizes and accounts for conditional dependencies amongst input predictor patterns and (c) a selective naive classifier that uses only conditionally independent predictor patterns. We also describe methods for training the classifiers, which involve determining dependencies amongst predictor patterns and estimating the conditional probability of each predictor pattern given the target deposit-type. The output of a trained classifier determines the extent to which an input feature vector belongs to either the mineralized class or the barren class and can be mapped to generate a favorability map. The procedures are demonstrated by an application to base metal potential mapping in the Proterozoic Aravalli Province (western India). The results indicate that although the naive Bayesian classifier performs well and shows significant tolerance for the violation of the conditional independence assumption, the augmented naive Bayesian classifier performs better and exhibits finer generalization capability. The results also indicate that the rejection of conditionally dependent predictor patterns degrades the performance of a naive classifier.

10.
This paper deals with a classification problem known as learning from label proportions. The provided dataset is composed of unlabeled instances and is divided into disjoint groups. General class information is given within the groups: the proportion of instances of the group that belong to each class. We have developed a method based on the Structural EM strategy that learns Bayesian network classifiers to deal with this problem. Four versions of our proposal are evaluated on synthetic data, and compared with state-of-the-art approaches on real datasets from public repositories. The results obtained show a competitive behavior for the proposed algorithm.

11.
Bayesian networks, which have a solid mathematical basis as classifiers, take the prior information of samples into consideration. They have gained considerable popularity for solving classification problems. However, many real-world applications can be viewed as classification problems in which instances have to be assigned to a set of different classes at the same time. To address this problem, multi-dimensional Bayesian network classifiers (MBCs), which organize class and feature variables as three subgraphs, have recently been proposed. Because each subgraph has different structural restrictions, three different learning algorithms are needed. In this paper, we present for the first time an MBC learning algorithm based on an optimization model (MBC-OM) that is inspired by the constraint-based Bayesian network structure learning method. MBC-OM uses the chi-squared statistic and mutual information to estimate the dependence coefficients among variables, and these are used to construct an objective function as an overall measure of the dependence for a classifier structure. Therefore, the problem of searching for an optimal classifier becomes one of finding the maximum value of the objective function over the feasible region. We prove the existence and uniqueness of the numerical solution. Moreover, we validate our method on five benchmark data sets. The experimental results are competitive and outperform those of state-of-the-art algorithms for multi-dimensional classification.

12.
To learn a Bayesian network classifier, continuous attributes usually need to be discretized. However, discretizing continuous attributes may cause information loss, introduce noise, and reduce sensitivity to how changes in the attributes affect the class variable. In this paper, we use the Gaussian kernel function with a smoothing parameter to estimate the density of attributes. A Bayesian network classifier with continuous attributes is established by the dependency extension of naive Bayes classifiers. We also analyze the information that each attribute provides about the class as a basis for the dependency extension of naive Bayes classifiers. Experimental studies on UCI data sets show that Bayesian network classifiers using the Gaussian kernel function provide good classification accuracy compared to other approaches when dealing with continuous attributes.
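
As a rough illustration of the idea in this abstract, the sketch below estimates each class-conditional attribute density with a Gaussian kernel and plugs it into a plain naive Bayes decision rule. It is not the paper's algorithm: the dependency extension is omitted, and the function names, the fixed smoothing parameter `h`, and the toy data are all assumptions made for the example.

```python
import math

def gaussian_kde(samples, x, h):
    """Gaussian kernel density estimate at x with smoothing parameter h."""
    norm = 1.0 / (len(samples) * h * math.sqrt(2.0 * math.pi))
    return norm * sum(math.exp(-0.5 * ((x - s) / h) ** 2) for s in samples)

def kde_naive_bayes_predict(train, x, h=0.5):
    """Predict a class label for feature vector x.

    train maps each class label to a list of continuous feature vectors.
    Attributes are treated as conditionally independent given the class
    (plain naive Bayes; no dependency extension).
    """
    total = sum(len(rows) for rows in train.values())
    scores = {}
    for label, rows in train.items():
        log_score = math.log(len(rows) / total)  # log class prior
        for j, xj in enumerate(x):
            column = [row[j] for row in rows]
            # A tiny constant guards against log(0) far from all samples.
            log_score += math.log(gaussian_kde(column, xj, h) + 1e-300)
        scores[label] = log_score
    return max(scores, key=scores.get)

# Hypothetical toy data: two well-separated classes, two attributes each.
train = {
    "a": [[0.0, 0.1], [0.2, -0.1], [-0.1, 0.0]],
    "b": [[3.0, 3.1], [2.9, 3.2], [3.1, 2.8]],
}
prediction = kde_naive_bayes_predict(train, [0.1, 0.0])
```

In practice the smoothing parameter would be chosen per attribute, for example by cross-validation; a single fixed value is used here only to keep the sketch short.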

13.
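Addressing the differences in failure-rate levels of equipment operating under different configurations and usage environments, the characteristics and construction algorithms of existing Bayesian classifiers are described and analyzed in detail. On this basis, a Bayesian-network-based modeling method for product failure classification is proposed to guide the construction and application of models in practical classification tasks. A case study of a French equipment manufacturer shows that, among all the Bayesian network classifiers and the traditional C4.5 decision tree classifier, the tree-structured naive Bayes classifier achieves the best classification results, providing effective theoretical support for subsequent maintenance resource allocation and optimization of product operating capability.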

14.
Bayesian networks are one of the most powerful tools in the design of expert systems that operate under uncertainty. However, their application usually depends on discretizing the continuous variables. In this paper the naïve Bayes (NB) and tree augmented naïve Bayes (TAN) models are developed. They are based on Mixtures of Truncated Exponentials (MTE), designed to deal with discrete and continuous variables in the same network simultaneously without any restriction. The aim is to characterize the habitat of the spur-thighed tortoise (Testudo graeca graeca), using several continuous environmental variables, and one discrete (binary) variable representing the presence or absence of the tortoise. These models are compared with the fully discrete models and the results show a better classification rate for the continuous ones. Therefore, the application of continuous models instead of discrete ones avoids loss of statistical information due to the discretization. Moreover, the results of the TAN continuous model show a more spatially accurate distribution of the tortoise. The species is located in the Doñana Natural Park, and in semiarid habitats. The proposed continuous models based on MTEs are valid for the study of species predictive distribution modelling.

15.
16.
The main purpose of a gene interaction network is to map the relationships among genes that remain out of sight when a genomic study is tackled. DNA microarrays allow the expression of thousands of genes to be measured at the same time. These data constitute the numeric seed for the induction of the gene networks. In this paper, we propose a new approach to build gene networks by means of Bayesian classifiers, variable selection and bootstrap resampling. The interactions induced by the Bayesian classifiers are based both on the expression levels and on the phenotype information of the supervised variable. Feature selection and bootstrap resampling add reliability and robustness to the overall process by removing false positive findings. The consensus among all the induced models produces a hierarchy of dependences and, thus, of variables. Biologists can define the depth level of the model hierarchy so the set of interactions and genes involved can vary from a sparse to a dense set. Experimental results show that these networks perform well on classification tasks. The biological validation matches previous biological findings and opens new hypotheses for future studies.

17.
We propose a novel discriminative learning approach for Bayesian pattern classification, called ‘constrained maximum margin (CMM)’. We define the margin between two classes as the difference between the minimum decision value for positive samples and the maximum decision value for negative samples. The learning problem is to maximize the margin under the constraint that each training pattern is classified correctly. This nonlinear programming problem is solved using the sequential unconstrained minimization technique. We applied the proposed CMM approach to learn Bayesian classifiers based on Gaussian mixture models, and conducted experiments on 10 UCI datasets. The performance of our approach was compared with that of the expectation-maximization algorithm, the support vector machine, and other state-of-the-art approaches. The experimental results demonstrated the effectiveness of our approach.
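
The margin definition in this abstract can be written down directly. The sketch below only computes that quantity for a given set of decision values; it does not perform the constrained optimization the paper describes. The function name and the convention that label `1` marks positive samples are assumptions made for the example.

```python
def cmm_margin(decision_values, labels):
    """Margin between two classes as defined in the abstract:
    the minimum decision value over positive samples minus
    the maximum decision value over negative samples.

    labels uses 1 for positive samples; any other value is negative.
    """
    positives = [d for d, y in zip(decision_values, labels) if y == 1]
    negatives = [d for d, y in zip(decision_values, labels) if y != 1]
    if not positives or not negatives:
        raise ValueError("both classes must have at least one sample")
    return min(positives) - max(negatives)

# Hypothetical decision values for two positive and two negative samples.
margin = cmm_margin([2.0, 1.5, -0.5, -1.0], [1, 1, 0, 0])
```

A positive margin means some threshold separates every positive sample from every negative one, which is how the margin relates to the correct-classification constraint mentioned in the abstract.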

18.
This paper proposes an approach that detects surface defects with three-dimensional characteristics on scale-covered steel blocks. The surface reflection properties of the flawless surface change strongly. Light sectioning is used to acquire the surface range data of the steel block. These sections are arbitrarily located within a range of a few millimeters due to vibrations of the steel block on the conveyor. After the recovery of the depth map, segments of the surface are classified according to a set of extracted features by means of Bayesian network classifiers. For establishing the structure of the Bayesian network, a floating search algorithm is applied, which achieves a good tradeoff between classification performance and computational efficiency for structure learning. This search algorithm enables conditional exclusions of previously added attributes and/or arcs from the network. The experiments show that the selective unrestricted Bayesian network classifier outperforms the naïve Bayes and the tree-augmented naïve Bayes decision rules concerning the classification rate. More than 98% of the surface segments have been classified correctly.

19.
Bayesian model averaging (BMA) can resolve the overfitting problem by explicitly incorporating the model uncertainty into the analysis procedure. Hence, it can be used to improve the generalization performance of Bayesian network classifiers. Until now, BMA of Bayesian network classifiers has only been performed in some restricted forms, e.g., the model is averaged given a single node-order, because of its heavy computational burden. However, it can be hard to obtain a good node-order when the available training dataset is sparse. To alleviate this problem, we propose BMA of Bayesian network classifiers over several distinct node-orders obtained using the Markov chain Monte Carlo sampling technique. The proposed method was examined using two synthetic problems and four real-life datasets. First, we show that the proposed method is especially effective when the given dataset is very sparse. The classification accuracy of averaging over multiple node-orders was higher in most cases than that achieved using a single node-order in our experiments. We also present experimental results for test datasets with unobserved variables, where the quality of the averaged node-order is more important. Through these experiments, we show that the difference in classification performance between the cases of multiple node-orders and single node-order is related to the level of noise, confirming the relative benefit of averaging over multiple node-orders for incomplete data. We conclude that BMA of Bayesian network classifiers over multiple node-orders has an apparent advantage when the given dataset is sparse and noisy, despite the method's heavy computational cost.
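
For readers unfamiliar with BMA, the sketch below shows the generic averaging step: each model's class probabilities are weighted by that model's posterior probability. It is not the paper's MCMC procedure over node-orders. The function names, the uniform prior over models, and the example numbers are assumptions made for the illustration.

```python
import math

def model_posteriors(log_evidences):
    """Turn log marginal likelihoods into posterior model weights,
    assuming a uniform prior over models. Subtracting the maximum
    keeps the exponentials numerically stable."""
    top = max(log_evidences)
    weights = [math.exp(le - top) for le in log_evidences]
    total = sum(weights)
    return [w / total for w in weights]

def bma_predict(class_probs_per_model, log_evidences):
    """Average per-model class probabilities, weighted by model posterior."""
    weights = model_posteriors(log_evidences)
    n_classes = len(class_probs_per_model[0])
    return [
        sum(w * probs[c] for w, probs in zip(weights, class_probs_per_model))
        for c in range(n_classes)
    ]

# Two hypothetical models with equal evidence, so each gets weight 0.5.
averaged = bma_predict([[0.9, 0.1], [0.5, 0.5]], [0.0, 0.0])
```

When models are drawn by a sampler that already targets the posterior, the weights reduce to a plain average over the sampled models.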

20.
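A new Bayesian network structure learning method   Total citations: 3, self-citations: 0, citations by others: 3
A Bayesian network is a representation model that combines Bayesian probability methods with the network topology of a directed acyclic graph; it describes data items and the nonlinear dependencies among them. This paper reports on the current state of Bayesian network research and, to address the drawback that traditional algorithms require the node order of the network to be specified subjectively, proposes a new method that automatically learns the Bayesian network structure, without such constraints, from the probabilistic relationships in the observed training sample set.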
