Similar literature
Found 20 similar documents; search took 218 ms
1.
《Knowledge》1999,12(3):101-111
Causal networks require probability values to be supplied for all possible combinations of outcomes in the cause–effect relationships implied by the network. Only then is it possible to use the existing methods for updating the information in the network to reflect new knowledge gained in a specific situation. Supplying causal information which is complete and accurate is not always possible in many applications, for example Decision Support Systems. This requirement becomes even more difficult to achieve when a single event is influenced by a large number of other events. Maximum Entropy can be used to find minimally prejudiced estimates for missing information but this approach is, in general, computationally infeasible. However, the authors have already shown that for certain special cases of causal networks such estimates can, in fact, be found in linear time. This article extends the work to causal inverted multiway trees in which any event can be influenced by any number of other events but itself only influences at most one event. In order to achieve this extension a thorough analysis of the traditional Bayesian model is undertaken to identify the large number of constraints which a valid Maximum Entropy model must satisfy. A simplified Maximum Entropy model is proposed and formal proofs that this satisfies the Bayesian properties are given. Equating the joint event probability distributions given by the Bayesian and Maximum Entropy models enables the Lagrange multipliers of the latter to be determined. This leads to an iterative tree traversal algorithm which converges to the minimally prejudiced estimates for the missing information. When this information is added to that already provided, any existing method for updating the causal network can be utilised.

2.
Causal independence modelling is a well-known method for reducing the size of probability tables, simplifying the probabilistic inference and explaining the underlying mechanisms in Bayesian networks. Recently, a generalization of the widely-used noisy OR and noisy AND models, causal independence models based on symmetric Boolean functions, was proposed. In this paper, we study the problem of learning the parameters in these models, further referred to as symmetric causal independence models. We present a computationally efficient EM algorithm to learn parameters in symmetric causal independence models, where the computational scheme of the Poisson binomial distribution is used to compute the conditional probabilities in the E-step. We study computational complexity and convergence of the developed algorithm. The presented EM algorithm allows us to assess the practical usefulness of symmetric causal independence models. In the assessment, the models are applied to a classification task; they perform competitively with state-of-the-art classifiers.
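
The role of the Poisson binomial distribution mentioned in this abstract can be illustrated with a small sketch. It is not the paper's EM algorithm; it only shows, under the assumption that each cause independently switches on a hidden binary variable, how the probability of an effect defined by a symmetric Boolean function can be computed from the distribution of the number of active hidden variables. The function names `poisson_binomial_pmf` and `symmetric_ci_probability` are hypothetical.

```python
def poisson_binomial_pmf(probs):
    """Return [P(K=0), ..., P(K=n)], where K counts successes among
    independent Bernoulli trials with success probabilities `probs`."""
    pmf = [1.0]  # with zero trials, K = 0 with certainty
    for p in probs:
        nxt = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            nxt[k] += mass * (1.0 - p)  # this trial fails
            nxt[k + 1] += mass * p      # this trial succeeds
        pmf = nxt
    return pmf


def symmetric_ci_probability(probs, symmetric_fn):
    """P(effect) when the effect depends only on how many hidden
    variables are active; `symmetric_fn(k)` returns True or False."""
    pmf = poisson_binomial_pmf(probs)
    return sum(mass for k, mass in enumerate(pmf) if symmetric_fn(k))


# Noisy OR as a special case: the effect occurs if at least one is active.
noisy_or = symmetric_ci_probability([0.2, 0.5], lambda k: k >= 1)
```

The dynamic-programming loop costs O(n^2) for n causes, which is what makes a count-based formulation attractive compared with enumerating all 2^n hidden configurations.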

3.
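Bayesian defect analysis model and its application in software testing   Total citations: 1, self-citations: 0, citations by others: 1
A software defect analysis model based on Bayesian network theory is proposed for object-oriented software. The Bayesian network model is built by analysing the influence relationships among defective objects in the system; existing empirical data are used to estimate the defect probability distribution of each node in the network; and the model is combined with the software testing process to give testers decision support directly at the test design level. The model was applied to a real project and achieved good results.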

4.
A tight bound on concept learning   Total citations: 1, self-citations: 0, citations by others: 1
A tight bound on the generalization performance of concept learning is shown by a novel approach. Unlike existing theories, the new approach makes no large-sample assumption, as the Bayesian approach does, and does not consider uniform learnability, as the VC dimension analysis does. We analyze the generalization performance of some particular learning algorithm that is not necessarily well behaved, in the hope that once learning curves or sample complexity of this algorithm is obtained, it is applicable to real learning situations. The result is expressed in a dimension called Boolean interpolation dimension, and is tight in the sense that it meets the lower bound requirement of Baum and Haussler (1989). The Boolean interpolation dimension is not greater than the number of modifiable system parameters, and is definable for almost all real-world networks such as backpropagation networks and linear threshold multilayer networks. It is shown that the generalization error follows a beta distribution with parameters m, the number of training examples, and d, the Boolean interpolation dimension. This implies that for large d, the learning results tend to the average-case result, known as the self-averaging property of the learning. The bound is shown to be applicable to the practical learning algorithms that can be modeled by the Gibbs algorithm with a uniform prior. The result is also extended to the case of inconsistent learning.

5.
In designing a Bayesian network for an actual problem, developers need to bridge the gap between the mathematical abstractions offered by the Bayesian-network formalism and the features of the problem to be modelled. Qualitative probabilistic networks (QPNs) have been put forward as qualitative analogues to Bayesian networks, and allow modelling interactions in terms of qualitative signs. They thus have the advantage that developers can abstract from the numerical detail, and therefore the gap may not be as wide as for their quantitative counterparts. A notion that has been suggested in the literature to facilitate Bayesian-network development is causal independence. It allows exploiting compact representations of probabilistic interactions among variables in a network. In the paper, we deploy both causal independence and QPNs in developing and analysing a collection of qualitative, causal interaction patterns, called QC patterns. These are endowed with a fixed qualitative semantics, and are intended to offer developers a high-level starting point when developing Bayesian networks.

6.
Conditional probability tables (CPT) in many Bayesian networks often contain missing values. The problem of missing values in CPT is a very common problem and occurs due to the lack of data on certain scenarios that are observed in the real world but are missing in the training data. The current approaches of addressing the problem of missing values in CPT are very restrictive in that they assume certain probability distributions for estimating missing values. Recently, maximum entropy (ME) approaches have been used to learn features of probability distribution functions from the observed data. The ME approaches do not require any data distribution assumptions and are shown to work well for several non-parametric distributions. The ME and least square (LS) error minimizing approaches can be used for estimating missing values in CPT for Bayesian networks. The applications of ME and LS approaches for estimating missing CPT require researchers to solve difficult constrained non-linear optimization problems. These difficult constrained non-linear optimization problems can be solved using genetic algorithms.
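
To make the maximum entropy idea concrete, here is a minimal sketch of the simplest possible case, which is far easier than the constrained non-linear problems the abstract refers to. It assumes the only constraint on one CPT column is that its entries sum to 1; under that assumption the entropy-maximizing choice spreads the unassigned probability mass evenly over the missing entries. The function name `fill_missing_max_entropy` and the use of `None` for a missing value are illustrative choices, not taken from the paper.

```python
def fill_missing_max_entropy(column):
    """Fill missing entries (None) of one CPT column, i.e. the
    distribution of a node for a single parent configuration.

    Assumption: the only constraint is normalization, so maximum
    entropy assigns the leftover mass uniformly to the missing entries.
    """
    known_mass = sum(p for p in column if p is not None)
    num_missing = sum(1 for p in column if p is None)
    if known_mass > 1.0 + 1e-12:
        raise ValueError("known probabilities already exceed 1")
    if num_missing == 0:
        return list(column)
    share = (1.0 - known_mass) / num_missing
    return [share if p is None else p for p in column]


filled = fill_missing_max_entropy([0.5, None, None])
```

Once further constraints are added (for example known marginals across several columns), no closed form of this kind is generally available, which is where numerical optimizers such as the genetic algorithms mentioned above come in.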

7.
Shuliang  Wang  Surapunt  Tisinee 《Applied Intelligence》2022,52(9):10202-10219

The Bayesian network (BN) is a probability inference model to describe the explicit relationship between cause and effect, which may be examined in the complex system of rice price with data uncertainty. However, discovering the optimized structure from a super-exponential number of graphs in the search space is an NP-hard problem. In this paper, Bayesian Maximal Information Coefficient (BMIC) is proposed to uncover the causal correlations from a large data set in a random system by integrating probabilistic graphical model (PGM) and maximal information coefficient (MIC) with Bayesian linear regression (BLR). First, MIC is to capture the strong dependence between predictor variables and a target variable to reduce the number of variables for the BN structural learning of PGM. Second, BLR is responsible for assigning orientation in a graph resulting from a posterior probability distribution. It conforms to what BN needs to acquire a conditional probability distribution when given the parents for each node by the Bayes’ Theorem. Third, the Bayesian information criterion (BIC) is treated as an indicator to determine the well-explained model with its data to ensure correctness. The score shows that the proposed BMIC obtains the highest score compared to the two traditional learning algorithms. Finally, the proposed BMIC is applied to discover the causal correlations from the large data set on Thai rice price by identifying the causal changes in the paddy price of Jasmine rice. The results of the experiments show that the proposed BMIC returns directional relationships with clues to identify the cause(s) and effect(s) of paddy price with a better heuristic search.
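
The abstract uses the Bayesian information criterion as a model score without stating the formula. As a hedged sketch, the function below uses the higher-is-better form common in structure learning, log-likelihood minus a complexity penalty; whether the paper uses this sign convention is an assumption, and `bic_score` is a hypothetical name.

```python
import math


def bic_score(log_likelihood, num_params, num_samples):
    """BIC in the higher-is-better form:
    log L - (k / 2) * ln(n),
    where k is the number of free parameters and n the sample count.
    (The classical form k*ln(n) - 2*log L is -2 times this value.)
    """
    return log_likelihood - 0.5 * num_params * math.log(num_samples)


# Two candidate structures with the same fit: the sparser one wins.
sparse = bic_score(log_likelihood=-100.0, num_params=5, num_samples=100)
dense = bic_score(log_likelihood=-100.0, num_params=12, num_samples=100)
```

The penalty term grows with the number of CPT parameters, so a denser graph must improve the likelihood enough to pay for its extra edges.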


8.
In this article we describe an important structure used to model causal theories and a related problem of great interest to semi-empirical scientists. A causal Bayesian network is a pair consisting of a directed acyclic graph (called a causal graph) that represents causal relationships and a set of probability tables that, together with the graph, specify the joint probability of the variables represented as nodes in the graph. We briefly describe the probabilistic semantics of causality proposed by Pearl for this graphical probabilistic model, and how unobservable variables greatly complicate models and their application. A common question about causal Bayesian networks is the problem of identifying causal effects from nonexperimental data, which is called the identifiability problem. In the basic version of this problem, a semi-empirical scientist postulates a set of causal mechanisms and uses them, together with a probability distribution on the observable set of variables in a domain of interest, to predict the effect of a manipulation on some variable of interest. We explain this problem, provide several examples, and direct the readers to recent work that provides a solution to the problem and some of its extensions. We assume that the Bayesian network structure is given to us and do not address the problem of learning it from data and the related statistical inference and testing issues.

9.
Xintao  Yong   《Pattern recognition》2006,39(12):2439-2449
DNA microarray provides a powerful basis for analysis of gene expression. Bayesian networks, which are based on directed acyclic graphs (DAGs) and can provide models of causal influence, have been investigated for gene regulatory networks. The difficulty with this technique is that learning the Bayesian network structure is an NP-hard problem, as the number of DAGs is superexponential in the number of genes, and an exhaustive search is intractable. In this paper, we propose an enhanced constraint-based approach for causal structure learning. We integrate with graphical Gaussian modeling and use its independence graph as an input of our constraint-based causal learning method. We also present graphical decomposition techniques to further improve the performance. Our enhanced method makes it feasible to explore causal interactions among genes interactively. We have tested our methodology using two microarray data sets. The results show that the technique is both effective and efficient in exploring causal structures from microarray data.
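
The claim that the number of DAGs is superexponential can be checked numerically. The sketch below counts labeled DAGs with Robinson's recurrence, a standard combinatorial result that the abstract does not itself cite; the function name `count_dags` is hypothetical.

```python
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def count_dags(n):
    """Number of labeled directed acyclic graphs on n nodes, using
    a(n) = sum_{k=1..n} (-1)^(k+1) * C(n, k) * 2^(k*(n-k)) * a(n-k),
    with a(0) = 1 (inclusion-exclusion over the set of source nodes)."""
    if n == 0:
        return 1
    return sum(
        (-1) ** (k + 1) * comb(n, k) * 2 ** (k * (n - k)) * count_dags(n - k)
        for k in range(1, n + 1)
    )


counts = [count_dags(n) for n in range(1, 6)]  # 1, 3, 25, 543, 29281
```

With only five variables there are already 29281 candidate structures, which is why exhaustive search becomes hopeless at the scale of a gene expression data set.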

10.
In this paper we consider the determination of the structure of the high-order Boltzmann machine (HOBM), a stochastic recurrent network for approximating probability distributions. We obtain the structure of the HOBM, the hypergraph of connections, from conditional independences of the probability distribution to model. We assume that an expert provides these conditional independences and from them we build independence maps, Markov and Bayesian networks, which represent conditional independences through undirected graphs and directed acyclic graphs respectively. From these independence maps we construct the HOBM hypergraph. The central aim of this paper is to obtain a minimal hypergraph. Given that different orderings of the variables provide in general different Bayesian networks, we define their intersection hypergraph. We prove that the intersection hypergraph of all the Bayesian networks (N!) of the distribution is contained in the hypergraph of the Markov network and is simpler, and we give a procedure to determine a subset of the Bayesian networks that verifies this property. We also prove that the Markov network graph establishes a minimum connectivity for the hypergraphs from Bayesian networks.

11.
Maung and Paris [Internat J Intell Syst 1990, 5(5), 595–603] have shown that, in the general case, solving causal networks using maximum entropy techniques is NP complete. This paper considers multivalued causal inverted multiway trees, a nontrivial class of causal networks, in which any event can be influenced by any number of other events but itself only influences at most one event. We show that for this class of causal networks, maximum entropy can be used to find minimally prejudiced estimates for missing information. The techniques required for the current problem are substantially different from those used in the case of causal multiway trees in that nonlinear constraints arising from independence have to be incorporated. In addition, a new algebraic method is presented which isolates an unknown Lagrange multiplier by using the quotient of two pairs of state probabilities. Equating the joint probability distributions given by the Bayesian and maximum entropy models enables the Lagrange multipliers of the latter to be determined. An efficient iterative tree traversal algorithm which converges to the minimally prejudiced estimates for the missing information is described. When this information is added to that already provided, any existing method for updating the causal network can be used. ©1999 John Wiley & Sons, Inc.

12.
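A Bayesian network is a graphical model for representing the probability distribution over a set of variables. It offers a convenient way to represent probabilistic information and can express causal relationships, although it is not limited to them. Bayesian networks have strong reasoning ability for problems involving uncertainty and have attracted the attention of many researchers in recent years. Arc orientation in a Bayesian network is the process of determining the direction of the edges between variables once the dependency graph among the variables is already available. This paper introduces an improved arc orientation method for Bayesian networks that combines the strengths of several existing orientation methods; experiments show that the algorithm outperforms existing arc orientation methods.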

13.
Bayesian wavelet networks for nonparametric regression   Total citations: 2, self-citations: 0, citations by others: 2
Radial wavelet networks have been proposed previously as a method for nonparametric regression. We analyze their performance within a Bayesian framework. We derive probability distributions over both the dimension of the networks and the network coefficients by placing a prior on the degrees of freedom of the model. This process bypasses the need to test or select a finite number of networks during the modeling process. Predictions are formed by mixing over many models of varying dimension and parameterization. We show that the complexity of the models adapts to the complexity of the data and produces good results on a number of benchmark test series.

14.
A Bayesian network is a probabilistic representation for uncertain relationships, which has proven to be useful for modeling real-world problems. When there are many potential causes of a given effect, however, both probability assessment and inference using a Bayesian network can be difficult. In this paper, we describe causal independence, a collection of conditional independence assertions and functional relationships that are often appropriate to apply to the representation of the uncertain interactions between causes and effect. We show how the use of causal independence in a Bayesian network can greatly simplify probability assessment as well as probabilistic inference.
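
A small sketch can show why causal independence simplifies probability assessment. It uses the noisy-OR model, which is one well-known instance of causal independence; this abstract does not name a specific model, so that choice is an assumption, as are the name `noisy_or_cpt` and the optional `leak` parameter.

```python
from itertools import product


def noisy_or_cpt(link_probs, leak=0.0):
    """Build the full table P(effect = True | causes) from one
    parameter per cause.

    link_probs[i] is the probability that cause i, when present,
    produces the effect on its own; `leak` is the probability that
    the effect occurs with no modeled cause present.
    """
    cpt = {}
    for states in product([False, True], repeat=len(link_probs)):
        p_effect_off = 1.0 - leak
        for active, p in zip(states, link_probs):
            if active:
                p_effect_off *= 1.0 - p  # each active cause fails independently
        cpt[states] = 1.0 - p_effect_off
    return cpt


cpt = noisy_or_cpt([0.8, 0.6])
```

An expert supplies n numbers instead of the 2^n rows of an unrestricted table; the table here is expanded only for illustration, and inference methods that exploit causal independence avoid building it at all.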

15.
A Bayesian approach to estimate selection probabilities of probabilistic Boolean networks is developed in this study. The concepts of inverse Boolean function and updatable set are introduced to specify states which can be used to update a Bayesian posterior distribution. The analysis on convergence of the posteriors is carried out by exploiting the combination of semi‐tensor product technique and state decomposition algorithm for Markov chain. Finally, some numerical examples demonstrate the proposed estimation algorithm.

16.
Sampling is a fundamental method for generating data subsets. As many data analysis methods are developed based on probability distributions, maintaining distributions when sampling can help to ensure good data analysis performance. However, sampling a minimum subset while maintaining probability distributions is still a problem. In this paper, we decompose a joint probability distribution into a product of conditional probabilities based on Bayesian networks and use the chi-square test to formulate a sampling problem that requires that the sampled subset pass the distribution test to ensure the distribution. Furthermore, a heuristic sampling algorithm is proposed to generate the required subset by designing two scoring functions: one based on the chi-square test and the other based on likelihood functions. Experiments on four types of datasets with a size of 60000 show that when the significance level, α, is set to 0.05, the algorithm can exclude 99.9%, 99.0%, 93.1% and 96.7% of the samples based on their Bayesian networks (ASIA, ALARM, HEPAR2, and ANDES), respectively. When subsets of the same size are sampled, the subset generated by our algorithm passes all the distribution tests and the average distribution difference is approximately 0.03; by contrast, the subsets generated by random sampling pass only 83.8% of the tests, and the average distribution difference is approximately 0.24.
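
The distribution test in this abstract can be sketched with Pearson's chi-square statistic. This is not the paper's sampling algorithm or its scoring functions; it only shows, for a single discrete distribution, how a sampled subset's counts could be compared with reference probabilities. The function names are hypothetical, and the critical value is passed in by the caller (3.841 is the standard value for one degree of freedom at α = 0.05).

```python
def chi_square_statistic(observed_counts, expected_probs):
    """Pearson's statistic: sum over categories of (O - E)^2 / E,
    where E = n * expected probability and n is the subset size."""
    n = sum(observed_counts)
    return sum(
        (o - n * p) ** 2 / (n * p)
        for o, p in zip(observed_counts, expected_probs)
    )


def passes_distribution_test(observed_counts, expected_probs, critical_value):
    """True if the subset is NOT significantly different from the
    reference distribution at the level implied by `critical_value`."""
    return chi_square_statistic(observed_counts, expected_probs) <= critical_value


# A binary variable with reference distribution (0.5, 0.5), one degree of freedom.
balanced_ok = passes_distribution_test([52, 48], [0.5, 0.5], critical_value=3.841)
skewed_ok = passes_distribution_test([60, 40], [0.5, 0.5], critical_value=3.841)
```

In the setting described above, a test of this kind would be applied to each conditional distribution of the Bayesian network factorization, so a subset passes only if every factor is preserved.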

17.
18.
Dependency networks approximate a joint probability distribution over multiple random variables as a product of conditional distributions. Relational Dependency Networks (RDNs) are graphical models that extend dependency networks to relational domains. This higher expressivity, however, comes at the expense of a more complex model-selection problem: an unbounded number of relational abstraction levels might need to be explored. Whereas current learning approaches for RDNs learn a single probability tree per random variable, we propose to turn the problem into a series of relational function-approximation problems using gradient-based boosting. In doing so, one can easily induce highly complex features over several iterations and in turn estimate quickly a very expressive model. Our experimental results in several different data sets show that this boosting method results in efficient learning of RDNs when compared to state-of-the-art statistical relational learning approaches.

19.
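To address the problem that causal relationships among civil aviation emergencies cannot be effectively evaluated or analysed for correlation, a Bayesian-network-based method for analysing the causal relationships of civil aviation emergencies is proposed. Bayesian theory is introduced on top of a domain ontology for civil aviation emergency management. First, rules are designed to convert the concepts, relations and instances of the domain ontology into a Bayesian network. Then the Bayesian network knowledge integration algorithm E-IPFP is used to construct the conditional probability tables of the network nodes, and a message-passing mechanism computes the probabilistic relationships between parent and child nodes, yielding the probability distribution of the causal relationships among civil aviation emergencies. Using the civil aviation emergency management domain ontology and cases from world civil aviation accident investigation follow-up reports as experimental data, an analysis of the causal relationships among civil aviation emergencies is given, providing methodological support for big-data-based correlation analysis and reasoning about emergencies.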

20.
Bayesian networks are graphical models that describe dependency relationships between variables, and are powerful tools for studying probability classifiers. At present, the causal Bayesian network learning method is used in constructing Bayesian network classifiers, while the contribution of attributes to the class is overlooked. In this paper, a Bayesian network designed specifically for classification, the restricted Bayesian classification network, is proposed. Combining dependency analysis between variables, classification accuracy evaluation criteria and a search algorithm, a learning method for restricted Bayesian classification networks is presented. Experiments and analysis are done using data sets from the UCI machine learning repository. The results show that the restricted Bayesian classification network is more accurate than other well-known classifiers.
