首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 78 毫秒
1.
There is a commonly held opinion that the algorithms for learning unrestricted types of Bayesian networks, especially those based on the score+search paradigm, are not suitable for building competitive Bayesian network-based classifiers. Several specialized algorithms that carry out the search into different types of directed acyclic graph (DAG) topologies have since been developed, most of these being extensions (using augmenting arcs) or modifications of the Naive Bayes basic topology. In this paper, we present a new algorithm to induce classifiers based on Bayesian networks which obtains excellent results even when standard scoring functions are used. The method performs a simple local search in a space unlike unrestricted or augmented DAGs. Our search space consists of a type of partially directed acyclic graph (PDAG) which combines two concepts of DAG equivalence: classification equivalence and independence equivalence. The results of exhaustive experimentation indicate that the proposed method can compete with state-of-the-art algorithms for classification.Editors: Pedro Larrañaga, Jose A. Lozano, Jose M. Peña and Iñaki Inza  相似文献   

2.
Node order is one of the most important factors in learning the structure of a Bayesian network (BN) for probabilistic reasoning. To improve the BN structure learning, we propose a node order learning algorithmbased on the frequently used Bayesian information criterion (BIC) score function. The algorithm dramatically reduces the space of node order and makes the results of BN learning more stable and effective. Specifically, we first find the most dependent node for each individual node, prove analytically that the dependencies are undirected, and then construct undirected subgraphs UG. Secondly, the UG- is examined and connected into a single undirected graph UGC. The relation between the subgraph number and the node number is analyzed. Thirdly, we provide the rules of orienting directions for all edges in UGC, which converts it into a directed acyclic graph (DAG). Further, we rank the DAG’s topology order and describe the BIC-based node order learning algorithm. Its complexity analysis shows that the algorithm can be conducted in linear time with respect to the number of samples, and in polynomial time with respect to the number of variables. Finally, experimental results demonstrate significant performance improvement by comparing with other methods.  相似文献   

3.
In this paper, a novel method for structure learning of a Bayesian network (BN) is developed. A new genetic approach called the matrix genetic algorithm (MGA) is proposed. In this method, an individual structure is represented as a matrix chromosome and each matrix chromosome is encoded as concatenation of upper and lower triangular parts. The two triangular parts denote the connection in the BN structure. Further, new genetic operators are developed to implement the MGA. The genetic operators are closed in the set of the directed acyclic graph (DAG). Finally, the proposed scheme is applied to real world and benchmark applications, and its effectiveness is demonstrated through computer simulation.  相似文献   

4.
《Computer Networks》2003,41(1):73-88
To provide real-time service or engineer constrained-based paths, networks require the underlying routing algorithm to be able to find low-cost paths that satisfy given quality-of-service constraints. However, the problem of constrained shortest (least-cost) path routing is known to be NP-hard, and some heuristics have been proposed to find a near-optimal solution. However, these heuristics either impose relationships among the link metrics to reduce the complexity of the problem which may limit the general applicability of the heuristic, or are too costly in terms of execution time to be applicable to large networks. In this paper, we focus on solving the delay-constrained minimum-cost path problem, and present a fast algorithm to find a near-optimal solution. This algorithm, called delay-cost-constrained routing (DCCR), is a variant of the k-shortest-path algorithm. DCCR uses a new adaptive path weight function together with an additional constraint imposed on the path cost, to restrict the search space. Thus, DCCR can return a near-optimal solution in a very short time. Furthermore, we use a variant of the Lagrangian relaxation method proposed by Handler and Zang [Networks 10 (1980) 293] to further reduce the search space by using a tighter bound on path cost. This makes our algorithm more accurate and even faster. We call this improved algorithm search space reduction + DCCR (SSR + DCCR). Through extensive simulations, we confirm that SSR + DCCR performs very well compared to the optimal but very expensive solution.  相似文献   

5.
Bayesian networks (BNs) have gained increasing attention in recent years. One key issue in Bayesian networks is parameter learning. When training data is incomplete or sparse or when multiple hidden nodes exist, learning parameters in Bayesian networks becomes extremely difficult. Under these circumstances, the learning algorithms are required to operate in a high-dimensional search space and they could easily get trapped among copious local maxima. This paper presents a learning algorithm to incorporate domain knowledge into the learning to regularize the otherwise ill-posed problem, to limit the search space, and to avoid local optima. Unlike the conventional approaches that typically exploit the quantitative domain knowledge such as prior probability distribution, our method systematically incorporates qualitative constraints on some of the parameters into the learning process. Specifically, the problem is formulated as a constrained optimization problem, where an objective function is defined as a combination of the likelihood function and penalty functions constructed from the qualitative domain knowledge. Then, a gradient-descent procedure is systematically integrated with the E-step and M-step of the EM algorithm, to estimate the parameters iteratively until it converges. The experiments with both synthetic data and real data for facial action recognition show our algorithm improves the accuracy of the learned BN parameters significantly over the conventional EM algorithm.  相似文献   

6.
Shuliang  Wang  Surapunt  Tisinee 《Applied Intelligence》2022,52(9):10202-10219

The Bayesian network (BN) is a probability inference model to describe the explicit relationship between cause and effect, which may be examined in the complex system of rice price with data uncertainty. However, discovering the optimized structure from a super-exponential number of graphs in the search space is an NP-hard problem. In this paper, Bayesian Maximal Information Coefficient (BMIC) is proposed to uncover the causal correlations from a large data set in a random system by integrating probabilistic graphical model (PGM) and maximal information coefficient (MIC) with Bayesian linear regression (BLR). First, MIC is to capture the strong dependence between predictor variables and a target variable to reduce the number of variables for the BN structural learning of PGM. Second, BLR is responsible for assigning orientation in a graph resulting from a posterior probability distribution. It conforms to what BN needs to acquire a conditional probability distribution when given the parents for each node by the Bayes’ Theorem. Third, the Bayesian information criterion (BIC) is treated as an indicator to determine the well-explained model with its data to ensure correctness. The score shows that the proposed BMIC obtains the highest score compared to the two traditional learning algorithms. Finally, the proposed BMIC is applied to discover the causal correlations from the large data set on Thai rice price by identifying the causal changes in the paddy price of Jasmine rice. The results of the experiments show that the proposed BMIC returns directional relationships with clues to identify the cause(s) and effect(s) of paddy price with a better heuristic search.

  相似文献   

7.
Computing the posterior probability distribution for a set of query variables by search result is an important task of inferences with a Bayesian network. Starting from real applications, it is also necessary to make inferences when the evidence is not contained in training data. In this paper, we are to augment the learning function to Bayesian network inferences, and extend the classical “search”-based inferences to “search + learning”-based inferences. Based on the support vector machine, we use a class of hyperplanes to construct the hypothesis space. Then we use the method of solving an optimal hyperplane to find a maximum likelihood hypothesis for the value not contained in training data. Further, we give a convergent Gibbs sampling algorithm for approximate probabilistic inference with the presence of maximum likelihood parameters. Preliminary experiments show the feasibility of our proposed methods.  相似文献   

8.
贝叶斯网结构学习是一个NP难题,提高学习效率是重要研究问题之一。贝叶斯网结构空间的规模随节点(随机变量)数呈指数增加,选择适当的结构空间可以提高学习效率。本文对贝叶斯网结构空间进行定性和定量分析,对比有向图空间、贝叶斯网空间和马尔科夫等价类空间的规模和特点。通过实验数据分析先验结构空间约束对降低结构空间规模的效率,给出约束参数的选择区间。为贝叶斯网结构学习选择搜索空间和确定约束参数提供理论支持,从而提高学习效率。  相似文献   

9.
一种快速的贝叶斯网结构学习算法   总被引:1,自引:0,他引:1  
贝叶斯网是不确定性问题知识表达和推理中最重要的一个理论模型.迄今为止人们提出了许多贝叶斯网结构学习算法,基于约束满足和评分搜索相结合的混合方法是其中的一个研究热点.以I—B&B—MDL为基础,提出了一种快速的学习算法.新算法不仅利用约束知识来压缩搜索空间,而且还用它作为启发知识来引导搜索.首先利用0阶和少量的1阶测试有效地限制搜索空间,获得网络候选的连接图,减少了独立性测试及对数据库的扫描次数,然后利用互信息作为启发性知识来引导搜索,增加了B&B搜索树的截断.在通用数据集上的实验表明:快速算法能够有效地处理大规模数据,且学习速度有较大改进.  相似文献   

10.
基于无约束优化和遗传算法,提出一种学习贝叶斯网络结构的限制型遗传算法.首先构造一无约束优化问题,其最优解对应一个无向图.在无向图的基础上,产生遗传算法的初始种群,并使用遗传算法中的选择、交叉和变异算子学习得到最优贝叶斯网络结构.由于产生初始种群的空间是由一些最优贝叶斯网络结构的候选边构成,初始种群具有很好的性质.与直接使用遗传算法学习贝叶斯网络结构的效率相比,该方法的学习效率相对较高.  相似文献   

11.
Frequent counting is a very so often required operation in machine learning algorithms. A typical machine learning task, learning the structure of Bayesian network (BN) based on metric scoring, is introduced as an example that heavily relies on frequent counting. A fast calculation method for frequent counting enhanced with two cache layers is then presented for learning BN. The main contribution of our approach is to eliminate comparison operations for frequent counting by introducing a multi-radix number system calculation. Both mathematical analysis and empirical comparison between our method and state-of-the-art solution are conducted. The results show that our method is dominantly superior to state-of-the-art solution in solving the problem of learning BN.  相似文献   

12.
朱明敏  刘三阳  汪春峰 《自动化学报》2011,37(12):1514-1519
针对小样本数据集下学习贝叶斯网络 (Bayesian networks, BN)结构的不足, 以及随着条件集的增大, 利用统计方法进行条件独立 (Conditional independence, CI) 测试不稳定等问题, 提出了一种基于先验节点序学习网络结构的优化方法. 新方法通过定义优化目标函数和可行域空间, 首次将贝叶斯网络结构学习问题转化为求解目标函数极值的数学规划问题, 并给出最优解的存在性及唯一性证明, 为贝叶斯网络的不断扩展研究提出了新的方案. 理论证明以及实验结果显示了新方法的正确性和有效性.  相似文献   

13.
In this paper, we propose a method for solving constrained optimization problems using interval analysis combined with particle swarm optimization. A set inverter via interval analysis algorithm is used to handle constraints in order to reduce constrained optimization to quasi unconstrained one. The algorithm is useful in the detection of empty search spaces, preventing useless executions of the optimization process. To improve computational efficiency, a space cleaning algorithm is used to remove solutions that are certainly not optimal. As a result, the search space becomes smaller at each step of the optimization procedure. After completing pre-processing, a modified particle swarm optimization algorithm is applied to the reduced search space to find the global optimum. The efficiency of the proposed approach is demonstrated through comprehensive experimentation involving 100 000 runs on a set of well-known benchmark constrained engineering design problems. The computational efficiency of the new method is quantified by comparing its results with other PSO variants found in the literature.  相似文献   

14.
15.
目前贝叶斯网络(Bayesian networks, BN)的传统结构学习算法在处理高维数据时呈现出计算负担过大、在合理时间内难以得到期望精度结果的问题.为了在高维数据下学习稀疏BN的最优结构, 本文提出了一种学习稀疏BN最优结构的改进K均值分块学习算法.该算法采用分而治之的策略, 首先采用互信息作为节点间距离度量, 利用融合互信息的改进K均值算法对网络分块; 其次, 使用MMPC (Max-min parent and children)算法得到整个网络的架构, 根据架构找到块间所有边的可能连接方向, 从而找到所有可能的图结构; 之后, 对所有图结构依次进行结构学习; 最终利用评分找到最优BN.实验证明, 相比现有分块结构学习算法, 本文提出的算法不仅习得了网络的精确结构, 且学习速度有一定提高; 相比非分块经典结构学习算法, 本文提出的算法在保证精度基础上, 学习速度大幅提高, 解决了非分块经典结构学习算法无法在合理时间内处理高维数据的难题.  相似文献   

16.
Bayesian Network (BN) fusion provides a precise theoretical framework for aggregating the graphical structure of a set of BNs into a consensus network. The fusion process depends on a total ordering of the variables, but both the problem of searching for an optimal consensus structure (according to the standard problem definition) as well as the one of looking for the optimal ordering are NP-hard.In this paper we start with this theoretical framework and extend it from a practical point of view. The two main methodological contributions we make are: (1) an adaptation of the well-known BN learning algorithm GES (Greedy Equivalence Search) to the case of having a set of BNs as input instead of data (we prove the correctness of the adapted algorithm, i.e., under certain standard assumptions the optimal solution for the BN fusion process is obtained); and (2) a heuristic method for identifying a suitable order of the variables, which allows us to obtain consensus BNs having (far) fewer edges than those obtained by using random orderings.Finally, we design several computational experiments to test our proposals. From the results, we can conclude that the consensus network can be efficiently obtained by using the proposed heuristic to compute the total order of the variables, with this DAG being very close to the optimal one.  相似文献   

17.
This paper considers a single-machine scheduling problem with power-down mechanism to minimize both total energy consumption and maximum tardiness. The aim is to find an optimal processing sequence of jobs and determine if the machine should be executed a power-down operation between two consecutive jobs. To formulate the problem, a mixed-integer linear programming (MILP) model is developed. Then a basic ε  constraint method is proposed to obtain the complete Pareto front of the problem. Considering the particularity of the problem, we also develop local search, preprocessing technique and valid inequalities to strengthen the basic ε  constraint method. Finally, to obtain approximate Pareto fronts for large-size problems, we utilize the method of cluster analysis to divide the jobs into several sorted clusters according to their release times and due dates. Any job in a preceding cluster must be processed before all jobs in a subsequent cluster. Thus, the solution space is reduced significantly. Computational experiments on benchmark and randomly generated instances demonstrate the effectiveness of the proposed exact and approximation approaches.  相似文献   

18.
Bayesian networks (BNs) are knowledge representation tools capable of representing dependence or independence relationships among random variables. Learning the structure of BNs from datasets has received increasing attention in the last two decades, due to the BNs' capacity of providing good inference models and discovering the structure of complex domains. One approach for BNs' structure learning from data is to define a scoring metric that evaluates the quality of the candidate networks, given a dataset, and then apply an optimization procedure to explore the set of candidate networks. Among the most frequently used optimization methods for BN score-based learning is greedy hill climbing (GHC) search. This paper proposes a new local discovery ant colony optimization (ACO) algorithm and a hybrid algorithm max-min ant colony optimization (MMACO), based on the local discovery algorithm max-min parents and children (MMPC) and ACO to learn the structure of a BN. In MMACO, MMPC is used to construct the skeleton of the BN and ACO is used to orientate the skeleton edges, thus returning the final structure. The algorithms are applied to several sets of benchmark networks and are shown to outperform the GHC and simulated annealing algorithms.   相似文献   

19.
We introduce a GPU-based parallel vertex substitution (pVS) algorithm for the p-median problem using the CUDA architecture by NVIDIA. pVS is developed based on the best profit search algorithm, an implementation of vertex substitution (VS), that is shown to produce reliable solutions for p-median problems. In our approach, each candidate solution in the entire search space is allocated to a separate thread, rather than dividing the search space into parallel subsets. This strategy maximizes the usage of GPU parallel architecture and results in a significant speedup and robust solution quality. Computationally, pVS reduces the worst case complexity from sequential VS’s O(p · n2) to O(p · (n ? p)) on each thread by parallelizing computational tasks on GPU implementation. We tested the performance of pVS on two sets of numerous test cases (including 40 network instances from OR-lib) and compared the results against a CPU-based sequential VS implementation. Our results show that pVS achieved a speed gain ranging from 10 to 57 times over the traditional VS in all test network instances.  相似文献   

20.
Within probabilistic classification problems, learning the Markov boundary of the class variable consists in the optimal approach for feature subset selection. In this paper we propose two algorithms that learn the Markov boundary of a selected variable. These algorithms are based on the score+search paradigm for learning Bayesian networks. Both algorithms use standard scoring functions but they perform the search in constrained spaces of class-focused directed acyclic graphs, going through the space by means of operators adapted for the problem. The algorithms have been validated experimentally by using a wide spectrum of databases, and their results show a performance competitive with the state-of-the-art.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号