Found 20 similar documents (search time: 15 ms)
1.
Boosted Bayesian network classifiers (cited by: 2; self-citations: 0; citations by others: 2)
The use of Bayesian networks for classification problems has received a significant amount of recent attention. Although computationally
efficient, the standard maximum likelihood learning method tends to be suboptimal due to the mismatch between its optimization
criteria (data likelihood) and the actual goal of classification (label prediction accuracy). Recent approaches to optimizing
classification performance during parameter or structure learning show promise, but lack the favorable computational properties
of maximum likelihood learning. In this paper we present boosted Bayesian network classifiers, a framework to combine discriminative
data-weighting with generative training of intermediate models. We show that boosted Bayesian network classifiers encompass
the basic generative models in isolation, but improve their classification performance when the model structure is suboptimal.
We also demonstrate that structure learning is beneficial in the construction of boosted Bayesian network classifiers. On
a large suite of benchmark data-sets, this approach outperforms generative graphical models such as naive Bayes and TAN in
classification accuracy. Boosted Bayesian network classifiers have comparable or better performance in comparison to other
discriminatively trained graphical models including ELR and BNC. Furthermore, boosted Bayesian networks require significantly
less training time than the ELR and BNC algorithms.
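The combination described in this abstract, discriminative data weighting wrapped around generative training, can be illustrated with a small sketch. The abstract does not give the algorithm's details, so everything below is an assumption: the base model is a plain naive Bayes classifier fitted by weighted, Laplace-smoothed counting, the reweighting is standard binary AdaBoost with labels in {-1, +1}, and the function names (`fit_weighted_nb`, `fit_boosted_nb`, `boosted_predict`) are hypothetical.

```python
import math

def fit_weighted_nb(X, y, w, alpha=1.0):
    """Generative step: naive Bayes fitted by weighted, Laplace-smoothed counts."""
    n, d = len(X), len(X[0])
    classes = sorted(set(y))
    values = [sorted({row[j] for row in X}) for j in range(d)]
    class_mass = {c: 0.0 for c in classes}
    counts = {(c, j, v): 0.0 for c in classes for j in range(d) for v in values[j]}
    for row, label, wi in zip(X, y, w):
        mass = wi * n  # rescale so the weights sum to n, like raw counts
        class_mass[label] += mass
        for j, v in enumerate(row):
            counts[(label, j, v)] += mass
    log_prior = {c: math.log((class_mass[c] + alpha) / (n + alpha * len(classes)))
                 for c in classes}
    log_cond = {(c, j, v): math.log((counts[(c, j, v)] + alpha)
                                    / (class_mass[c] + alpha * len(values[j])))
                for (c, j, v) in counts}
    return classes, log_prior, log_cond

def nb_predict(model, row):
    """Return the class with the highest joint log-probability."""
    classes, log_prior, log_cond = model
    def score(c):
        return log_prior[c] + sum(log_cond[(c, j, v)] for j, v in enumerate(row))
    return max(classes, key=score)

def fit_boosted_nb(X, y, rounds=10):
    """Discriminative step: AdaBoost reweighting around the generative learner."""
    n = len(X)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        model = fit_weighted_nb(X, y, w)
        preds = [nb_predict(model, row) for row in X]
        err = sum(wi for wi, p, t in zip(w, preds, y) if p != t)
        if err >= 0.5:  # no better than chance on the weighted data: stop
            break
        beta = 0.5 * math.log((1.0 - err + 1e-10) / (err + 1e-10))
        ensemble.append((beta, model))
        if err == 0.0:  # nothing left to reweight
            break
        # Increase the weight of misclassified examples, then renormalise.
        w = [wi * math.exp(-beta * t * p) for wi, p, t in zip(w, preds, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def boosted_predict(ensemble, row):
    """Weighted vote of the intermediate models; labels are -1 or +1."""
    vote = sum(beta * nb_predict(model, row) for beta, model in ensemble)
    return 1 if vote >= 0 else -1
```

Each round costs one counting pass over the data, which is consistent with the abstract's point that the intermediate models keep the cheap training of maximum likelihood learning. The sketch assumes discrete features and that every feature value seen at prediction time also occurred in training.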
2.
Chaoqun Li. Journal of Experimental & Theoretical Artificial Intelligence, 2013, 25(4): 477-491
Many distance metrics have been proposed to measure the difference between two instances. Among them, the Short and Fukunaga metric (SFM) and the minimum risk metric (MRM) are two probability-based metrics widely used to find a reasonable distance between each pair of instances with nominal attributes only. For simplicity, existing work uses naive Bayesian (NB) classifiers to estimate the class membership probabilities in SFM and MRM. However, NB classifiers have been shown to estimate class probabilities poorly. To scale up the classification performance of NB classifiers, many augmented NB classifiers have been proposed. In this paper, we study the class probability estimation performance of these augmented NB classifiers and then use them to estimate the class membership probabilities in SFM and MRM. Experimental results on a large number of University of California, Irvine (UCI) data-sets show that using these augmented NB classifiers to estimate the class membership probabilities in SFM and MRM can significantly enhance their generalisation ability.
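For orientation, the two metrics can be sketched as functions of the class membership probabilities of two instances. The abstract does not state the formulas, so the forms below are an assumption based on how SFM and MRM are commonly quoted: SFM sums the absolute differences of the class posteriors, and MRM sums P(c|x)(1 - P(c|y)) over the classes. The function names are hypothetical, and the probability vectors would come from whichever NB or augmented NB classifier is used.

```python
def sfm_distance(p_x, p_y):
    """Short and Fukunaga metric (assumed form): sum over classes of |P(c|x) - P(c|y)|."""
    return sum(abs(a - b) for a, b in zip(p_x, p_y))

def mrm_distance(p_x, p_y):
    """Minimum risk metric (assumed form): sum over classes of P(c|x) * (1 - P(c|y))."""
    return sum(a * (1.0 - b) for a, b in zip(p_x, p_y))
```

Because both functions depend only on the estimated posteriors, any improvement in class probability estimation feeds directly into the distances, which is the dependency the abstract studies.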
3.
The term positive unlabeled learning refers to the binary classification problem in the absence of negative examples. When only positive and unlabeled instances are available, semi-supervised classification algorithms cannot be directly applied, and thus new algorithms are required. One of these positive unlabeled learning algorithms is the positive naive Bayes (PNB), which is an adaptation of the naive Bayes induction algorithm that does not require negative instances. In this work we propose two ways of enhancing this algorithm. On the one hand, we have taken the concept behind PNB one step further, proposing a procedure to build more complex Bayesian classifiers in the absence of negative instances. We present a new algorithm (named positive tree augmented naive Bayes, PTAN) to obtain tree augmented naive Bayes models in the positive unlabeled domain. On the other hand, we propose a new Bayesian approach to deal with the a priori probability of the positive class that models the uncertainty over this parameter by means of a Beta distribution. This approach is applied to both PNB and PTAN, resulting in two new algorithms. The four algorithms are empirically compared in positive unlabeled learning problems based on real and synthetic databases. The results obtained in these comparisons suggest that, when the predicting variables are not conditionally independent given the class, the extension of PNB to more complex networks increases the classification performance. They also show that our Bayesian approach to the a priori probability of the positive class can improve the results obtained by PNB and PTAN.
4.
Bayesian networks (BNs) have gained increasing attention in recent years. One key issue in Bayesian networks is parameter learning. When training data is incomplete or sparse, or when multiple hidden nodes exist, learning parameters in Bayesian networks becomes extremely difficult. Under these circumstances, the learning algorithms are required to operate in a high-dimensional search space and can easily get trapped among copious local maxima. This paper presents a learning algorithm that incorporates domain knowledge into learning in order to regularize the otherwise ill-posed problem, limit the search space, and avoid local optima. Unlike conventional approaches, which typically exploit quantitative domain knowledge such as prior probability distributions, our method systematically incorporates qualitative constraints on some of the parameters into the learning process. Specifically, the problem is formulated as a constrained optimization problem, where the objective function is defined as a combination of the likelihood function and penalty functions constructed from the qualitative domain knowledge. Then, a gradient-descent procedure is systematically integrated with the E-step and M-step of the EM algorithm to estimate the parameters iteratively until convergence. Experiments with both synthetic data and real data for facial action recognition show that our algorithm significantly improves the accuracy of the learned BN parameters over the conventional EM algorithm.
5.
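Li Peng. Computer Engineering and Design, 2013, 34(9)
An image segmentation method is implemented by applying a probabilistic rank fusion model and a Markov random field (MRF) to a cluster-analysis model; the method segments images more accurately and ultimately yields the corresponding fusion model. Combined with a prior probability distribution, this energy-based Gibbs model allows its parameters to be specified, and the maximum-probability rank estimate is fused with a simple, fast estimation method to obtain the segmentation result. The fusion framework was applied successfully to the Berkeley standard image database; the experimental results show effective visual assessments and quantitative performance measures, and the results stand out against existing segmentation methods.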
6.
This paper presents a study of two learning criteria and two approaches to using them for training neural network classifiers, specifically Multi-Layer Perceptron (MLP) and Radial Basis Function (RBF) networks. The first approach, which is a traditional one, relies on two popular learning criteria, i.e. learning via minimising a Mean Squared Error (MSE) function or a Cross Entropy (CE) function. It is shown that the two criteria have different characteristics in learning speed and outlier effects, and that this approach does not necessarily result in a minimal classification error. To be suitable for classification tasks, in our second approach an empirical classification criterion is introduced for the testing process while using the MSE or CE function for the training. Experimental results on several benchmarks indicate that the second approach, compared with the first, leads to an improved generalisation performance, and that the use of the CE function, compared with the MSE function, gives a faster training speed and improved or equal generalisation performance.
Nomenclature
- x: random input vector with d real-number components [x_1 ... x_d]
- t: random target vector with c binary components [t_1 ... t_c]
- y(·): neural network function or output vector
- [symbol missing]: parameters of a neural model
- [symbol missing]: learning rate
- [symbol missing]: momentum
- [symbol missing]: decay factor
- O: objective function
- E: mean sum-of-squares error function
- L: cross entropy function
- n: nth training pattern
- N: number of training patterns
- [symbol missing](·): transfer function in a neural unit
- z_j: output of hidden unit-j
- a_i: activation of unit-j
- W_ij: weight from hidden unit-j to output unit-i
- W_jl^0: weight from input unit-l to hidden unit-j
- [symbol missing]_j: centre vector, with components indexed j1 ... jd, of RBF unit-j
- [symbol missing]_j: width vector, with components indexed j1 ... jd, of RBF unit-j
- p(·|·): conditional probability function
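One standard way to see why the CE criterion can train faster than MSE is to compare the gradients that reach a saturated output unit. The sketch below assumes a single sigmoid output unit with activation `a` and binary target `t`; this setting and the function names are illustrative assumptions, not details taken from the paper.

```python
import math

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def mse_loss(a, t):
    """Squared error 0.5 * (y - t)^2 for a sigmoid output y."""
    return 0.5 * (sigmoid(a) - t) ** 2

def ce_loss(a, t):
    """Cross entropy -[t log y + (1 - t) log(1 - y)] for a sigmoid output y."""
    y = sigmoid(a)
    return -(t * math.log(y) + (1.0 - t) * math.log(1.0 - y))

def mse_grad(a, t):
    """d(mse_loss)/da: the error is scaled by the sigmoid slope y * (1 - y)."""
    y = sigmoid(a)
    return (y - t) * y * (1.0 - y)

def ce_grad(a, t):
    """d(ce_loss)/da: the sigmoid slope cancels, leaving y - t."""
    return sigmoid(a) - t
```

For a confidently wrong unit (for example `a = -5` with `t = 1`) the slope factor `y * (1 - y)` is close to zero, so the MSE gradient nearly vanishes while the CE gradient stays close to -1.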
7.
This paper introduces and evaluates a new class of knowledge model, the recursive Bayesian multinet (RBMN), which encodes the joint probability distribution of a given database. RBMNs extend Bayesian networks (BNs) as well as partitional clustering systems. Briefly, an RBMN is a decision tree with component BNs at the leaves. An RBMN is learnt using a greedy, heuristic approach akin to that used by many supervised decision tree learners, but where BNs are learnt at the leaves using constructive induction. A key idea is to treat expected data as real data. This allows us to complete the database and to take advantage of a closed form for the marginal likelihood of the expected complete data that factorizes into separate marginal likelihoods for each family (a node and its parents). Our approach is evaluated on synthetic and real-world databases.
8.
This paper proposes an approach that detects surface defects with three-dimensional characteristics on scale-covered steel blocks. The surface reflection properties of the flawless surface change strongly. Light sectioning is used to acquire the surface range data of the steel block. These sections are arbitrarily located within a range of a few millimeters due to vibrations of the steel block on the conveyor. After the recovery of the depth map, segments of the surface are classified according to a set of extracted features by means of Bayesian network classifiers. For establishing the structure of the Bayesian network, a floating search algorithm is applied, which achieves a good tradeoff between classification performance and computational efficiency for structure learning. This search algorithm enables conditional exclusions of previously added attributes and/or arcs from the network. The experiments show that the selective unrestricted Bayesian network classifier outperforms the naïve Bayes and the tree-augmented naïve Bayes decision rules in classification rate. More than 98% of the surface segments have been classified correctly.
9.
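A Bayesian network structure learning algorithm for data with missing values (cited by: 2; self-citations: 0; citations by others: 2)
Learning Bayesian network structure from data with missing values mainly relies on score-and-search methods combined with the EM algorithm, whose efficiency and reliability are rather low. To address this, a new structure learning algorithm for Bayesian networks with missing data is developed. The method first uses the Kullback-Leibler (KL) divergence to express the similarity between the cases of the same node, then obtains values for the missing data by Gibbs sampling. Finally, heuristic search completes the learning of the Bayesian network structure. The method effectively avoids the exponential complexity of standard Gibbs sampling and the main problems of existing learning methods.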
10.
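Traditional recurrent neural network models face exploding or vanishing gradients when handling long-term dependencies, and their many parameters make training slow. To address this, a text classification method based on a bidirectional GRU neural network and a Bayesian classifier is proposed. The bidirectional GRU network extracts text features, the TF-IDF algorithm assigns weights, and a Bayesian classifier performs the classification; this remedies the unidirectional GRU's insufficient dependence on the following text, reduces the number of parameters, shortens training time, and improves text classification efficiency. Comparative simulation experiments on two kinds of text data show that, compared with traditional recurrent neural networks, the algorithm effectively improves the efficiency and accuracy of text classification.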
11.
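Wang Shuangcheng. Computer Engineering and Applications, 2005, 41(18): 11-12, 187
The concept of k-th order classification ability between variables and a method for computing it are given. It is proved that k-th order classification ability is exactly the k-th order classification accuracy, and that k-th order classification ability is equivalent to conditional independence. On this basis, a classification-ability-based scoring function for Bayesian network structures is constructed, and an effective Bayesian network structure learning method is built by combining dependency analysis with score-and-search. Experimental results show that the method learns Bayesian network structures effectively and that the learned structures tend to be simple.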
12.
13.
Integrating ontological modelling and Bayesian inference for pattern classification in topographic vector data (cited by: 1; self-citations: 0; citations by others: 1)
Patrick Lüscher, Robert Weibel, Dirk Burghardt. Computers, Environment and Urban Systems, 2009, 33(5): 363
This paper presents an ontology-driven approach for spatial database enrichment in support of map generalisation. Ontology-driven spatial database enrichment is a promising means to provide better transparency, flexibility and reusability in comparison to purely algorithmic approaches. Geographic concepts manifested in spatial patterns are formalised by means of ontologies that are used to trigger appropriate low level pattern recognition techniques. The paper focuses on inference in the presence of vagueness, which is common in definitions of spatial phenomena, and on the influence of the complexity of spatial measures on classification accuracy. The concept of the English terraced house serves as an example to demonstrate how geographic concepts can be modelled in an ontology for spatial database enrichment. Owing to its good integration into ontologies and its ability to deal with vague definitions, supervised Bayesian inference is used for inferring complex concepts. The approach is validated in experiments using large vector datasets representing buildings of four different cities. We compare classification results obtained with the proposed approach to results produced by a more traditional ontology approach. The proposed approach performed considerably better than the traditional ontology approach. Besides clarifying the benefits of using ontologies in spatial database enrichment, our research demonstrates that Bayesian networks are a suitable method to integrate vague knowledge about conceptualisations in cartography and GIScience.
14.
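To address the classification uncertainty in automatic web page classification caused by fuzzy class boundaries and unbalanced corpora, a Bayesian network fusion model and fusion algorithm for automatic classification are proposed. The model and algorithm fuse multiple kinds of information on a web page: each kind of information is processed with its own preprocessing method, and the processed information is fed into a Bayesian network fusion centre for fusion inference, which yields the final classification result. To reduce the time complexity of Bayesian network inference, an improved Bayesian network graph inference algorithm is also proposed. Experimental results show that the improved fusion model and algorithm effectively resolve the uncertainty in automatic web page classification and improve its precision and recall.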
15.
Exploiting Bivariate Dependencies to Speedup Structure Learning in Bayesian Optimization Algorithm
The Bayesian optimization algorithm (BOA) is one of the successful and widely used estimation of distribution algorithms (EDAs), which have been employed to solve different optimization problems. In EDAs, a model that encodes interactions among problem variables is learned from the selected population. New individuals are generated by sampling the model and are incorporated into the population. Different probabilistic models have been used in EDAs to learn interactions. The Bayesian network (BN) is a well-known graphical model that is used in BOA. Learning a proper model in EDAs, and particularly in BOA, is recognized as a computationally expensive task. Different methods have been proposed in the literature to reduce the complexity of model building in EDAs. This paper employs bivariate dependencies to learn accurate BNs in BOA efficiently. The proposed approach extracts the bivariate dependencies using an appropriate pairwise interaction-detection metric. Due to the static structure of the underlying problems, these dependencies are used in each generation of BOA to learn an accurate network. By using this approach, the computational cost of model building is reduced dramatically. Various optimization problems are selected to be solved by the algorithm. The experimental results show that the proposed approach efficiently finds the optimum in problems with different types of interactions. Significant speedups are observed in the model building procedure as well.
16.
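Hybrid algorithms for learning Bayesian network structure tend to get trapped in local optima and have low search accuracy. To address this, a hybrid Bayesian network structure learning algorithm that combines the bat algorithm with constraints is proposed. First, max-min parents and children (MMPC) node sets are used to build the skeleton of an initial undirected network; the bat algorithm then performs score-based search and determines the direction of the edges in the network structure. Finally, the algorithm is applied to learn the ALARM network and compared with the max-min hill climbing (MMHC) algorithm and a greedy search algorithm. The results show varying degrees of reduction in added edges, reversed edges, deleted edges, and structural Hamming distance, indicating that the improved algorithm has strong learning ability and good convergence speed.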
17.
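Wu Shaobing. Computer and Digital Engineering, 2012, 40(11): 108-111
China is in a period of complex international and domestic circumstances, prominent conflicts among the people, and a high incidence of criminal offences, so the task of criminal investigation departments in combating crime, protecting people's lives and property, and maintaining social stability is heavier and more arduous. In this situation, it is particularly important to examine the factors that influence criminal offences in order to combat crime more effectively and safeguard national security and social stability. The paper uses a Bayesian network and the EM algorithm to analyse the factors influencing criminal offences in a certain region. A model of the influencing factors is given; according to the model's results, the factors influencing criminal offences in the region are, in order, human factors, environmental factors, crime-type factors, crime-location factors, and traffic factors. The findings can provide theoretical guidance and reference for leaders' decision-making.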
18.
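Because constructing a Bayesian network is difficult when data are missing, Bayesian structure learning algorithms are studied and an algorithm combining conditional independence tests with score-and-search is proposed. An improved hybrid algorithm initialises the training data and builds the corresponding initial network; the initial network, which already fits the information in the training data, is then trained with a genetic simulated annealing algorithm to find the best network structure. The concrete steps of the algorithm are given and its performance is verified experimentally; comparison with other typical algorithms shows that the algorithm learns better.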
19.
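This work studies algorithmic improvements that raise computational performance. Bayesian networks are an effective method for handling uncertainty and are widely applied in many fields. Parameter learning is an important step in constructing a Bayesian network, but parameter learning with hidden variables and continuous variables is very difficult. To solve this problem, a Bayesian network parameter learning method based on the artificial fish swarm algorithm is proposed, and the algorithm's convergence performance and speed are further improved by adjusting the random moving speed of the artificial fish. Finally, the parameter learning method is simulated on a Bayesian network composed of Noisy-Or and Noisy-And nodes; the simulation results demonstrate the feasibility and superiority of the parameter learning method, particularly of the improved version.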
20.
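Using genetic algorithms to learn an optimised Bayesian network structure has long been a closely watched topic in Bayesian network research. The individual design of traditional genetic algorithms requires repeated acyclicity checks, which lowers evolutionary efficiency. To address this problem, a new individual encoding is proposed. Considering that family scores can be inherited during evolution, an improved structure-scoring algorithm based on family inheritance is proposed, and a corresponding improved genetic algorithm is then designed. Experimental results show that the improved algorithm clearly improves both the accuracy and the efficiency of BN construction.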
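The acyclicity check that the abstract identifies as the bottleneck of traditional encodings can be sketched as follows. This is not the paper's encoding or its improved algorithm; it is a generic check under the assumption that an individual is an adjacency matrix in which `adj[i][j] == 1` means an edge from node i to node j, and the function name `is_acyclic` is hypothetical.

```python
from collections import deque

def is_acyclic(adj):
    """Kahn's algorithm: a directed graph is acyclic iff every node can be
    removed in topological order."""
    n = len(adj)
    indegree = [sum(adj[i][j] for i in range(n)) for j in range(n)]
    queue = deque(j for j in range(n) if indegree[j] == 0)
    removed = 0
    while queue:
        i = queue.popleft()
        removed += 1
        for j in range(n):
            if adj[i][j]:
                indegree[j] -= 1
                if indegree[j] == 0:
                    queue.append(j)
    return removed == n
```

With this representation, every crossover or mutation that adds or reverses an edge needs a fresh check costing O(n^2) per individual, which gives a sense of why an encoding that avoids the check can speed up evolution.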