20 related documents found (search time: 15 ms)
1.
Chaoqun Li, Journal of Experimental & Theoretical Artificial Intelligence, 2013, 25(4): 477-491
A large number of distance metrics have been proposed to measure the difference between two instances. Among these metrics, the Short and Fukunaga metric (SFM) and the minimum risk metric (MRM) are two probability-based metrics that are widely used to find reasonable distances between pairs of instances with nominal attributes only. For simplicity, existing works use naive Bayesian (NB) classifiers to estimate the class membership probabilities in SFM and MRM. However, it has been shown that the class probability estimation ability of NB classifiers is poor. To improve the classification performance of NB classifiers, many augmented NB classifiers have been proposed. In this paper, we study the class probability estimation performance of these augmented NB classifiers and then use them to estimate the class membership probabilities in SFM and MRM. Experimental results on a large number of University of California, Irvine (UCI) data-sets show that using these augmented NB classifiers to estimate the class membership probabilities in SFM and MRM significantly enhances their generalisation ability.
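As a rough illustration of how the estimated class-membership probabilities enter these metrics, the sketch below computes probability-based distances from two posterior vectors (such as those produced by an NB or augmented NB model). The exact weightings used here are common formulations assumed for illustration, and the posterior values in the example are hypothetical.

```python
import numpy as np

def sfm_distance(p_x, p_y):
    """Short & Fukunaga style distance between instances x and y, given their
    estimated class posteriors p_x = P(c|x) and p_y = P(c|y).
    The posterior-weighted absolute difference used here is one common
    formulation, assumed for illustration."""
    p_x, p_y = np.asarray(p_x, float), np.asarray(p_y, float)
    return float(np.sum(p_x * np.abs(p_x - p_y)))

def mrm_distance(p_x, p_y):
    """Minimum risk metric style distance: the estimated probability that x
    and y carry different class labels (again one common formulation)."""
    p_x, p_y = np.asarray(p_x, float), np.asarray(p_y, float)
    return float(np.sum(p_x * (1.0 - p_y)))

# The paper's point: better posterior estimates (e.g. from augmented NB
# models) yield better distances. Hypothetical three-class posteriors:
p_x = [0.7, 0.2, 0.1]
p_y = [0.6, 0.3, 0.1]
print(sfm_distance(p_x, p_y), mrm_distance(p_x, p_y))
```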
2.
Boosted Bayesian network classifiers (total citations: 2, self-citations: 0, citations by others: 2)
The use of Bayesian networks for classification problems has received a significant amount of recent attention. Although computationally efficient, the standard maximum likelihood learning method tends to be suboptimal due to the mismatch between its optimization criterion (data likelihood) and the actual goal of classification (label prediction accuracy). Recent approaches to optimizing classification performance during parameter or structure learning show promise, but lack the favorable computational properties of maximum likelihood learning. In this paper we present boosted Bayesian network classifiers, a framework that combines discriminative data-weighting with generative training of intermediate models. We show that boosted Bayesian network classifiers encompass the basic generative models in isolation, but improve their classification performance when the model structure is suboptimal. We also demonstrate that structure learning is beneficial in the construction of boosted Bayesian network classifiers. On a large suite of benchmark data-sets, this approach outperforms generative graphical models such as naive Bayes and TAN in classification accuracy. Boosted Bayesian network classifiers have comparable or better performance in comparison to other discriminatively trained graphical models including ELR and BNC. Furthermore, boosted Bayesian networks require significantly less training time than the ELR and BNC algorithms.
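A minimal sketch of the framework's core loop, combining AdaBoost-style data-weighting with generatively trained components, is given below. Gaussian naive Bayes stands in for the paper's Bayesian network components, the weight-update rule is the generic AdaBoost one, and binary labels in {-1, +1} are assumed.

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

def boosted_nb(X, y, n_rounds=10):
    """Boosting loop around generatively trained components (naive Bayes here
    as a stand-in for a general Bayesian network classifier).
    Assumes binary labels y in {-1, +1}."""
    n = len(y)
    w = np.full(n, 1.0 / n)                           # discriminative data weights
    models, alphas = [], []
    for _ in range(n_rounds):
        clf = GaussianNB().fit(X, y, sample_weight=w)  # generative training step
        pred = clf.predict(X)
        err = np.sum(w[pred != y]) / np.sum(w)
        if err == 0.0 or err >= 0.5:                   # stop if perfect or too weak
            models.append(clf)
            alphas.append(1.0)
            break
        alpha = 0.5 * np.log((1.0 - err) / err)
        w *= np.exp(-alpha * y * pred)                 # re-weight misclassified data
        w /= w.sum()
        models.append(clf)
        alphas.append(alpha)
    return models, alphas

def boosted_predict(models, alphas, X):
    """Weighted vote of the component classifiers."""
    votes = sum(a * m.predict(X) for m, a in zip(models, alphas))
    return np.sign(votes)
```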
3.
The term positive unlabeled learning refers to the binary classification problem in the absence of negative examples. When only positive and unlabeled instances are available, semi-supervised classification algorithms cannot be directly applied, and thus new algorithms are required. One of these positive unlabeled learning algorithms is the positive naive Bayes (PNB), an adaptation of the naive Bayes induction algorithm that does not require negative instances. In this work we propose two ways of enhancing this algorithm. On the one hand, we take the concept behind PNB one step further, proposing a procedure to build more complex Bayesian classifiers in the absence of negative instances. We present a new algorithm (named positive tree augmented naive Bayes, PTAN) to obtain tree augmented naive Bayes models in the positive unlabeled domain. On the other hand, we propose a new Bayesian approach to deal with the a priori probability of the positive class, which models the uncertainty over this parameter by means of a Beta distribution. This approach is applied to both PNB and PTAN, resulting in two new algorithms. The four algorithms are empirically compared on positive unlabeled learning problems based on real and synthetic databases. The results of these comparisons suggest that, when the predictive variables are not conditionally independent given the class, extending PNB to more complex networks increases classification performance. They also show that our Bayesian approach to the a priori probability of the positive class can improve the results obtained by PNB and PTAN.
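The central trick behind PNB can be sketched as follows: treat the unlabeled sample as a mixture of the positive and negative distributions, and recover the negative class-conditional probabilities from the unlabeled estimates, the positive estimates, and an assumed positive-class prior p_pos. The clipping, the fixed prior, and the single-attribute view below are illustrative simplifications, not the paper's exact procedure (and not its Bayesian treatment of the prior).

```python
from collections import Counter

def pnb_conditionals(pos_vals, unl_vals, p_pos, values):
    """Estimate P(v | +) and P(v | -) for one nominal attribute from positive
    and unlabeled data only (positive naive Bayes style sketch)."""
    n_pos, n_unl = len(pos_vals), len(unl_vals)
    count_pos, count_unl = Counter(pos_vals), Counter(unl_vals)
    cond = {}
    for v in values:
        p_v_pos = count_pos[v] / n_pos            # from the positive sample
        p_v_unl = count_unl[v] / n_unl            # from the unlabeled mixture
        # Unlabeled data mixes both classes:
        #   P(v) = p_pos * P(v|+) + (1 - p_pos) * P(v|-)
        p_v_neg = max((p_v_unl - p_pos * p_v_pos) / (1.0 - p_pos), 1e-6)
        cond[v] = (p_v_pos, p_v_neg)
    return cond

# Hypothetical usage for a colour attribute with assumed prior p_pos = 0.3:
print(pnb_conditionals(["red", "red", "blue"],
                       ["red", "blue", "blue", "green"], 0.3,
                       ["red", "blue", "green"]))
```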
4.
Bayesian networks (BNs) have gained increasing attention in recent years. One key issue in Bayesian networks is parameter learning. When training data is incomplete or sparse, or when multiple hidden nodes exist, learning parameters in Bayesian networks becomes extremely difficult. Under these circumstances, the learning algorithms must operate in a high-dimensional search space and can easily get trapped in one of numerous local maxima. This paper presents a learning algorithm that incorporates domain knowledge into the learning process to regularize the otherwise ill-posed problem, limit the search space, and avoid local optima. Unlike conventional approaches, which typically exploit quantitative domain knowledge such as prior probability distributions, our method systematically incorporates qualitative constraints on some of the parameters into the learning process. Specifically, the problem is formulated as a constrained optimization problem, where the objective function is defined as a combination of the likelihood function and penalty functions constructed from the qualitative domain knowledge. A gradient-descent procedure is then integrated with the E-step and M-step of the EM algorithm to estimate the parameters iteratively until convergence. Experiments with both synthetic data and real data for facial action recognition show that our algorithm improves the accuracy of the learned BN parameters significantly over the conventional EM algorithm.
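A compact illustration of the constrained objective is given below: the data log-likelihood is combined with penalty terms built from qualitative constraints on the parameters, and a gradient step inside the M-step would then maximise this quantity. The pairwise inequality encoding and the quadratic penalty are assumptions made for this sketch, not the paper's exact formulation.

```python
def penalized_objective(log_likelihood, theta, constraints, weight=10.0):
    """Objective for constrained parameter learning (sketch): likelihood term
    minus penalties for violated qualitative constraints.

    theta       -- flat list of parameter values
    constraints -- pairs (i, j) encoding the assumed requirement
                   theta[i] >= theta[j]
    """
    penalty = sum(max(0.0, theta[j] - theta[i]) ** 2 for i, j in constraints)
    return log_likelihood - weight * penalty

# Hypothetical check: the constraint theta[0] >= theta[1] is violated here,
# so the penalised objective drops below the raw log-likelihood.
print(penalized_objective(-120.5, [0.2, 0.6], [(0, 1)]))
```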
5.
The naive Bayes classifier cannot effectively exploit the dependence information among attributes, and existing dependence extensions emphasise efficiency, so the classification accuracy of the extended classifiers still leaves room for improvement. To address these problems, a network dependence extension of the naive Bayes classifier is carried out on the basis of estimating attribute densities with Gaussian kernel functions that carry a smoothing parameter, combined with a classification accuracy criterion and greedy selection of attribute parent nodes. Experiments on UCI classification data sets with continuous attributes show that the classifier obtained after the network dependence extension achieves good classification accuracy.
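The attribute-density estimate referred to above can be sketched as the standard Gaussian-kernel estimator with a smoothing parameter h; how h is chosen, and how the estimate is combined with the greedy parent-selection step, are not shown here and would follow the paper.

```python
import numpy as np

def gaussian_kernel_density(x, samples, h):
    """Smoothed class-conditional density estimate of a continuous attribute
    value x, from the attribute values `samples` observed in one class and a
    smoothing (bandwidth) parameter h."""
    samples = np.asarray(samples, dtype=float)
    z = (x - samples) / h
    return float(np.mean(np.exp(-0.5 * z ** 2) / (h * np.sqrt(2.0 * np.pi))))

# Hypothetical usage inside a naive Bayes style classifier:
print(gaussian_kernel_density(1.3, [1.1, 1.4, 0.9, 2.0], h=0.5))
```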
6.
Li Peng, Computer Engineering and Design, 2013, 34(9)
An image segmentation method is implemented by applying a probability-rank fusion model and a Markov random field (MRF) to a clustering analysis model, which allows the segmentation to be carried out more accurately and yields the corresponding fusion model. Combined with a prior probability distribution, the energy-based Gibbs model allows its parameters to be specified, and the maximum probability-rank estimate is fused with a simple and fast estimation method to obtain the segmentation result. The fusion framework is applied to the standard Berkeley image database; the experimental results show that the method performs well in both visual assessment and quantitative metrics, and outperforms existing segmentation methods.
7.
Addressing the differences in failure-rate levels of equipment operating under different configurations and usage environments, the characteristics and construction algorithms of existing Bayesian classifiers are introduced and analysed in detail. On this basis, a Bayesian network based modelling method for product fault classification is proposed to guide the construction and application of models in practical classification tasks. A case study of a French equipment manufacturer shows that, among all the Bayesian network classifiers and the traditional C4.5 decision tree classifier, the tree-augmented naive Bayes classifier achieves the best classification results, providing effective theoretical support for subsequent maintenance resource allocation and the optimisation of product operating capability.
8.
This paper presents a study of two learning criteria and two approaches to using them for training neural network classifiers, specifically Multi-Layer Perceptron (MLP) and Radial Basis Function (RBF) networks. The first approach, which is a traditional one, relies on the use of two popular learning criteria, i.e. learning via minimising a Mean Squared Error (MSE) function or a Cross Entropy (CE) function. It is shown that the two criteria have different characteristics in learning speed and outlier effects, and that this approach does not necessarily result in a minimal classification error. To be suitable for classification tasks, in our second approach an empirical classification criterion is introduced for the testing process while using the MSE or CE function for the training. Experimental results on several benchmarks indicate that the second approach, compared with the first, leads to an improved generalisation performance, and that the use of the CE function, compared with the MSE function, gives a faster training speed and improved or equal generalisation performance.

Nomenclature
- x : random input vector with d real-number components [x_1 ... x_d]
- t : random target vector with c binary components [t_1 ... t_c]
- y(·) : neural network function or output vector
- parameters of a neural model
- learning rate
- momentum
- decay factor
- O : objective function
- E : mean sum-of-squares error function
- L : cross entropy function
- n : the nth training pattern
- N : number of training patterns
- (·) : transfer function in a neural unit
- z_j : output of hidden unit j
- a_i : activation of unit i
- W_ij : weight from hidden unit j to output unit i
- W_jl^0 : weight from input unit l to hidden unit j
- centre vector of RBF unit j, with components indexed j1 ... jd
- width vector of RBF unit j, with components indexed j1 ... jd
- p(·|·) : conditional probability function
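For concreteness, the two learning criteria compared above can be written out as in the sketch below; the averaging over patterns and the binary-target form of the cross entropy are assumptions about the exact normalisation, which may differ from the paper's definitions of E and L.

```python
import numpy as np

def mse_criterion(y, t):
    """Mean sum-of-squares error over N patterns: network outputs y and
    targets t are (N, c) arrays."""
    y, t = np.asarray(y, float), np.asarray(t, float)
    return 0.5 * float(np.mean(np.sum((y - t) ** 2, axis=1)))

def cross_entropy_criterion(y, t, eps=1e-12):
    """Cross entropy for binary target components, averaged over patterns."""
    y, t = np.asarray(y, float), np.asarray(t, float)
    y = np.clip(y, eps, 1.0 - eps)
    return -float(np.mean(np.sum(t * np.log(y) + (1 - t) * np.log(1 - y),
                                 axis=1)))
```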
9.
10.
As a probabilistic graphical model, the unrestricted multi-dimensional Bayesian network classifier (GMBNC) is a compact model for applying Bayesian networks (BNs) to multi-dimensional classification, containing only the local structure that is relevant to prediction. To obtain a GMBNC, the traditional approach first learns the global BN; to avoid this global search, a structure learning algorithm called DOS-GMBNC, which performs only local search, is proposed. The algorithm inherits the main framework of the previously proposed IPC-GMBNC algorithm and dynamically adjusts the search order based on further mined structural topology information, so as to avoid useless computation. Experimental studies verify the effectiveness and efficiency of DOS-GMBNC: (1) the quality of the networks it outputs is consistent with IPC-GMBNC and better than the classical PC algorithm; (2) on a problem with 100 nodes, it saves nearly 89% and 45% of the computation relative to the PC and IPC-GMBNC algorithms, respectively.
11.
This paper introduces and evaluates a new class of knowledge model, the recursive Bayesian multinet (RBMN), which encodes the joint probability distribution of a given database. RBMNs extend Bayesian networks (BNs) as well as partitional clustering systems. Briefly, an RBMN is a decision tree with component BNs at the leaves. An RBMN is learnt using a greedy, heuristic approach akin to that used by many supervised decision tree learners, but where BNs are learnt at the leaves using constructive induction. A key idea is to treat expected data as real data. This allows us to complete the database and to take advantage of a closed form for the marginal likelihood of the expected complete data that factorizes into separate marginal likelihoods for each family (a node and its parents). Our approach is evaluated on synthetic and real-world databases.
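The structure of an RBMN, a decision tree with component BNs at its leaves, can be sketched with a few small types as below; the attribute-valued splits and the opaque `model` object standing in for a learnt component BN are illustrative simplifications.

```python
from dataclasses import dataclass
from typing import Dict, Union

@dataclass
class BNLeaf:
    """Leaf of the multinet: a component Bayesian network fitted to the subset
    of the data routed to this leaf (kept abstract here)."""
    model: object

@dataclass
class SplitNode:
    """Internal decision-tree node: routes an instance by the value of one
    attribute to a child subtree or to a leaf BN."""
    attribute: str
    children: Dict[object, Union["SplitNode", BNLeaf]]

def component_bn_for(node: Union[SplitNode, BNLeaf], instance: dict):
    """Walk the tree until the leaf BN responsible for this instance."""
    while isinstance(node, SplitNode):
        node = node.children[instance[node.attribute]]
    return node.model
```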
12.
13.
The "Bottleneck" Behaviours in Linear Feedforward Neural Network Classifiers and Their Breakthrough 下载免费PDF全文
HUANG Deshuang, Journal of Computer Science and Technology, 1999, 14(1): 34-43
The classification mechanisms of linear feedforward neural network classifiers (FNNCs), whose hidden layer performs the Fisher linear transformation of the input patterns under the supervision of outer-supervised signals, are investigated. The "bottleneck" behaviours in linear FNNCs are observed and analysed. In addition, the structural stabilities of linear FNNCs are discussed. It is pointed out that the key to breaking through the "bottleneck" behaviours of linear FNNCs is to replace the linear hidden neurons with nonlinear ones. Finally, experimental results, taking the parity-3 problem as an example, are given.
14.
15.
This paper proposes an approach that detects surface defects with three-dimensional characteristics on scale-covered steel blocks. The surface reflection properties of the flawless surface vary strongly. Light sectioning is used to acquire the surface range data of the steel block. These sections are arbitrarily located within a range of a few millimeters due to vibrations of the steel block on the conveyor. After recovery of the depth map, segments of the surface are classified according to a set of extracted features by means of Bayesian network classifiers. For establishing the structure of the Bayesian network, a floating search algorithm is applied, which achieves a good trade-off between classification performance and computational efficiency of structure learning. This search algorithm enables conditional exclusions of previously added attributes and/or arcs from the network. The experiments show that the selective unrestricted Bayesian network classifier outperforms the naive Bayes and the tree-augmented naive Bayes decision rules in terms of classification rate. More than 98% of the surface segments were classified correctly.
16.
Multi-dimensional classification assigns a data instance to classes along multiple dimensions according to its feature vector, and has broad application prospects. During model learning for multi-dimensional classification algorithms, massive training data means that accurate classification algorithms require long model training times. To improve the efficiency of multi-dimensional classification while maintaining high prediction accuracy, this paper proposes a Bayesian network based multi-dimensional classification learning method. First, the multi-dimensional classification problem is formulated as a conditional probability distribution problem. Second, a conditional tree Bayesian network model is built according to the dependencies among the class variables. Finally, the structure and parameters of the conditional tree Bayesian network model are learned from the training data set, and a multi-dimensional classification prediction algorithm is proposed. Extensive experiments on real data sets show that, compared with MMOC, the current state-of-the-art multi-dimensional classification algorithm, the proposed method reduces model training time by two orders of magnitude while maintaining high accuracy. The proposed method is therefore better suited to multi-dimensional classification applications with massive data.
17.
Traditional recurrent neural network models suffer from exploding or vanishing gradients when handling long-term dependencies, and their many parameters lead to long training times. To address this, a text classification method based on a bidirectional GRU neural network and a Bayesian classifier is proposed. The bidirectional GRU network is used to extract text features, weights are assigned via the TF-IDF algorithm, and a Bayesian classifier performs the final discrimination. This remedies the unidirectional GRU's weak dependence on the following context, reduces the number of parameters, shortens model training time, and improves text classification efficiency. Comparative simulation experiments on two kinds of text data show that, compared with traditional recurrent neural networks, the proposed algorithm effectively improves the efficiency and accuracy of text classification.
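A minimal PyTorch sketch of the bidirectional GRU feature extractor described above is given below; the vocabulary size, embedding and hidden dimensions, and the mean pooling are illustrative assumptions, and the TF-IDF weighting and the downstream Bayesian classifier are not included.

```python
import torch
import torch.nn as nn

class BiGRUEncoder(nn.Module):
    """Bidirectional GRU text feature extractor (sketch)."""
    def __init__(self, vocab_size=10000, embed_dim=128, hidden_dim=64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.gru = nn.GRU(embed_dim, hidden_dim,
                          batch_first=True, bidirectional=True)

    def forward(self, token_ids):                  # (batch, seq_len) int ids
        x = self.embed(token_ids)                  # (batch, seq_len, embed_dim)
        out, _ = self.gru(x)                       # (batch, seq_len, 2*hidden)
        return out.mean(dim=1)                     # pooled feature per text

# Hypothetical usage: features for a batch of two token-id sequences.
encoder = BiGRUEncoder()
features = encoder(torch.randint(0, 10000, (2, 50)))
print(features.shape)                              # torch.Size([2, 128])
```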
18.
A structure learning algorithm for Bayesian networks with missing data (total citations: 2, self-citations: 0, citations by others: 2)
Learning the structure of a Bayesian network from data with missing values mainly relies on score-and-search methods combined with the EM algorithm, whose efficiency and reliability are relatively low. To address this problem, a new structure learning algorithm for Bayesian networks with missing data is established. The method first uses the Kullback-Leibler (KL) divergence to express the similarity between cases of the same node, then derives values for the missing data via Gibbs sampling, and finally completes the learning of the Bayesian network structure with heuristic search. The method effectively avoids the exponential complexity of standard Gibbs sampling and the main problems of existing learning methods.
19.
The naive Bayes classifier can be applied to lithology identification. The algorithm usually fits the probability distribution of continuous attributes with a Gaussian distribution, but for complex well-logging data the Gaussian fit is unsatisfactory. To address this problem, Gaussian mixture probability density estimation based on the EM algorithm is proposed. In the experiments, the well-logging data of Lower Palaeozoic gas wells in the Sudong 41-33 block are used as training samples, and the data of wells 44-45 as test samples. A Gaussian mixture model estimated with the EM algorithm is used to estimate the probability densities of the well-logging variables, which are then plugged into a naive Bayes classifier for lithology identification, with a single Gaussian fit used for comparison. The results show that the Gaussian mixture model achieves a better fit and noticeably improves the performance of the naive Bayes classifier for lithology identification.
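The EM-fitted mixture densities described above can be sketched with scikit-learn's GaussianMixture (which runs EM internally); fitting one mixture per class and per log attribute, and the choice of three components, are assumptions for illustration rather than the paper's configuration.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def fit_class_conditional_gmms(X, y, attribute_index, n_components=3):
    """Fit one Gaussian mixture per class to a single continuous well-log
    attribute, replacing the single-Gaussian assumption of naive Bayes."""
    gmms = {}
    for c in np.unique(y):
        values = X[y == c, attribute_index].reshape(-1, 1)
        gmms[c] = GaussianMixture(n_components=n_components,
                                  random_state=0).fit(values)
    return gmms

def class_conditional_log_density(gmms, c, value):
    """log P(attribute = value | class = c) under the fitted mixture, for use
    inside the naive Bayes product over attributes."""
    return float(gmms[c].score_samples(np.array([[value]]))[0])
```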
20.
Shangfei WANG, Menghua HE, Yachen ZHU, Shan HE, Yue LIU, Qiang JI. Frontiers of Computer Science, 2015, 9(2): 185
For many supervised learning applications, additional information, besides the labels, is often available during training but not during testing. Such additional information, referred to as privileged information, can be exploited during training to construct a better classifier. In this paper, we propose a Bayesian network (BN) approach for learning with privileged information. We propose to incorporate the privileged information through a three-node BN. We further mathematically evaluate different topologies of the three-node BN and identify those structures through which the privileged information can benefit the classification. Experimental results on handwritten digit recognition, spontaneous versus posed expression recognition, and gender recognition demonstrate the effectiveness of our approach.