Similar literature
1.
Relational models are the most common representation of structured data, and acyclic database theory is important in relational databases. In this paper, we propose a method for constructing the Bayesian network structure from the dependencies implied in multiple relational schemas. Based on acyclic database theory and its relationship with probabilistic networks, we construct the Bayesian network structure from implied independence information instead of mining database instances. We first give a method to find the maximum harmoniousness subset of the multi-valued dependencies on an acyclic schema, so that as much conditional independence information as possible is retained. Further, for multi-relational environments, we discuss the properties of join graphs of multiple 3NF database schemas, from which the dependencies between separate relational schemas can be obtained. In addition, for a given cyclic join dependency, a transformation from cyclic to acyclic database schemas is proposed by finding a minimal acyclic augmentation. An applied example shows that the proposed methods are feasible.

2.
In this paper, we describe three Bayesian classifiers for mineral potential mapping: (a) a naive Bayesian classifier that assumes complete conditional independence of input predictor patterns, (b) an augmented naive Bayesian classifier that recognizes and accounts for conditional dependencies amongst input predictor patterns and (c) a selective naive classifier that uses only conditionally independent predictor patterns. We also describe methods for training the classifiers, which involve determining dependencies amongst predictor patterns and estimating the conditional probability of each predictor pattern given the target deposit-type. The output of a trained classifier determines the extent to which an input feature vector belongs to either the mineralized class or the barren class and can be mapped to generate a favorability map. The procedures are demonstrated by an application to base metal potential mapping in the Proterozoic Aravalli Province (western India). The results indicate that although the naive Bayesian classifier performs well and shows significant tolerance for the violation of the conditional independence assumption, the augmented naive Bayesian classifier performs better and exhibits finer generalization capability. The results also indicate that the rejection of conditionally dependent predictor patterns degrades the performance of a naive classifier.
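To make the conditional independence assumption in (a) concrete, the following is a minimal Python sketch of a naive Bayesian classifier over binary predictor patterns. It is an illustration under stated assumptions, not the authors' implementation: the function names `train_naive_bayes` and `posterior`, the Laplace smoothing constant `alpha`, and the class labels in the usage note are all hypothetical.

```python
import math


def train_naive_bayes(samples, labels, alpha=1.0):
    """Estimate class priors and P(x_j = 1 | class) for binary predictors.

    samples: list of tuples of 0/1 predictor patterns
    labels:  list of class labels, one per sample
    alpha:   Laplace smoothing constant (an assumption, not from the abstract)
    """
    classes = sorted(set(labels))
    n_features = len(samples[0])
    priors, cond = {}, {}
    for c in classes:
        rows = [x for x, y in zip(samples, labels) if y == c]
        priors[c] = len(rows) / len(samples)
        # Smoothed estimate of P(x_j = 1 | c) for each predictor j.
        cond[c] = [
            (sum(r[j] for r in rows) + alpha) / (len(rows) + 2 * alpha)
            for j in range(n_features)
        ]
    return priors, cond


def posterior(x, priors, cond):
    """Posterior class probabilities, assuming predictors are
    conditionally independent given the class."""
    log_scores = {}
    for c in priors:
        s = math.log(priors[c])
        for j, v in enumerate(x):
            p = cond[c][j]
            s += math.log(p if v == 1 else 1.0 - p)
        log_scores[c] = s
    # Normalize in log space for numerical stability.
    m = max(log_scores.values())
    unnorm = {c: math.exp(s - m) for c, s in log_scores.items()}
    z = sum(unnorm.values())
    return {c: u / z for c, u in unnorm.items()}
```

With class labels such as `"mineralized"` and `"barren"`, the value `posterior(x, priors, cond)["mineralized"]` for each map cell could play the role of the favorability score the abstract describes. The augmented and selective variants would change how the per-predictor terms are chosen or combined, which this sketch does not attempt.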

3.
4.
Biometric systems for today's high security applications must meet stringent performance requirements; fusing multiple biometrics can help lower system error rates. Fusion methods include processing biometric modalities sequentially until an acceptable match is obtained, using logical (AND/OR) operations, or summing similarity scores. More sophisticated methods combine scores from separate classifiers for each modality. This paper develops a novel fusion architecture based on Bayesian belief networks. Although Bayesian update methods have been used before, our approach more fully exploits the graphical structure of Bayes nets to define and explicitly model statistical dependencies between relevant variables: per sample measurements, such as match scores and corresponding quality estimates, and global decision variables. These statistical dependencies are in the form of conditional distributions which we model as Gaussian, gamma, log-normal or beta, each of which is determined by its mean and variance, thus significantly reducing training data requirements. Moreover, by conditioning decision variables on quality as well as match score, we can extract information from lower quality measurements rather than rejecting them out of hand. Another feature of our method is a global quality measure designed to be used as a confidence estimate supporting decision making. Preliminary studies using the architecture to fuse fingerprints and voice are reported.

5.
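Zhang Yishi, Chen Chuanbo. Computer Science (《计算机科学》), 2011, 38(12): 200-205
An optimal feature selection algorithm based on minimum joint mutual information loss is proposed. The algorithm first uses a dynamic incremental strategy to search for a feature subset that is indistinguishable from the full feature set. Then, following the minimum conditional mutual information principle, it filters out the redundant features in that subset while ensuring that the loss of joint mutual information is minimal at every step, yielding an approximately optimal feature subset. To address the efficiency bottleneck that existing conditional independence tests based on conditional mutual information face on high-dimensional feature domains, a fast method for estimating conditional mutual information is given and used to implement the proposed algorithm. Classification experiments show that the proposed algorithm outperforms classical feature selection algorithms. In addition, run-time experiments show that the proposed fast conditional mutual information method has a significant advantage in execution efficiency.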
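For readers unfamiliar with the quantity being estimated, the sketch below computes a plain plug-in (frequency-count) estimate of the conditional mutual information I(X; Y | Z) for discrete samples. It is only a baseline illustration: it is not the fast estimation method the abstract refers to, and the function name `conditional_mutual_information` is hypothetical.

```python
import math
from collections import Counter


def conditional_mutual_information(xs, ys, zs):
    """Plug-in estimate of I(X; Y | Z) in bits from paired discrete samples.

    Uses I(X;Y|Z) = sum_{x,y,z} p(x,y,z) * log2( p(z) p(x,y,z) / (p(x,z) p(y,z)) ),
    with every probability replaced by its empirical frequency.
    """
    n = len(xs)
    c_xyz = Counter(zip(xs, ys, zs))
    c_xz = Counter(zip(xs, zs))
    c_yz = Counter(zip(ys, zs))
    c_z = Counter(zs)
    cmi = 0.0
    for (x, y, z), n_xyz in c_xyz.items():
        # The 1/n normalizers cancel inside the logarithm, so counts suffice.
        ratio = (c_z[z] * n_xyz) / (c_xz[(x, z)] * c_yz[(y, z)])
        cmi += (n_xyz / n) * math.log2(ratio)
    return cmi
```

A value near zero suggests that Y adds little information about X once Z is known, which is the sense in which a feature can be treated as redundant given those already selected. On small samples the plug-in estimate is biased upward, so any threshold on it is a judgment call.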

6.
The agglomerative hierarchical clustering of continuous variables is studied in the framework of the likelihood linkage analysis method proposed by Lerman. The similarity between variables is defined from the process comparing the empirical copula with the independence copula in the spirit of the test of independence proposed by Deheuvels. Unlike more classical similarity coefficients for variables based on rank statistics, the comparison measure considered in this work can also be sensitive to non-monotonic dependencies. As aggregation criteria, besides classical linkages, permutation-based linkages related to procedures for combining dependent p-values are considered. The performances of the corresponding clustering algorithms are compared through thorough simulations. In order to guide the choice of a partition, a natural probabilistic selection strategy, related to the use of the gap statistic in object clustering, is proposed and empirically compared with classical ordinal approaches. The resulting variable clustering procedure can be equivalently regarded as a potentially less computationally expensive alternative to more powerful tests of multivariate independence.

7.
Miller DJ, Yan L. Neural Computation, 2000, 12(9): 2175-2207
We propose a new learning method for discrete space statistical classifiers. Similar to Chow and Liu (1968) and Cheeseman (1983), we cast classification/inference within the more general framework of estimating the joint probability mass function (p.m.f.) for the (feature vector, class label) pair. Cheeseman's proposal to build the maximum entropy (ME) joint p.m.f. consistent with general lower-order probability constraints is in principle powerful, allowing general dependencies between features. However, enormous learning complexity has severely limited the use of this approach. Alternative models such as Bayesian networks (BNs) require explicit determination of conditional independencies. These may be difficult to assess given limited data. Here we propose an approximate ME method, which, like previous methods, incorporates general constraints while retaining quite tractable learning. The new method restricts joint p.m.f. support during learning to a small subset of the full feature space. Classification gains are realized over dependence trees, tree-augmented naive Bayes networks, BNs trained by the Kutato algorithm, and multilayer perceptrons. Extensions to more general inference problems are indicated. We also propose a novel exact inference method when there are several missing features.

8.
Feature selection is an important preprocessing step for building efficient, generalizable and interpretable classifiers on high dimensional data sets. Assuming sufficient labelled samples, the Markov Blanket provides a complete and sound solution to the selection of optimal features, by exploring the conditional independence relationships among the features. In real-world applications, unfortunately, it is usually easy to get unlabelled samples, but expensive to obtain the corresponding accurate labels. This leads to the potential waste of valuable classification information buried in unlabelled samples. In this paper, we propose a new BAyesian Semi-SUpervised Method, or BASSUM for short, to exploit the value of unlabelled samples in the feature selection problem for classification. Generally speaking, the inclusion of unlabelled samples helps the feature selection algorithm by (1) pinpointing more specific conditional independence tests involving fewer variable features and (2) improving the robustness of individual conditional independence tests with additional statistical information. Our experimental results show that BASSUM enhances the efficiency of traditional feature selection methods and overcomes the difficulties with redundant features in existing semi-supervised solutions.

9.
Nonimpeding noisy‐AND tree (NAT) models offer a highly expressive approximate representation for significantly reducing the space of Bayesian networks (BNs). They also improve the efficiency of BN inference significantly. To enable these advantages for general BNs, several technical advancements are made in this work to compress target BN conditional probability tables (CPTs) over multivalued variables into NAT models. We extend the semantics of NAT models beyond the graded variables to which causal independence models commonly adhere, and allow NAT modeling over nominal causal variables. We overcome the limitation of well‐defined pairwise causal interaction (PCI) bits and present a flexible PCI pattern extraction from target CPTs. We extend parameter estimation for binary NAT models to constrained gradient descent for compressing target CPTs over multivalued variables. We reveal challenges associated with persistent leaky causes and develop a novel framework for PCI pattern extraction when persistent leaky causes exist. The effectiveness of the CPT compression is validated experimentally.

10.
A novel feature selection approach: Combining feature wrappers and filters   Cited by: 2 (self-citations: 0; citations by others: 2)
Feature selection is one of the most important issues in research fields such as system modelling, data mining and pattern recognition. In this study, a new feature selection algorithm that combines feature wrapper and feature filter approaches is proposed in order to identify the significant input variables in systems with continuous domains. The proposed method utilizes the functional dependency concept, correlation coefficients and the K-nearest neighbour (KNN) method to implement the feature filter and feature wrappers. Four feature selection methods independently select the significant input variables, and the input variable combination that yields the best result with respect to its corresponding evaluation function is selected as the winner. This is similar to the basic information fusion notion of integrating the information collected from different sources. All four feature selection methods are performed in two stages: (i) pre-selection, (ii) selection. Two of the four methods utilize the KNN method for evaluating the candidates; they use sequential forward and sequential backward search mechanisms, respectively, in the pre-selection stage. The third method uses correlation coefficients in the pre-selection stage. It is common to have outliers and noise in real-life data. In order to make the proposed feature selection algorithm resistant to noise and outliers, approximate functional dependencies are used by utilizing membership values that inherently cope with uncertainty in the data. Thus, the fourth method makes use of approximate functional dependencies to evaluate candidates in the pre-selection stage. All four methods apply the KNN method with an exhaustive search strategy in the selection stage in order to find the most suitable input variable combination with respect to a performance measure.

11.
This paper proposes a dynamic conditional random field (DCRF) model for foreground object and moving shadow segmentation in indoor video scenes. Given an image sequence, temporal dependencies of consecutive segmentation fields and spatial dependencies within each segmentation field are unified by a dynamic probabilistic framework based on the conditional random field (CRF). An efficient approximate filtering algorithm is derived for the DCRF model to recursively estimate the segmentation field from the history of observed images. The foreground and shadow segmentation method integrates both intensity and gradient features. Moreover, models of background, shadow, and gradient information are updated adaptively for nonstationary background processes. Experimental results show that the proposed approach can accurately detect moving objects and their cast shadows even in monocular grayscale video sequences.

12.
In this paper, we propose a novel framework for multi-label classification, which directly models the dependencies among labels using a Bayesian network. Each node of the Bayesian network represents a label, and the links and conditional probabilities capture the probabilistic dependencies among multiple labels. We employ our Bayesian network structure learning method, which is guaranteed to find the globally optimal structure, independent of the initial structure. After structure learning, maximum likelihood estimation is used to learn the conditional probabilities among nodes. Any current multi-label classifier can be employed to obtain the measurements of labels. Then, using the learned Bayesian network, the true labels are inferred by combining the relationships among labels with the labels' estimates obtained from a current multi-labeling method. We further extend the proposed multi-label classification method to deal with incomplete label assignments. The structural Expectation-Maximization algorithm is adopted for both structure and parameter learning. Experimental results on two benchmark multi-label databases show that our approach can effectively capture the co-occurrence and mutual exclusion relations among labels. The relations modeled by our approach are more flexible than the pairwise or fixed-subset label relations captured by current multi-label learning methods. Thus, our approach improves the performance over current multi-label classifiers. Furthermore, our approach demonstrates its robustness to incomplete multi-label classification.

13.
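Because the conditional independence assumption, the defining feature of the naive Bayes classifier, is overly strong and its effect varies across data sets, it has become the starting point for many improved algorithms. However, some studies have pointed out that violating the assumption does not affect the classifier as much as expected. Starting from reducing the estimation error of the posterior probability, a semi-naive Bayes classifier based on conditional entropy matching is proposed. Experiments show that the method effectively improves the performance of the naive Bayes classifier.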

14.
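The naive Bayes classifier is simple and efficient, but its attribute independence assumption limits its application to real data. This paper proposes a new algorithm that avoids the direct effect on classification of attribute reduction during data preprocessing: it generates several attribute subsets on the training set by random attribute selection, builds a naive Bayes classifier for each subset, and selects the best among them with a simulated annealing genetic algorithm. Experiments show that the method performs better than the traditional naive Bayes method.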

15.
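A real-coded multi-objective Bayesian optimization algorithm   Cited by: 1 (self-citations: 0; citations by others: 1)
An estimation of distribution algorithm is proposed that uses a decision-tree-based probabilistic model to represent the conditional dependencies among variables: the real-coded multi-objective Bayesian optimization algorithm (RCMBOA). Such a probabilistic model is built and then sampled to generate new individuals. A mutation operation is then applied to the new individuals to improve the algorithm's search capability and increase population diversity. Combined with non-dominated sorting and a truncation selection mechanism, this way of generating new individuals approximates the Pareto front of multi-objective problems well. During truncation selection, only one individual with a small crowding distance is deleted at a time, after which the crowding distances are re-estimated, so as to obtain a uniformly distributed non-dominated solution set. For constrained multi-objective optimization problems, the algorithm uses a constrained dominance relation to compare individuals. The algorithm was applied to eight relatively difficult test problems; compared with the results of the NSGA-II algorithm, the non-dominated solution sets obtained are of higher quality and more uniformly distributed. The results indicate that RCMBOA is an effective and robust multi-objective optimization algorithm.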

16.
Many problems in vision can be formulated as Bayesian inference. It is important to determine the accuracy of these inferences and how they depend on the problem domain. In this paper, we provide a theoretical framework based on Bayesian decision theory which involves evaluating performance based on an ensemble of problem instances. We pay special attention to the task of detecting a target in the presence of background clutter. This framework is then used to analyze the detectability of curves in images. We restrict ourselves to the case where the probability models are ergodic (both for the geometry of the curve and for the imaging). These restrictions enable us to use techniques from large deviation theory to simplify the analysis. We show that the detectability of curves depends on a parameter K which is a function of the probability distributions characterizing the problem. At critical values of K the target becomes impossible to detect on average. Our framework also enables us to determine whether a simpler approximate model is sufficient to detect the target curve and hence clarify how much information is required to perform specific tasks. These results generalize our previous work (Yuille, A.L. and Coughlan, J.M. 2000. Pattern Analysis and Machine Intelligence PAMI, 22(2):160–173) by placing it in a Bayesian decision theory framework, by extending the class of probability models which can be analyzed, and by analysing the case where approximate models are used for inference.

17.
A decision support system for the prognosis at 24 h of head-injured patients of the intensive care unit (ICU), based on Bayesian belief networks, is constructed by model selection methods applied to a database (637 cases) of seven clinical and laboratory variables. Its performance is compared to other systems, including a simpler belief network that assumes conditional independence among the findings, and a human expert. Results indicate that its performance is not significantly different from that of the neurosurgeon expert and better than the performance of the independence model. Thus, the prognostic judgment of non-neurosurgeon ICU clinicians can be aided by the use of this system.

18.
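This paper introduces the current state of database technology, data mining methods, and their application to Bayesian network construction: data mining is used to solve concrete problems encountered in Bayesian network modelling, namely how to find the relationships among variables in a large database and how to determine the conditional probabilities. An example applying the method to a practical problem, choosing tree species in a greening decision system, shows that the technique is effective and practical.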

19.
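Decision tree classification based on naive Bayes and the ID3 algorithm   Cited by: 2 (self-citations: 0; citations by others: 2)
Building on the naive Bayes algorithm and the ID3 algorithm, an improved decision tree classification algorithm is proposed. An objective attribute importance parameter is introduced, a weakened form of the naive Bayes conditional independence assumption is given, and weighted independent information entropy is used as the criterion for selecting the splitting attribute. Theoretical analysis and experimental results show that the improved algorithm can, to some extent, overcome the multi-value bias of the ID3 algorithm, and that it achieves high execution efficiency and classification accuracy.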
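The multi-value bias mentioned above can be seen in a few lines of Python. The sketch below computes the standard ID3 information gain, not the weighted criterion of the abstract, and shows why an attribute with many distinct values tends to win. The function names and the toy attributes in the usage note are hypothetical.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(values, labels):
    """Standard ID3 gain: H(labels) minus the expected entropy after
    splitting the samples on the attribute values."""
    groups = defaultdict(list)
    for v, y in zip(values, labels):
        groups[v].append(y)
    n = len(labels)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder
```

For example, with `labels = ["yes", "yes", "no", "no"]`, a record-identifier attribute `[1, 2, 3, 4]` splits every sample into its own pure group and so attains the maximum possible gain, while a two-valued attribute such as `["sunny", "sunny", "rain", "sunny"]` scores lower even though the identifier has no predictive value on new data. Criteria that reweight or normalize the gain are aimed at this effect.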

20.
Attribute selection with fuzzy decision reducts   Cited by: 2 (self-citations: 0; citations by others: 2)
Rough set theory provides a methodology for data analysis based on the approximation of concepts in information systems. It revolves around the notion of discernibility: the ability to distinguish between objects, based on their attribute values. It allows one to infer data dependencies that are useful in the fields of feature selection and decision model construction. In many cases, however, it is more natural, and more effective, to consider a gradual notion of discernibility. Therefore, within the context of fuzzy rough set theory, we present a generalization of the classical rough set framework for data-based attribute selection and reduction using fuzzy tolerance relations. The paper unifies existing work in this direction, and introduces the concept of fuzzy decision reducts, dependent on an increasing attribute subset measure. Experimental results demonstrate the potential of fuzzy decision reducts to discover shorter attribute subsets, leading to decision models with better coverage and with comparable, or even higher, accuracy.
