首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 46 毫秒
1.
2.
Markov logic networks   总被引:16,自引:0,他引:16  
We propose a simple approach to combining first-order logic and probabilistic graphical models in a single representation. A Markov logic network (MLN) is a first-order knowledge base with a weight attached to each formula (or clause). Together with a set of constants representing objects in the domain, it specifies a ground Markov network containing one feature for each possible grounding of a first-order formula in the KB, with the corresponding weight. Inference in MLNs is performed by MCMC over the minimal subset of the ground network required for answering the query. Weights are efficiently learned from relational databases by iteratively optimizing a pseudo-likelihood measure. Optionally, additional clauses are learned using inductive logic programming techniques. Experiments with a real-world database and knowledge base in a university domain illustrate the promise of this approach. Editors: Hendrik Blockeel, David Jensen and Stefan Kramer An erratum to this article is available at .  相似文献   

3.
Recently, there has been an increasing interest in generative models that represent probabilistic patterns over both links and attributes. A common characteristic of relational data is that the value of a predicate often depends on values of the same predicate for related entities. For directed graphical models, such recursive dependencies lead to cycles, which violates the acyclicity constraint of Bayes nets. In this paper we present a new approach to learning directed relational models which utilizes two key concepts: a pseudo likelihood measure that is well defined for recursive dependencies, and the notion of stratification from logic programming. An issue for modelling recursive dependencies with Bayes nets are redundant edges that increase the complexity of learning. We propose a new normal form format that removes the redundancy, and prove that assuming stratification, the normal form constraints involve no loss of modelling power. Empirical evaluation compares our approach to learning recursive dependencies with undirected models (Markov Logic Networks). The Bayes net approach is orders of magnitude faster, and learns more recursive dependencies, which lead to more accurate predictions.  相似文献   

4.
Hybrid Bayesian estimation tree learning with discrete and fuzzy labels   总被引:1,自引:1,他引:0  
Classical decision tree model is one of the classical machine learning models for its simplicity and effectiveness in applications. However, compared to the DT model, probability estimation trees (PETs) give a better estimation on class probability. In order to get a good probability estimation, we usually need large trees which are not desirable with respect to model transparency. Linguistic decision tree (LDT) is a PET model based on label semantics. Fuzzy labels are used for building the tree and each branch is associated with a probability distribution over classes. If there is no overlap between neighboring fuzzy labels, these fuzzy labels then become discrete labels and a LDT with discrete labels becomes a special case of the PET model. In this paper, two hybrid models by combining the naive Bayes classifier and PETs are proposed in order to build a model with good performance without losing too much transparency. The first model uses naive Bayes estimation given a PET, and the second model uses a set of small-sized PETs as estimators by assuming the independence between these trees. Empirical studies on discrete and fuzzy labels show that the first model outperforms the PET model at shallow depth, and the second model is equivalent to the naive Bayes and PET.  相似文献   

5.
马尔可夫逻辑网络是将马尔可夫网络与一阶逻辑相结合的一种统计关系学习模型,在自然语言处理、复杂网络、信息抽取等领域都有重要的应用前景.较为全面、深入地总结了马尔可夫逻辑网络的理论模型、推理、权重和结构学习,最后指出了马尔可夫逻辑网络未来的主要研究方向.  相似文献   

6.
The random-trial incremental (RTI) model of human associative learning proposes that learning due to a trial where the association is presented proceeds incrementally; but with a certain probability, constant across trials, no learning occurs due to a trial. Based on RTI, identifying a policy for sequencing presentation trials of different associations for maximizing overall learning can be accomplished via a Markov decision process. For both finite and infinite horizons and a quite general structure of costs and rewards, a policy that on each trial presents an association that leads to the maximum expected immediate net reward is optimal  相似文献   

7.
统计关系学习模型Markov逻辑网综述   总被引:1,自引:0,他引:1  
统计关系学习是人工智能研究的热点,在生物信息学、地理信息系统和自然语言理解等领域有着重要应用,Markov逻辑网是将Markov网与一阶逻辑相结合的一种全新的统计关系学习模型。介绍了Markov逻辑网的理论模型和学习方法,并探讨了目前存在的问题和研究方向。  相似文献   

8.
In real-world classification problems, different types of misclassification errors often have asymmetric costs, thus demanding cost-sensitive learning methods that attempt to minimize average misclassification cost rather than plain error rate. Instance weighting and post hoc threshold adjusting are two major approaches to cost-sensitive classifier learning. This paper compares the effects of these two approaches on several standard, off-the-shelf classification methods. The comparison indicates that the two approaches lead to similar results for some classification methods, such as Naïve Bayes, logistic regression, and backpropagation neural network, but very different results for other methods, such as decision tree, decision table, and decision rule learners. The findings from this research have important implications on the selection of the cost-sensitive classifier learning approach as well as on the interpretation of a recently published finding about the relative performance of Naïve Bayes and decision trees.  相似文献   

9.
Using AUC and accuracy in evaluating learning algorithms   总被引:14,自引:0,他引:14  
The area under the ROC (receiver operating characteristics) curve, or simply AUC, has been traditionally used in medical diagnosis since the 1970s. It has recently been proposed as an alternative single-number measure for evaluating the predictive ability of learning algorithms. However, no formal arguments were given as to why AUC should be preferred over accuracy. We establish formal criteria for comparing two different measures for learning algorithms and we show theoretically and empirically that AUC is a better measure (defined precisely) than accuracy. We then reevaluate well-established claims in machine learning based on accuracy using AUC and obtain interesting and surprising new results. For example, it has been well-established and accepted that Naive Bayes and decision trees are very similar in predictive accuracy. We show, however, that Naive Bayes is significantly better than decision trees in AUC. The conclusions drawn in this paper may make a significant impact on machine learning and data mining applications.  相似文献   

10.
Machine Learning for Intelligent Processing of Printed Documents   总被引:1,自引:0,他引:1  
A paper document processing system is an information system component which transforms information on printed or handwritten documents into a computer-revisable form. In intelligent systems for paper document processing this information capture process is based on knowledge of the specific layout and logical structures of the documents. This article proposes the application of machine learning techniques to acquire the specific knowledge required by an intelligent document processing system, named WISDOM++, that manages printed documents, such as letters and journals. Knowledge is represented by means of decision trees and first-order rules automatically generated from a set of training documents. In particular, an incremental decision tree learning system is applied for the acquisition of decision trees used for the classification of segmented blocks, while a first-order learning system is applied for the induction of rules used for the layout-based classification and understanding of documents. Issues concerning the incremental induction of decision trees and the handling of both numeric and symbolic data in first-order rule learning are discussed, and the validity of the proposed solutions is empirically evaluated by processing a set of real printed documents.  相似文献   

11.
Classification problems have a long history in the machine learning literature. One of the simplest, and yet most consistently well-performing set of classifiers is the Naïve Bayes models. However, an inherent problem with these classifiers is the assumption that all attributes used to describe an instance are conditionally independent given the class of that instance. When this assumption is violated (which is often the case in practice) it can reduce classification accuracy due to “information double-counting” and interaction omission. In this paper we focus on a relatively new set of models, termed Hierarchical Naïve Bayes models. Hierarchical Naïve Bayes models extend the modeling flexibility of Naïve Bayes models by introducing latent variables to relax some of the independence statements in these models. We propose a simple algorithm for learning Hierarchical Naïve Bayes models in the context of classification. Experimental results show that the learned models can significantly improve classification accuracy as compared to other frameworks.  相似文献   

12.
基于隐马尔科夫模型的基因识别系统设计与实现   总被引:2,自引:0,他引:2  
随着基因组研究的发展,利用机器学习方法进行基因识别被广泛使用,这些方法包括神经网络算法、基于规则的方法、决策树、概率推理等。文章描述了一种基于隐马尔科夫模型的基因识别系统,介绍了EM训练算法和Viterbi序列分析算法,该系统运用Burset&Guigo的公共数据集进行测试,核苷识别的Sn和Sp两个参数分别可以达到68%和88%,外显子识别的Sn和Sp参数达到60%和63%。  相似文献   

13.
Variational Bayes learning or mean field approximation is widely used in statistical models which are made of mixtures of exponential distributions, for example, normal mixtures, binomial mixtures, and hidden Markov models. To derive variational Bayes learning algorithm, we need to determine the hyperparameters in the a priori distribution; however, the design method of hyperparameters has not yet been established. In the present paper, we propose two different design methods of hyperparameters which are applied to the different purposes. In the former method, the hyperparameter is determined for minimization of the generalization error. In the latter method, it is chosen so that candidates of hidden structure in training data are extracted. It is experimentally shown that the optimal hyperparameters for two purposes are different from each other.  相似文献   

14.
Combining Classifiers with Meta Decision Trees   总被引:4,自引:0,他引:4  
The paper introduces meta decision trees (MDTs), a novel method for combining multiple classifiers. Instead of giving a prediction, MDT leaves specify which classifier should be used to obtain a prediction. We present an algorithm for learning MDTs based on the C4.5 algorithm for learning ordinary decision trees (ODTs). An extensive experimental evaluation of the new algorithm is performed on twenty-one data sets, combining classifiers generated by five learning algorithms: two algorithms for learning decision trees, a rule learning algorithm, a nearest neighbor algorithm and a naive Bayes algorithm. In terms of performance, stacking with MDTs combines classifiers better than voting and stacking with ODTs. In addition, the MDTs are much more concise than the ODTs and are thus a step towards comprehensible combination of multiple classifiers. MDTs also perform better than several other approaches to stacking.  相似文献   

15.
In multi-label classification, examples can be associated with multiple labels simultaneously. The task of learning from multi-label data can be addressed by methods that transform the multi-label classification problem into several single-label classification problems. The binary relevance approach is one of these methods, where the multi-label learning task is decomposed into several independent binary classification problems, one for each label in the set of labels, and the final labels for each example are determined by aggregating the predictions from all binary classifiers. However, this approach fails to consider any dependency among the labels. Aiming to accurately predict label combinations, in this paper we propose a simple approach that enables the binary classifiers to discover existing label dependency by themselves. An experimental study using decision trees, a kernel method as well as Naïve Bayes as base-learning techniques shows the potential of the proposed approach to improve the multi-label classification performance.  相似文献   

16.
Current learning approaches to computer vision have mainly focussed on low-level image processing and object recognition, while tending to ignore high-level processing such as understanding. Here we propose an approach to object recognition that facilitates the transition from recognition to understanding. The proposed approach embraces the synergistic spirit of soft computing, exploiting the global search powers of genetic programming to determine fuzzy probabilistic models. It begins by segmenting the images into regions using standard image processing approaches, which are subsequently classified using a discovered fuzzy Cartesian granule feature classifier. Understanding is made possible through the transparent and succinct nature of the discovered models. The recognition of roads in images is taken as an illustrative problem in the vision domain. The discovered fuzzy models while providing high levels of accuracy (97%), also provide understanding of the problem domain through the transparency of the learnt models. The learning step in the proposed approach is compared with other techniques such as decision trees, naïve Bayes and neural networks using a variety of performance criteria such as accuracy, understandability and efficiency.  相似文献   

17.
In the distribution-independent model of concept learning of Valiant, Angluin and Laird have introduced a formal model of noise process, called classification noise process, to study how to compensate for randomly introduced errors, or noise, in classifying the example data. In this article, we investigate the problem of designing efficient learning algorithms in the presence of classification noise. First, we develop a technique of building efficient robust learning algorithms, called noise-tolerant Occam algorithms, and show that using them, one can construct a polynomial-time algorithm for learning a class of Boolean functions in the presence of classification noise. Next, as an instance of such problems of learning in the presence of classification noise, we focus on the learning problem of Boolean functions represented by decision trees. We present a noise-tolerant Occam algorithm for k-DL (the class of decision lists with conjunctive clauses of size at most k at each decision introduced by Rivest) and hence conclude that k-DL is polynomially learnable in the presence of classification noise. Further, we extend the noise-tolerant Occam algorithm for k-DL to one for r-DT (the class of decision trees of rank at most r introduced by Ehrenfeucht and Haussler) and conclude that r-DT is polynomially learnable in the presence of classification noise.  相似文献   

18.
Many important real-world applications of machine learning, statistical physics, constraint programming and information theory can be formulated using graphical models that involve determinism and cycles. Accurate and efficient inference and training of such graphical models remains a key challenge. Markov logic networks (MLNs) have recently emerged as a popular framework for expressing a number of problems which exhibit these properties. While loopy belief propagation (LBP) can be an effective solution in some cases; unfortunately, when both determinism and cycles are present, LBP frequently fails to converge or converges to inaccurate results. As such, sampling based algorithms have been found to be more effective and are more popular for general inference tasks in MLNs. In this paper, we introduce Generalized arc-consistency Expectation Maximization Message-Passing (GEM-MP), a novel message-passing approach to inference in an extended factor graph that combines constraint programming techniques with variational methods. We focus our experiments on Markov logic and Ising models but the method is applicable to graphical models in general. In contrast to LBP, GEM-MP formulates the message-passing structure as steps of variational expectation maximization. Moreover, in the algorithm we leverage the local structures in the factor graph by using generalized arc consistency when performing a variational mean-field approximation. Thus each such update increases a lower bound on the model evidence. Our experiments on Ising grids, entity resolution and link prediction problems demonstrate the accuracy and convergence of GEM-MP over existing state-of-the-art inference algorithms such as MC-SAT, LBP, and Gibbs sampling, as well as convergent message passing algorithms such as the concave–convex procedure, residual BP, and the L2-convex method.  相似文献   

19.
Functional Trees   总被引:1,自引:0,他引:1  
In the context of classification problems, algorithms that generate multivariate trees are able to explore multiple representation languages by using decision tests based on a combination of attributes. In the regression setting, model trees algorithms explore multiple representation languages but using linear models at leaf nodes. In this work we study the effects of using combinations of attributes at decision nodes, leaf nodes, or both nodes and leaves in regression and classification tree learning. In order to study the use of functional nodes at different places and for different types of modeling, we introduce a simple unifying framework for multivariate tree learning. This framework combines a univariate decision tree with a linear function by means of constructive induction. Decision trees derived from the framework are able to use decision nodes with multivariate tests, and leaf nodes that make predictions using linear functions. Multivariate decision nodes are built when growing the tree, while functional leaves are built when pruning the tree. We experimentally evaluate a univariate tree, a multivariate tree using linear combinations at inner and leaf nodes, and two simplified versions restricting linear combinations to inner nodes and leaves. The experimental evaluation shows that all functional trees variants exhibit similar performance, with advantages in different datasets. In this study there is a marginal advantage of the full model. These results lead us to study the role of functional leaves and nodes. We use the bias-variance decomposition of the error, cluster analysis, and learning curves as tools for analysis. We observe that in the datasets under study and for classification and regression, the use of multivariate decision nodes has more impact in the bias component of the error, while the use of multivariate decision leaves has more impact in the variance component.  相似文献   

20.
链接预测是对实体间的关系进行预测,是一个重要而复杂的任务。传统同类独立同概率分布的方法会带来很大的噪音,导致预测效果很差。将Markov逻辑网应用到链接预测中,旨在改善这一问题。Markov逻辑网是将Markov网与一阶逻辑结合的统计关系学习方法。利用Markov逻辑网构建关系模型,对实体之间是否存在链接关系以及当链接关系存在时预测此链接关系的类型。针对两个数据集的实验结果显示了采用Markov逻辑网模型要比传统链接预测模型有更好的效果,进而为Markov逻辑网解决实际问题提供了依据。  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号