Similar literature
1.
Parallel and Sequential Algorithms for Data Mining Using Inductive Logic   Total citations: 4, self-citations: 1, citations by others: 3
Inductive logic is a research area at the intersection of machine learning and logic programming, and has been increasingly applied to data mining. Inductive logic studies learning from examples, within the framework provided by clausal logic. It provides a uniform and expressive means of representation: examples, background knowledge, and induced theories are all expressed in first-order logic. Such an expressive representation is computationally expensive, so it is natural to consider improving the performance of inductive logic data mining using parallelism. We present a parallelization technique for inductive logic, and implement a parallel version of a core inductive logic programming system: Progol. The technique provides perfect partitioning of computation and data access, and communication requirements are small, so almost linear speedup is readily achieved. However, we also show why the information flow of the technique permits superlinear speedup over the standard sequential algorithm. Performance results on several datasets and platforms are reported. The results have wider implications for the design of parallel and sequential data-mining algorithms. Received 30 August 2000 / Revised 30 January 2001 / Accepted in revised form 16 May 2001

2.
《Artificial Intelligence》1994,70(1-2):375-392
We present positive PAC-learning results for the nonmonotonic inductive logic programming setting. In particular, we show that first-order range-restricted clausal theories that consist of clauses with up to k literals of size at most j each are polynomial-sample polynomial-time PAC-learnable with one-sided error from positive examples only. In our framework, concepts are clausal theories and examples are finite interpretations. We discuss the problems encountered when learning theories which only have infinite nontrivial models and propose a way to avoid these problems using a representation change called flattening. Finally, we compare our results to PAC-learnability results for the normal inductive logic programming setting.

3.
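Research on Learning Methods Based on Inductive Logic Programming and Their Implementation   Total citations: 1, self-citations: 0, citations by others: 1
Inductive logic programming is a new method in the field of machine learning; it studies the construction of logic programs (new knowledge) from examples and background knowledge. This paper introduces the basic theory and methods of inductive logic programming and describes how this learning method has been applied in expert systems.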

4.
Inductive logic programming (ILP) algorithms are classification algorithms that construct classifiers represented as logic programs. ILP algorithms have a number of attractive features, notably the ability to make use of declarative background (user-supplied) knowledge. However, ILP algorithms deal poorly with large data sets (>10^4 examples) and their widespread use of the greedy set-covering algorithm renders them susceptible to local maxima in the space of logic programs. This paper presents a novel approach to address these problems based on combining the local search properties of an inductive logic programming algorithm with the global search properties of an evolutionary algorithm. The proposed algorithm may be viewed as an evolutionary wrapper around a population of ILP algorithms. The evolutionary wrapper approach is evaluated on two domains. The chess-endgame (KRK) problem is an artificial domain that is a widely used benchmark in inductive logic programming, and Part-of-Speech Tagging is a real-world problem from the field of Natural Language Processing. In the latter domain, data originates from excerpts of the Wall Street Journal. Results indicate that significant improvements in predictive accuracy can be achieved over a conventional ILP approach when data is plentiful and noisy.
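The greedy set-covering loop mentioned in this abstract can be illustrated with a minimal Python sketch. It is not the paper's algorithm or any particular ILP system: the rules here are plain Python predicates over integers instead of logic-program clauses, and the names greedy_cover, POSITIVES, NEGATIVES, and CANDIDATES are hypothetical. The sketch only shows the control flow: repeatedly keep the consistent rule that covers the most still-uncovered positive examples, and never revisit that choice, which is why such a search can settle in a local maximum.

```python
from typing import Callable, List, Set, Tuple

# A rule is a (name, predicate) pair. This is a toy stand-in for a clause.
Rule = Tuple[str, Callable[[int], bool]]


def rule_score(rule: Rule, uncovered: Set[int], negatives: Set[int]) -> int:
    """Number of uncovered positives the rule covers, or -1 if it covers a negative."""
    _, covers = rule
    if any(covers(n) for n in negatives):
        return -1
    return sum(1 for p in uncovered if covers(p))


def greedy_cover(
    positives: Set[int], negatives: Set[int], candidates: List[Rule]
) -> Tuple[List[Rule], Set[int]]:
    """Greedy set covering: add the best consistent rule until no rule helps."""
    uncovered = set(positives)
    theory: List[Rule] = []
    while uncovered:
        best = max(candidates, key=lambda r: rule_score(r, uncovered, negatives))
        if rule_score(best, uncovered, negatives) <= 0:
            break  # no consistent rule covers any remaining positive example
        theory.append(best)  # greedy step: this choice is never reconsidered
        uncovered = {p for p in uncovered if not best[1](p)}
    return theory, uncovered


# Hypothetical toy data (assumption, not taken from the paper).
POSITIVES = {2, 4, 6, 9}
NEGATIVES = {1, 5, 7}
CANDIDATES: List[Rule] = [
    ("even", lambda x: x % 2 == 0),
    ("multiple_of_3", lambda x: x % 3 == 0),
    ("greater_than_3", lambda x: x > 3),  # rejected: it also covers negatives
]
```

Calling greedy_cover(POSITIVES, NEGATIVES, CANDIDATES) returns the selected rules together with any positive examples left uncovered. A wrapper that searches globally, such as the evolutionary one the abstract proposes, would sit outside a loop of this kind.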

5.
李艳娟  郭茂祖 《电脑学习》2012,2(3):13-17,22
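Inductive logic programming is a research field formed at the intersection of machine learning and logic programming. It overcomes two main limitations of traditional machine learning methods, namely restrictions on knowledge representation and on the use of background knowledge, and has become a frontier research topic in machine learning. The paper first gives an overview of inductive logic programming systems in terms of their origins, definition, application areas, and problem setting; it then summarizes and analyzes the current state of research on inductive logic programming methods; finally, it discusses directions for further research in the field.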

6.
Structured machine learning: the next ten years   Total citations: 4, self-citations: 1, citations by others: 3
The field of inductive logic programming (ILP) has made steady progress, since the first ILP workshop in 1991, based on a balance of developments in theory, implementations and applications. More recently there has been an increased emphasis on Probabilistic ILP and the related fields of Statistical Relational Learning (SRL) and Structured Prediction. The goal of the current paper is to consider these emerging trends and chart out the strategic directions and open problems for the broader area of structured machine learning for the next 10 years.

7.
A Multistrategy Approach to Relational Knowledge Discovery in Databases   Total citations: 1, self-citations: 0, citations by others: 1
When learning from very large databases, the reduction of complexity is extremely important. Two extremes of making knowledge discovery in databases (KDD) feasible have been put forward. One extreme is to choose a very simple hypothesis language, thereby being capable of very fast learning on real-world databases. The opposite extreme is to select a small data set, thereby being able to learn very expressive (first-order logic) hypotheses. A multistrategy approach allows one to include most of these advantages and exclude most of the disadvantages. Simpler learning algorithms detect hierarchies which are used to structure the hypothesis space for a more complex learning algorithm. The better structured the hypothesis space is, the better learning can prune away uninteresting or losing hypotheses and the faster it becomes. We have combined inductive logic programming (ILP) directly with a relational database management system. The ILP algorithm is controlled in a model-driven way by the user and in a data-driven way by structures that are induced by three simple learning algorithms.

8.
Inductive logic programming (ILP) is a sub-field of machine learning that provides an excellent framework for multi-relational data mining applications. The advantages of ILP have been successfully demonstrated in complex and relevant industrial and scientific problems. However, to produce valuable models, ILP systems often require long running times and large amounts of memory. In this paper we address fundamental issues that have direct impact on the efficiency of ILP systems. Namely, we discuss how improvements in the indexing mechanisms of an underlying logic programming system benefit ILP performance. Furthermore, we propose novel data structures to reduce memory requirements and we suggest a new lazy evaluation technique to search the hypothesis space more efficiently. These proposals have been implemented in the April ILP system and evaluated using several well-known data sets. The results observed show significant improvements in running time without compromising the accuracy of the models generated. Indeed, the combined techniques achieve speedups of several orders of magnitude on some data sets. Moreover, memory requirements are reduced in nearly half of the data sets. Copyright © 2008 John Wiley & Sons, Ltd.

9.
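To address the shortcomings of current inductive logic programming (ILP) systems, which require sufficient training data and cannot exploit unlabeled data, this paper proposes relational tri-training (R-tri-training), an algorithm that learns first-order rules using unlabeled data. The algorithm brings the idea of tri-training, a semi-supervised learning algorithm based on propositional-logic representation, into ILP systems based on first-order-logic representation, and studies how unlabeled examples can assist classifier training within the ILP framework. R-tri-training first initializes three different ILP systems from the labeled data and background knowledge, and then iteratively refines the three classifiers with unlabeled examples: if two classifiers agree on the label of an unlabeled example, then under certain conditions that example is labeled and given to the third classifier as a new training example. Experimental results on standard data sets show that R-tri-training can effectively use unlabeled data to improve learning performance, and that it outperforms GILP (genetic inductive logic programming), NFOIL, KFOIL, and ALEPH.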
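The agreement step described in this abstract can be sketched in a few lines of Python. This is a simplified illustration under loud assumptions, not the R-tri-training algorithm itself: the three learners are toy single-feature threshold classifiers rather than ILP systems, the "certain conditions" the abstract mentions (left unspecified there) are omitted, pseudo-labels are accumulated rather than re-estimated each round, and all names (train_threshold, tri_training, LABELED, UNLABELED) are hypothetical.

```python
from collections import Counter
from typing import Callable, List, Sequence, Tuple

Example = Tuple[float, ...]
Labeled = Tuple[Example, int]
Classifier = Callable[[Example], int]


def train_threshold(data: Sequence[Labeled], feature: int) -> Classifier:
    """Toy stand-in learner: threshold one feature midway between the class means."""
    pos = [x[feature] for x, y in data if y == 1]
    neg = [x[feature] for x, y in data if y == 0]
    if not pos or not neg:
        majority = Counter(y for _, y in data).most_common(1)[0][0]
        return lambda x: majority
    pos_mean, neg_mean = sum(pos) / len(pos), sum(neg) / len(neg)
    threshold = (pos_mean + neg_mean) / 2
    sign = 1 if pos_mean >= neg_mean else -1
    return lambda x: 1 if sign * (x[feature] - threshold) >= 0 else 0


def tri_training(
    labeled: Sequence[Labeled], unlabeled: Sequence[Example], rounds: int = 5
) -> Classifier:
    """Three learners teach each other: when two agree, the third gets the example."""
    train_sets: List[List[Labeled]] = [list(labeled) for _ in range(3)]
    classifiers = [train_threshold(train_sets[i], i) for i in range(3)]
    for _ in range(rounds):
        changed = False
        for i in range(3):
            j, k = [v for v in range(3) if v != i]
            seen = {x for x, _ in train_sets[i]}
            for x in unlabeled:
                if x in seen:
                    continue
                label_j, label_k = classifiers[j](x), classifiers[k](x)
                if label_j == label_k:  # the two other learners agree
                    train_sets[i].append((x, label_j))
                    seen.add(x)
                    changed = True
        if not changed:
            break  # no learner received a new training example
        classifiers = [train_threshold(train_sets[i], i) for i in range(3)]

    def predict(x: Example) -> int:
        """Majority vote of the three refined classifiers."""
        return 1 if sum(c(x) for c in classifiers) >= 2 else 0

    return predict


# Hypothetical toy data (assumption): three numeric features per example.
LABELED: List[Labeled] = [((0.0, 0.0, 0.0), 0), ((1.0, 1.0, 1.0), 1)]
UNLABELED: List[Example] = [
    (0.1, 0.2, 0.1), (0.9, 0.8, 0.95), (0.2, 0.1, 0.3), (0.8, 0.9, 0.7),
]
```

Diversity comes here from giving each learner a different feature; in the setting the abstract describes, it would come from initializing three different ILP systems with the same labeled data and background knowledge.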

10.
Attribute-value based representations, standard in today's data mining systems, have a limited expressiveness. Inductive Logic Programming provides an interesting alternative, particularly for learning from structured examples whose parts, each with its own attributes, are related to each other by means of first-order predicates. Several subsets of first-order logic (FOL) with different expressive power have been proposed in Inductive Logic Programming (ILP). The challenge lies in the fact that the more expressive the subset of FOL the learner works with, the more critical the dimensionality of the learning task. The Datalog language is expressive enough to represent realistic learning problems when data is given directly in a relational database, making it a suitable tool for data mining. Consequently, it is important to elaborate techniques that will dynamically decrease the dimensionality of learning tasks expressed in Datalog, just as Feature Subset Selection (FSS) techniques do in attribute-value learning. The idea of re-using these techniques in ILP runs immediately into a problem as ILP examples have variable size and do not share the same set of literals. We propose here the first paradigm that brings Feature Subset Selection to the level of ILP, in languages at least as expressive as Datalog. The main idea is to first perform a change of representation, which approximates the original relational problem by a multi-instance problem. The representation obtained as the result is suitable for FSS techniques which we adapted from attribute-value learning by taking into account some of the characteristics of the data due to the change of representation. We present the simple FSS proposed for the task, the requisite change of representation, and the entire method combining those two algorithms. The method acts as a filter, preprocessing the relational data, prior to the model building, which outputs relational examples with empirically relevant literals. We discuss experiments in which the method was successfully applied to two real-world domains.

11.
This paper demonstrates the capabilities of FOIDL, an inductive logic programming (ILP) system whose distinguishing characteristics are the ability to produce first-order decision lists, the use of an output completeness assumption as a substitute for negative examples, and the use of intensional background knowledge. FOIDL was originally motivated by the problem of learning to generate the past tense of English verbs; however, this paper demonstrates its superior performance on two different sets of benchmark ILP problems. Tests on the finite element mesh design problem show that FOIDL's decision lists enable it to produce generally more accurate results than a range of methods previously applied to this problem. Tests with a selection of list-processing problems from Bratko's introductory Prolog text demonstrate that the combination of implicit negatives and intensionality allows FOIDL to learn correct programs from far fewer examples than FOIL. This research was supported by a fellowship from AT&T awarded to the first author and by the National Science Foundation under grant IRI-9310819. Mary Elaine Califf: She is currently pursuing her doctorate in Computer Science at the University of Texas at Austin where she is supported by a fellowship from AT&T. Her research interests include natural language understanding, particularly using machine learning methods to build practical natural language understanding systems such as information extraction systems, and inductive logic programming. Raymond Joseph Mooney: He is an Associate Professor of Computer Sciences at the University of Texas at Austin. He received his Ph.D. in Computer Science from the University of Illinois at Urbana-Champaign in 1988. His current research interests include applying machine learning to natural language understanding, inductive logic programming, knowledge-base and theory refinement, learning for planning, and learning for recommender systems. He serves on the editorial boards of the journal New Generation Computing, the Machine Learning journal, the Journal of Artificial Intelligence Research, and the journal Applied Intelligence.

12.
Cropper  Andrew  Morel  Rolf 《Machine Learning》2021,110(4):801-856
Machine Learning - We describe an inductive logic programming (ILP) approach called learning from failures. In this approach, an ILP system (the learner) decomposes the learning problem into three...

13.
Introducing fuzzy predicates in inductive logic programming may serve two different purposes: allowing for more adaptability when learning classical rules or getting more expressivity by learning fuzzy rules. This latter concern is the topic of this paper. Indeed, introducing fuzzy predicates in the antecedent and in the consequent of rules may convey different non-classical meanings. The paper focuses on the learning of gradual and certainty rules, which have an increased expressive power and have no simple crisp counterpart. The benefit and the application domain of each kind of rules are discussed. Appropriate confidence degrees for each type of rules are introduced. These confidence degrees play a major role in the adaptation of the classical FOIL inductive logic programming algorithm to the induction of fuzzy rules for guiding the learning process. The method is illustrated on a benchmark example and a case-study database.

14.
Document image understanding denotes the recognition of semantically relevant components in the layout extracted from a document image. This recognition process is based on domain-specific knowledge that can be acquired automatically by applying data mining techniques. The spatial dimension of page layout makes classification methods developed in inductive logic programming (ILP) and multi-relational data mining (MRDM) the most suitable candidates for this specific task. In this paper, both approaches are considered and empirically compared on three different data sets consisting of multi-page articles published in an international journal and historical documents. The ILP method is able to learn recursive logical theories that express dependencies between logical components, while the MRDM method extends the naïve Bayesian classifier to data stored in multiple tables of a relational database. Experimental results confirm the importance of the spatial dimension for this application and show that the ILP method tends to be conservative with a high (low) percentage of omission (commission) errors, while the probabilistic nature of the MRDM method allows us to trade off between the two types of error.

15.
刘宙  程学先  刘宇 《微机发展》2006,16(11):28-31
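Semantic Web data mining is data mining carried out in the Semantic Web environment, and it raises new topics for applied research on data mining techniques. Inductive logic programming is a research field formed at the intersection of machine learning and logic programming; it provides new and powerful technical support for application areas of artificial intelligence such as knowledge engineering. The paper analyzes the limitations of several commonly used data mining techniques when applied in the Semantic Web environment, proposes inductive logic programming (ILP) as a suitable data mining technique for the Semantic Web, gives an algorithmic description of how to apply the technique, and verifies its feasibility through a concrete example.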

16.
Inductive logic programming (ILP) induces concepts from a set of positive examples, a set of negative examples, and background knowledge. ILP has been applied to tasks such as natural language processing, finite element mesh design, network mining, robotics, and drug discovery. These data sets usually contain numerical and multivalued categorical attributes; however, only a few relational learning systems are capable of handling them in an efficient way. In this paper, we present an evolutionary approach, called Grouping and Discretization for Enriching the Background Knowledge (GDEBaK), to deal with numerical and multivalued categorical attributes in ILP. This method uses evolutionary operators to create and test numerical splits and subsets of categorical values in accordance with a fitness function. The best subintervals and subsets are added to the background knowledge before constructing candidate hypotheses. We implemented GDEBaK embedded in Aleph and compared it to lazy discretization in Aleph and discretization in Top-down Induction of Logical Decision Trees (TILDE) systems. The results obtained showed that our method improves accuracy and reduces the number of rules in most cases. Finally, we discuss these results and possible lines for future work.

17.

Inductive logic programming combines both machine learning and logic programming techniques. ILP uses first-order predicate logic restricted to Horn clauses as an underlying language. Thus, programs induced by an ILP system inherit the classical limitations of PROLOG programs. Constraint logic programming avoids some of the limitations of logic programming, and so ILP aims to induce programs that employ this paradigm. Current ILP systems that induce constrained logic programs extend systems based on the normal semantics of ILP. In this article we introduce IC-Log, a new system that induces constrained logic programs and relies on an extension of a nonmonotonic semantics-based system. We then present an application of IC-Log in the field of computer-aided publishing.

18.
Clausal Discovery   Total citations: 5, self-citations: 0, citations by others: 5
De Raedt  Luc  Dehaspe  Luc 《Machine Learning》1997,26(2-3):99-146
