Similar literature
 Found 20 similar documents (search time: 15 ms)
1.
Three-dimensional models, or pharmacophores, describing Euclidean constraints on the location on small molecules of functional groups (like hydrophobic groups, hydrogen acceptors and donors, etc.), are often used in drug design to describe the medicinal activity of potential drugs (or ‘ligands’). This medicinal activity is produced by interaction of the functional groups on the ligand with a binding site on a target protein. In identifying structure-activity relations of this kind there are three principal issues: (1) It is often difficult to “align” the ligands in order to identify common structural properties that may be responsible for activity; (2) Ligands in solution can adopt different shapes (or ‘conformations’) arising from torsional rotations about bonds. The 3-D molecular substructure is typically sought on one or more low-energy conformers; and (3) Pharmacophore models must, ideally, predict medicinal activity on some quantitative scale. It has been shown that the logical representation adopted by Inductive Logic Programming (ILP) naturally resolves many of the difficulties associated with the alignment and multi-conformation issues. However, the predictions of models constructed by ILP have hitherto only been nominal, predicting medicinal activity to be present or absent. In this paper, we investigate the construction of two kinds of quantitative pharmacophoric models with ILP: (a) Models that predict the probability that a ligand is “active”; and (b) Models that predict the actual medicinal activity of a ligand. Quantitative predictions are obtained by utilising the following statistical procedures as background knowledge: logistic regression and naive Bayes, for probability prediction; linear and kernel regression, for activity prediction. The multi-conformation issue and, more generally, the relational representation used by ILP result in some special difficulties in the use of any statistical procedure.
We present the principal issues and some solutions. Specifically, using data on the inhibition of the protease Thermolysin, we demonstrate that it is possible for an ILP program to construct good quantitative structure-activity models. We also comment on the relationship of this work to other recent developments in statistical relational learning. Editors: Tamás Horváth and Akihiro Yamamoto

2.
3.
Machine Learning - Efficient omission of symmetric solution candidates is essential for combinatorial problem-solving. Most of the existing approaches are instance-specific and focus on the...

4.
5.
One of the main issues when using inductive logic programming (ILP) in practice remains the long running times that ILP systems need to induce a hypothesis. We explore the possibility of reducing the induction running times of systems that use asymmetric relative minimal generalisation (ARMG) by analysing the bottom clauses of examples that serve as inputs to the generalisation operator. Using the fact that the ARMG covers all of the examples and that it is a subset of the variabilization of one of the examples, we identify literals that cannot appear in the ARMG and remove them prior to computing the generalisation. We apply this procedure to the ProGolem ILP system and test its performance on several real-world data sets. The experimental results show an average speedup of \(36\,\%\) compared to the base ProGolem system and \(12\,\%\) compared to ProGolem extended with caching, both without a decrease in the accuracy of the produced hypotheses. We also observe that the gain from using the proposed method varies greatly, depending on the structure of the data set.

6.
7.
Inductive logic programming (ILP) is a sub‐field of machine learning that provides an excellent framework for multi‐relational data mining applications. The advantages of ILP have been successfully demonstrated in complex and relevant industrial and scientific problems. However, to produce valuable models, ILP systems often require long running times and large amounts of memory. In this paper we address fundamental issues that have a direct impact on the efficiency of ILP systems. Namely, we discuss how improvements in the indexing mechanisms of an underlying logic programming system benefit ILP performance. Furthermore, we propose novel data structures to reduce memory requirements and we suggest a new lazy evaluation technique to search the hypothesis space more efficiently. These proposals have been implemented in the April ILP system and evaluated using several well‐known data sets. The results observed show significant improvements in running time without compromising the accuracy of the models generated. Indeed, the combined techniques achieve speedups of several orders of magnitude on some data sets. Moreover, memory requirements are reduced in nearly half of the data sets. Copyright © 2008 John Wiley & Sons, Ltd.

8.
In Inductive Logic Programming (ILP), algorithms that are purely of the bottom-up or top-down type encounter several problems in practice. Since the majority of them are greedy, these algorithms stop when they find clauses in local optima, according to the “quality” measure used for evaluating the results. Moreover, when learning clauses one by one, the induced clauses become less and less interesting as the algorithm progresses to cover the few remaining examples. In this paper, we propose a simulated annealing framework to overcome these problems. Using a refinement operator, we define neighborhood relations on clauses and on hypotheses (i.e. sets of clauses). With these relations and appropriate quality measures, we show how to induce clauses (in a coverage approach), or to induce hypotheses directly, by using simulated annealing algorithms. We discuss the necessary conditions on the refinement operators and the evaluation measures to increase the effectiveness of the algorithm. Implementations (including a parallelized version of the algorithm) are described, and experimental results in terms of convergence of the method and in terms of accuracy are presented.
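As a rough illustration of the simulated annealing idea in the abstract above, the following Python sketch searches over candidates using a neighbourhood function and a quality measure. It is a generic textbook-style loop, not the authors' implementation: the names `simulated_annealing`, `toy_neighbours` and `toy_quality`, the geometric cooling schedule and all parameter values are assumptions made for this example, and the integer search space is only a stand-in for clauses or hypotheses linked by a refinement operator.

```python
import math
import random


def simulated_annealing(initial, neighbours, quality,
                        steps=200, t0=1.0, cooling=0.95, seed=0):
    """Generic simulated annealing; returns the best candidate seen.

    `neighbours(h)` plays the role of a refinement-based neighbourhood
    relation and `quality(h)` the role of the evaluation measure
    (both are hypothetical callables supplied by the caller).
    """
    rng = random.Random(seed)
    current = best = initial
    temperature = t0
    for _ in range(steps):
        candidates = list(neighbours(current))
        if not candidates:
            break
        candidate = rng.choice(candidates)
        delta = quality(candidate) - quality(current)
        # Always accept improvements; accept a worse candidate with
        # probability exp(delta / T), which shrinks as T cools.
        if delta >= 0 or rng.random() < math.exp(delta / temperature):
            current = candidate
            if quality(current) > quality(best):
                best = current
        temperature *= cooling  # geometric cooling (an assumed schedule)
    return best


# Toy stand-in for a hypothesis space: integers with neighbours x-1 and x+1.
def toy_neighbours(x):
    return (x - 1, x + 1)


def toy_quality(x):
    return -(x - 3) ** 2  # single optimum at x = 3


result = simulated_annealing(0, toy_neighbours, toy_quality)
```

Accepting occasional downhill moves is what lets the search leave the local optima in which a greedy learner would stop; in an ILP setting the candidates would be clauses or clause sets rather than integers.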

9.
Model transformation by example is a novel approach in model-driven software engineering to derive model transformation rules from an initial prototypical set of interrelated source and target models, which describe critical cases of the model transformation problem in a purely declarative way. In the current paper, we automate this approach using inductive logic programming (Muggleton and Raedt in J Logic Program 19-20:629–679, 1994) which aims at the inductive construction of first-order clausal theories from examples and background knowledge.
Dániel Varró (corresponding author)

10.
A logical system of inference rules intended to give the foundation of logic programs is presented. The distinguishing feature of the approach taken here is the application of the theory of inductive definitions, which allows us to treat various kinds of induction schemata uniformly and also allows us to regard negation as failure as a kind of induction schema. This approach corresponds to the so-called least fixpoint semantics. Moreover, in our formalism, logic programs are extended so that a condition of a clause may be any first-order formula. This makes it possible to write a quantified specification as a logic program. It also makes the class of induction schemata much larger, so that it includes the usual course-of-values inductions.

11.
The bounded ILP-consistency problem for function-free Horn clauses is described as follows. Given sets \(E^+\) and \(E^-\) of function-free ground Horn clauses and an integer \(k\) polynomial in \(E^+ \cup E^-\), does there exist a function-free Horn clause \(C\) with no more than \(k\) literals such that \(C\) subsumes each element in \(E^+\) and \(C\) does not subsume any element in \(E^-\)? It is shown that this problem is \(\Sigma_2^P\)-complete. We derive some related results on the complexity of ILP and discuss the usefulness of such complexity results.
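To make the consistency condition in the abstract above concrete, here is a minimal Python sketch that checks whether a given candidate clause C satisfies it for given positive and negative example sets and a bound k. It only verifies one candidate by backtracking over substitutions; it does not search for C. The clause encoding (a literal as a `(is_positive, predicate, arguments)` tuple, with capitalised argument names treated as variables) and all function names are assumptions made for this example.

```python
def is_variable(term):
    """Assumed convention: variables start with an uppercase letter."""
    return term[:1].isupper()


def match(literal, ground_literal, theta):
    """Extend substitution theta so that literal maps onto ground_literal.

    Returns the extended substitution, or None if no extension exists.
    """
    sign, pred, args = literal
    g_sign, g_pred, g_args = ground_literal
    if sign != g_sign or pred != g_pred or len(args) != len(g_args):
        return None
    theta = dict(theta)
    for term, value in zip(args, g_args):
        if is_variable(term):
            if theta.setdefault(term, value) != value:
                return None  # variable already bound to another constant
        elif term != value:
            return None  # constants must agree exactly
    return theta


def subsumes(clause, ground_clause):
    """True if some substitution maps every literal of clause into ground_clause."""
    literals = list(clause)

    def search(index, theta):
        if index == len(literals):
            return True
        for target in ground_clause:
            extended = match(literals[index], target, theta)
            if extended is not None and search(index + 1, extended):
                return True
        return False

    return search(0, {})


def is_consistent(clause, positives, negatives, k):
    """Check the bounded consistency condition for one candidate clause."""
    return (len(clause) <= k
            and all(subsumes(clause, e) for e in positives)
            and not any(subsumes(clause, e) for e in negatives))


# Candidate C:  p(X) :- q(X, Y)   (the body literal is stored as negative)
candidate = {(True, "p", ("X",)), (False, "q", ("X", "Y"))}
positives = [{(True, "p", ("a",)), (False, "q", ("a", "b"))}]
negatives = [{(True, "p", ("a",)), (False, "q", ("b", "a"))}]
consistent = is_consistent(candidate, positives, negatives, k=2)
```

The backtracking in `subsumes` can take exponential time in the number of literals, and a full solver would additionally have to search over candidate clauses, which is where the hardness stated in the abstract arises.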

12.
Computers & Chemistry, 2002, 26(1): 57-64
Inductive logic programming (ILP) has been applied to automatically discover protein fold signatures. This paper investigates the use of topological information to circumvent problems encountered during previous experiments, namely (1) matching of non-structurally related secondary structures and (2) scaling problems. Cross-validation tests were carried out for 20 folds. The overall estimated accuracy is 73.37±0.35%. The new representation allows us to process the complete set of examples, while previously it was necessary to sample the negative examples. Topological information is used in approximately 90% of the rules presented here. Information about the topology of a sheet is present in 63% of the rules. This set of rules presents characteristics of the overall architecture of the fold. In contrast, 26% of the rules contain topological information which is limited to the packing of a restricted number of secondary structures; as such, the latter set resembles those found in our previous studies.

13.
Inductive logic programming (ILP) has been applied to automatically discover protein fold signatures. This paper investigates the use of topological information to circumvent problems encountered during previous experiments, namely (1) matching of non-structurally related secondary structures and (2) scaling problems. Cross-validation tests were carried out for 20 folds. The overall estimated accuracy is 73.37±0.35%. The new representation allows us to process the complete set of examples, while previously it was necessary to sample the negative examples. Topological information is used in approximately 90% of the rules presented here. Information about the topology of a sheet is present in 63% of the rules. This set of rules presents characteristics of the overall architecture of the fold. In contrast, 26% of the rules contain topological information which is limited to the packing of a restricted number of secondary structures; as such, the latter set resembles those found in our previous studies.

14.
We describe a new approach to the application of stochastic search in Inductive Logic Programming (ILP). Unlike traditional approaches, we do not focus directly on evolving logical concepts; instead, our refinement-based approach uses the stochastic optimization process to iteratively adapt the initial working concept. Using context-sensitive concept refinements (adaptations) helps the search operations to produce mostly syntactically correct concepts. It also makes it possible to use available background knowledge both for efficiently restricting the search space and for directing the search. As a result, the search is more flexible and less problem-specific, and the framework can easily be used with any stochastic search algorithm within the ILP domain. Experimental results on several data sets verify the usefulness of this approach.

15.
16.
17.
It is argued that some symmetric structure in logic programs could be taken into account when implementing semantics in logic programming. This may enhance the declarative ability or expressive power of the semantics. The work presented here may be seen as representative examples along this line. The focus is on the derivation of negative information and some other classic semantic issues. We first define a permutation group associated with a given logic program. Since usually the canonical models used to reflect the common sense or intended meaning are minimal or completed models of the program, we expose the relationships between minimal models and completed models of the original program and its so-called G-reduced form newly-derived via the permutation group defined. By means of this G-reduced form, we introduce a rule to assume negative information termed G-CWA, which is actually a generalization of the GCWA. We also develop the notions of G-definite, G-hierarchical and G-stratified logic programs,

18.
In this paper we revise Muggleton’s theory of inverse entailment, which is the logical foundation of Progol, one of the most famous ILP systems. We first point out that the theory is incomplete in general. Secondly we prove that the theory is complete if the background knowledge given to the system is a ground reduced program, every training example is a ground unit clause, and the hypothesis space is the set of all definite clauses. The proof is obtained by showing that every ground reduced logic program is logically equivalent to the conjunction of all atoms in its least Herbrand model. As a corollary to this equivalence, we are finally able to improve the logical foundation of the GOLEM system.
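The least Herbrand model mentioned in the abstract above can be computed for a ground definite program by repeatedly applying the rules until nothing new is derived. The Python sketch below shows that standard fixpoint iteration only; it does not implement inverse entailment or the paper's notion of a reduced program. The representation of a rule as a `(head, body)` pair of atom strings and the name `least_herbrand_model` are assumptions made for this example.

```python
def least_herbrand_model(program):
    """Least Herbrand model of a ground definite program.

    `program` is a list of (head, body) pairs, where `head` is a ground
    atom (a string here) and `body` is a tuple of ground atoms; a fact
    has an empty body. This is the naive iteration to a fixpoint.
    """
    model = set()
    changed = True
    while changed:
        changed = False
        for head, body in program:
            # Fire a rule once all of its body atoms are already derived.
            if head not in model and all(atom in model for atom in body):
                model.add(head)
                changed = True
    return model


example_program = [
    ("p(a)", ()),           # fact
    ("q(a)", ("p(a)",)),    # q(a) :- p(a).
    ("r(a)", ("s(a)",)),    # r(a) :- s(a).  (s(a) is never derivable)
]
example_model = least_herbrand_model(example_program)
```

In this toy program the model contains `p(a)` and `q(a)` but not `r(a)`, since `s(a)` has no supporting rule.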

19.
Inductive Logic Programming (ILP) combines rule-based and statistical artificial intelligence methods by learning a hypothesis comprising a set of rules, given background knowledge and constraints for the search space. We focus on extending the XHAIL algorithm for ILP, which is based on Answer Set Programming, and we evaluate our extensions using the Natural Language Processing application of sentence chunking. With respect to processing natural language, ILP can cater for the constant change in how we use language on a daily basis. At the same time, ILP does not require the huge amounts of training examples that other statistical methods do, and it produces interpretable results, namely a set of rules that can be analysed and tweaked if necessary. As contributions we extend XHAIL with (i) a pruning mechanism within the hypothesis generalisation algorithm which enables learning from larger datasets, (ii) a better usage of modern solver technology using recently developed optimisation methods, and (iii) a time budget that permits the usage of suboptimal results. We evaluate these improvements on the task of sentence chunking using three datasets from a recent SemEval competition. Results show that our improvements allow for learning on bigger datasets with results that are of similar quality to state-of-the-art systems on the same task. Moreover, we compare the hypotheses obtained on datasets to gain insights on the structure of each dataset.

20.
A system structure supporting parallel processing in general, and parallel logic programming and expert system applications in particular, is described. It is not based on special hardware but has instead been designed as an evolutionary extension to most existing machine architectures. It is aimed at parallel processing support for languages such as PROLOG as well as for expert systems (and shells) implemented in a general-purpose language. A layered structure consisting of an extended machine interface and a macro language is chosen to support a range of applications.
