期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

Data Mining: From Procedural to Declarative Approaches

Hendrik Blockeel 《New Generation Computing》2015,33(2):115-135

相似文献

2.

Learning directed probabilistic logical models: ordering-search versus structure-search

Daan Fierens Jan Ramon Maurice Bruynooghe Hendrik Blockeel 《Annals of Mathematics and Artificial Intelligence》2008,54(1-3):99-133

We discuss how to learn non-recursive directed probabilistic logical models from relational data. This problem has been tackled before by upgrading the structure-search algorithm initially proposed for Bayesian networks. In this paper we show how to upgrade another algorithm for learning Bayesian networks, namely ordering-search. For Bayesian networks, ordering-search was found to work better than structure-search. It is non-obvious that these results carry over to the relational case, however, since there ordering-search needs to be implemented quite differently. Hence, we perform an experimental comparison of these upgraded algorithms on four relational domains. We conclude that also in the relational case ordering-search is competitive with structure-search in terms of quality of the learned models, while ordering-search is significantly faster. 相似文献

3.

Diterpene structure elucidation from 13cnmr spectra with inductive logic programming

Saso Dzeroski Steffen Schulze-Kremer Karsten R. Heidtke Karsten Siems Dietrich Wettschereck Hendrik Blockeel 《Applied Artificial Intelligence》2013,27(5):363-383

We present a novel application ofInductive Logic Programming (ILP) to the problem of diterpene structure elucidation from 13 CNMR spectra. Diterpenes are organic compounds oflow molecular weight with a skeleton of 20 carbon atoms. They are of significant chemical and commercial interest because oftheir use as lead compounds in the search for new pharmaceutical effectors. The interpretation of diterpene 13 CNMR spectra normally requires specialists with detailed spectroscopic knowledge and substantial experience in natural products chemistry, specifically knowledge on peak patterns and chemical structures. Given a database ofpeak patterns for diterpenes with known structure, we apply several ILP approaches to discover correlations between peak patterns and chemical structure. The approaches used include first - order inductive learning, relational instance based learning, induction oflogical decision trees, and inductive constraint logic. Performance close to that of domain experts is achieved, which suffices for practical use. 相似文献

4.

Generalized ordering-search for learning directed probabilistic logical models

Jan Ramon Tom Croonenborghs Daan Fierens Hendrik Blockeel Maurice Bruynooghe 《Machine Learning》2008,70(2-3):169-188

Recently, there has been an increasing interest in directed probabilistic logical models and a variety of formalisms for describing such models has been proposed. Although many authors provide high-level arguments to show that in principle models in their formalism can be learned from data, most of the proposed learning algorithms have not yet been studied in detail. We introduce an algorithm, generalized ordering-search, to learn both structure and conditional probability distributions (CPDs) of directed probabilistic logical models. The algorithm is based on the ordering-search algorithm for Bayesian networks. We use relational probability trees as a representation for the CPDs. We present experiments on a genetics domain, blocks world domains and the Cora dataset. Editors: Stephen Muggleton, Ramon Otero, Simon Colton. 相似文献

5.

Isidd: An interactive system for inductive database design

Hendrik Blockeel Luc DE Raedt 《Applied Artificial Intelligence》2013,27(5):385-420

When designing a deductive database, the designer has to decide for each predicate (or relation) whether it should be defined extensionally or intensionally and what the definition should look like. An intelligent interactive system is presented to assist the designer in this task. It starts from an example state ofa database in which all predicates are defined extensionally. It can then compact the database by transforming extensionally defined predicates into intensionally defined ones. These predicates can be chosen by the user or by the system itself. Further compaction is possible by inventing new predicates; this invention is controlled by user-specified templates. The systemalso proposes semantic integrity constraints to the user. These do not lead to extra compaction but can be used to make the database more robust. The intelligent system employs techniques from the area of inductive logic programming. 相似文献

6.

Decision trees for hierarchical multi-label classification 总被引：3，自引：0，他引：3

Celine Vens Jan Struyf Leander Schietgat Sašo Džeroski Hendrik Blockeel 《Machine Learning》2008,73(2):185-214

Hierarchical multi-label classification (HMC) is a variant of classification where instances may belong to multiple classes at the same time and these classes are organized in a hierarchy. This article presents several approaches to the induction of decision trees for HMC, as well as an empirical study of their use in functional genomics. We compare learning a single HMC tree (which makes predictions for all classes together) to two approaches that learn a set of regular classification trees (one for each class). The first approach defines an independent single-label classification task for each class (SC). Obviously, the hierarchy introduces dependencies between the classes. While they are ignored by the first approach, they are exploited by the second approach, named hierarchical single-label classification (HSC). Depending on the application at hand, the hierarchy of classes can be such that each class has at most one parent (tree structure) or such that classes may have multiple parents (DAG structure). The latter case has not been considered before and we show how the HMC and HSC approaches can be modified to support this setting. We compare the three approaches on 24 yeast data sets using as classification schemes MIPS’s FunCat (tree structure) and the Gene Ontology (DAG structure). We show that HMC trees outperform HSC and SC trees along three dimensions: predictive accuracy, model size, and induction time. We conclude that HMC trees should definitely be considered in HMC tasks where interpretable models are desired. 相似文献

7.

An expressive dissimilarity measure for relational clustering using neighbourhood trees

Sebastijan Dumančić Hendrik Blockeel 《Machine Learning》2017,106(9-10):1523-1545

Clustering is an underspecified task: there are no universal criteria for what makes a good clustering. This is especially true for relational data, where similarity can be based on the features of individuals, the relationships between them, or a mix of both. Existing methods for relational clustering have strong and often implicit biases in this respect. In this paper, we introduce a novel dissimilarity measure for relational data. It is the first approach to incorporate a wide variety of types of similarity, including similarity of attributes, similarity of relational context, and proximity in a hypergraph. We experimentally evaluate the proposed dissimilarity measure on both clustering and classification tasks using data sets of very different types. Considering the quality of the obtained clustering, the experiments demonstrate that (a) using this dissimilarity in standard clustering methods consistently gives good results, whereas other measures work well only on data sets that match their bias; and (b) on most data sets, the novel dissimilarity outperforms even the best among the existing ones. On the classification tasks, the proposed method outperforms the competitors on the majority of data sets, often by a large margin. Moreover, we show that learning the appropriate bias in an unsupervised way is a very challenging task, and that the existing methods offer a marginal gain compared to the proposed similarity method, and can even hurt performance. Finally, we show that the asymptotic complexity of the proposed dissimilarity measure is similar to the existing state-of-the-art approaches. The results confirm that the proposed dissimilarity measure is indeed versatile enough to capture relevant information, regardless of whether that comes from the attributes of vertices, their proximity, or connectedness of vertices, even without parameter tuning. 相似文献

8.

An inductive database system based on virtual mining views

Hendrik Blockeel Toon Calders élisa Fromont Bart Goethals Adriana Prado Céline Robardet 《Data mining and knowledge discovery》2012,24(1):247-287

Inductive databases integrate database querying with database mining. In this article, we present an inductive database system that does not rely on a new data mining query language, but on plain SQL. We propose an intuitive and elegant framework based on virtual mining views, which are relational tables that virtually contain the complete output of data mining algorithms executed over a given data table. We show that several types of patterns and models that are implicitly present in the data, such as itemsets, association rules, and decision trees, can be represented and queried with SQL using a unifying framework. As a proof of concept, we illustrate a complete data mining scenario with SQL queries over the mining views, which is executed in our system. 相似文献

9.

Guest editorial to the special issue on inductive logic programming,mining and learning in graphs and statistical relational learning

Hendrik Blockeel Karsten Borgwardt Luc De Raedt Pedro Domingos Kristian Kersting Xifeng Yan 《Machine Learning》2011,83(2):133-135

相似文献

10.

Experiment databases

Joaquin Vanschoren Hendrik Blockeel Bernhard Pfahringer Geoffrey Holmes 《Machine Learning》2012,87(2):127-158

Thousands of machine learning research papers contain extensive experimental comparisons. However, the details of those experiments are often lost after publication, making it impossible to reuse these experiments in further research, or reproduce them to verify the claims made. In this paper, we present a collaboration framework designed to easily share machine learning experiments with the community, and automatically organize them in public databases. This enables immediate reuse of experiments for subsequent, possibly much broader investigation and offers faster and more thorough analysis based on a large set of varied results. We describe how we designed such an experiment database, currently holding over 650,000 classification experiments, and demonstrate its use by answering a wide range of interesting research questions and by verifying a number of recent studies. 相似文献