1.
Naive Bayesian Classification of Structured Data (total citations: 3; self-citations: 0; citations by others: 3)
2.
Relational Instance-Based Learning with Lists and Terms (total citations: 3; self-citations: 0; citations by others: 3)
The similarity measures used in first-order instance-based learning (IBL) so far have been limited to the function-free case. In this paper we show that a lot of power can be gained by allowing lists and other terms in the input representation and designing similarity measures that work directly on these structures. We present an improved similarity measure for the first-order instance-based learner RIBL that employs the concept of edit distances to efficiently compute distances between lists and terms, discuss its computational and formal properties, and empirically demonstrate its additional power on a problem from the domain of biochemistry. The paper also includes a thorough reconstruction of RIBL's overall algorithm.
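To make the idea of an edit distance between lists concrete, here is a minimal sketch. It is not RIBL's actual measure, which the abstract does not spell out; it assumes the plain Levenshtein distance with unit costs for insertion, deletion, and substitution, and the names `list_edit_distance` and `normalized_similarity` are hypothetical.

```python
from typing import Hashable, Sequence


def list_edit_distance(a: Sequence[Hashable], b: Sequence[Hashable]) -> int:
    """Levenshtein distance between two lists, assuming unit edit costs."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into an empty list takes i deletions
        for j, y in enumerate(b, start=1):
            substitution = prev[j - 1] + (0 if x == y else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1]


def normalized_similarity(a: Sequence[Hashable], b: Sequence[Hashable]) -> float:
    """Map the distance to [0, 1]; this normalization is an assumption."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0  # two empty lists are treated as identical
    return 1.0 - list_edit_distance(a, b) / longest
```

A structure-aware learner would presumably replace the 0/1 substitution cost with a similarity between the list elements themselves; the unit cost here only keeps the sketch short.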
3.
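Demand for inductive learning over complex structures has grown rapidly in recent years. Methods for inductive learning of complex structures fall into two groups according to their knowledge representation: logic-based methods and graph-based methods. This paper reviews the history of research on inductive learning of complex structures; introduces, analyzes, and compares the learning methods under the different knowledge representations; and outlines the challenges facing future development and the key problems that need to be solved.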
4.
5.
Separate-and-Conquer Rule Learning (total citations: 9; self-citations: 0; citations by others: 9)
Johannes Fürnkranz 《Artificial Intelligence Review》1999,13(1):3-54
This paper is a survey of inductive rule learning algorithms that use a separate-and-conquer strategy. This strategy can be traced back to the AQ learning system and still enjoys popularity as can be seen from its frequent use in inductive logic programming systems. We will put this wide variety of algorithms into a single framework and analyze them along three different dimensions, namely their search, language and overfitting avoidance biases.
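The separate-and-conquer strategy can be sketched as a covering loop: learn one rule, remove the examples it covers, and repeat on the rest. The sketch below is only an illustration under stated assumptions, not the framework of the survey: examples are attribute-value dictionaries with a boolean label, rules are conjunctions of equality tests, the search is greedy on precision, and there is no pruning or other overfitting avoidance. All names (`covers`, `learn_one_rule`, `separate_and_conquer`) are hypothetical.

```python
from typing import Dict, List, Tuple

Example = Tuple[Dict[str, str], bool]  # (attribute values, is_positive)
Rule = Dict[str, str]                  # conjunction of attribute == value tests


def covers(rule: Rule, attrs: Dict[str, str]) -> bool:
    """A rule covers an example if every test in the conjunction holds."""
    return all(attrs.get(k) == v for k, v in rule.items())


def learn_one_rule(examples: List[Example]) -> Rule:
    """Conquer: greedily add tests until no negative example is covered."""
    rule: Rule = {}
    covered = list(examples)
    while any(not label for _, label in covered):
        # Candidate tests come from the positive examples still covered.
        candidates = {(k, v) for attrs, label in covered if label
                      for k, v in attrs.items() if k not in rule}
        best, best_score, best_subset = None, (-1.0, -1), covered
        for k, v in sorted(candidates):
            subset = [(a, l) for a, l in covered if a.get(k) == v]
            positives = sum(1 for _, l in subset if l)
            score = (positives / len(subset), positives)  # precision, then coverage
            if score > best_score:
                best, best_score, best_subset = (k, v), score, subset
        if best is None:
            break  # no test left to add; accept an impure rule
        rule[best[0]] = best[1]
        covered = best_subset
    return rule


def separate_and_conquer(examples: List[Example]) -> List[Rule]:
    """Learn rules until every positive example is covered by some rule."""
    rules: List[Rule] = []
    remaining = list(examples)
    while any(label for _, label in remaining):
        rule = learn_one_rule(remaining)
        rules.append(rule)
        # Separate: drop the examples covered by the new rule.
        remaining = [ex for ex in remaining if not covers(rule, ex[0])]
    return rules
```

In the survey's terms, the greedy precision heuristic is one possible search bias and the attribute-value conjunctions one possible language bias; other choices plug into the same outer loop.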
6.
7.
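Most power-system enterprises hold basic information about their annuity custodian institutions. At present, however, the time-sensitivity of the custodians' investments and market constraints keep each institution's investment information isolated, and because this information is highly confidential it can usually be exchanged only by intranet e-mail. As a result, the staff responsible for annuities cannot compare the custodians' investment directions and investment returns in a short time, and so cannot judge which custodians perform better. These e-mails often contain a large amount of key information. Based on fuzzy recognition of structured data and related algorithms, a data model is built from the pricing date, account name, cost, and market value to classify and process the important information, enabling real-time entry of investment information and accurate assessment of the custodians' returns.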
8.
Extracting Web Data Using Instance-Based Learning (total citations: 1; self-citations: 0; citations by others: 1)
This paper studies structured data extraction from Web pages. Existing approaches to data extraction include wrapper induction
and automated methods. In this paper, we propose an instance-based learning method, which performs extraction by comparing
each new instance to be extracted with labeled instances. The key advantage of our method is that it does not require an initial
set of labeled pages to learn extraction rules as in wrapper induction. Instead, the algorithm is able to start extraction
from a single labeled instance. Only when a new instance cannot be extracted does it need labeling. This avoids unnecessary
page labeling, which solves a major problem with inductive learning (or wrapper induction), i.e., the set of labeled instances
may not be representative of all other instances. The instance-based approach is very natural because structured data on the
Web usually follow some fixed templates. Pages of the same template usually can be extracted based on a single page instance
of the template. A novel technique is proposed to match a new instance with a manually labeled instance and in the process
to extract the required data items from the new instance. The technique is also very efficient. Experimental results based
on 1,200 pages from 24 diverse Web sites demonstrate the effectiveness of the method. It also significantly outperforms
existing state-of-the-art systems.
9.
This paper deals with learning first-order logic rules from data lacking an explicit classification predicate. Consequently, the learned rules are not restricted to predicate definitions as in supervised inductive logic programming. First-order logic offers the ability to deal with structured, multi-relational knowledge. Possible applications include first-order knowledge discovery, induction of integrity constraints in databases, multiple predicate learning, and learning mixed theories of predicate definitions and integrity constraints. One of the contributions of our work is a heuristic measure of confirmation, trading off novelty and satisfaction of the rule. The approach has been implemented in the Tertius system. The system performs an optimal best-first search, finding the k most confirmed hypotheses, and includes a non-redundant refinement operator to avoid duplicates in the search. Tertius can be adapted to many different domains by tuning its parameters, and it can deal either with individual-based representations by upgrading propositional representations to first-order, or with general logical rules. We describe a number of experiments demonstrating the feasibility and flexibility of our approach.