1.
Unifying Instance-Based and Rule-Based Induction (total citations: 13, self-citations: 0, citations by others: 13)
Several well-developed approaches to inductive learning now exist, but each has specific limitations that are hard to overcome. Multi-strategy learning attempts to tackle this problem by combining multiple methods in one algorithm. This article describes a unification of two widely-used empirical approaches: rule induction and instance-based learning. In the new algorithm, instances are treated as maximally specific rules, and classification is performed using a best-match strategy. Rules are learned by gradually generalizing instances until no improvement in apparent accuracy is obtained. Theoretical analysis shows this approach to be efficient. It is implemented in the RISE 3.1 system. In an extensive empirical study, RISE consistently achieves higher accuracies than state-of-the-art representatives of both its parent approaches (PEBLS and CN2), as well as a decision tree learner (C4.5). Lesion studies show that each of RISE's components is essential to this performance. Most significantly, in 14 of the 30 domains studied, RISE is more accurate than the best of PEBLS and CN2, showing that a significant synergy can be obtained by combining multiple empirical methods.
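The idea of treating instances as maximally specific rules and classifying by best match can be illustrated with a minimal sketch. This is not the RISE 3.1 implementation: the overlap distance, the tie-breaking rule, and all function names below are assumptions made for illustration only.

```python
from typing import Dict, List, Tuple

# A rule is (conditions, predicted class). An attribute absent from the
# conditions means "any value is accepted" for that attribute.
Rule = Tuple[Dict[str, str], str]


def instance_to_rule(instance: Dict[str, str], label: str) -> Rule:
    """Turn a training instance into a maximally specific rule:
    every attribute value of the instance becomes a condition."""
    return (dict(instance), label)


def distance(conditions: Dict[str, str], example: Dict[str, str]) -> int:
    """Number of conditions the example violates.
    (A simple overlap metric, assumed here; not taken from the abstract.)"""
    return sum(1 for attr, value in conditions.items()
               if example.get(attr) != value)


def classify(rules: List[Rule], example: Dict[str, str]) -> str:
    """Best-match strategy: predict the class of the nearest rule.
    Ties go to the rule that appears first (an arbitrary choice)."""
    best = min(rules, key=lambda rule: distance(rule[0], example))
    return best[1]


def generalize(rule: Rule, attr: str) -> Rule:
    """Drop one condition so that the rule covers more examples."""
    conditions = {a: v for a, v in rule[0].items() if a != attr}
    return (conditions, rule[1])


rules = [
    instance_to_rule({"color": "red", "shape": "round", "size": "medium"}, "apple"),
    instance_to_rule({"color": "yellow", "shape": "long", "size": "medium"}, "banana"),
]
query = {"color": "red", "shape": "round", "size": "small"}
print(classify(rules, query))
```

In this sketch a learner would repeatedly call `generalize` on a rule and keep the result only if accuracy on the training set does not drop, which mirrors the stopping condition described in the abstract.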
2.
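In recent years, machine learning techniques have often been used to analyze psychological data, with the aim of finding valuable patterns in the data and better characterizing and adjusting people's psychological behavior. This paper proposes a rule generation algorithm based on the twice-learning paradigm. It combines the strength of rule learning algorithms in pattern comprehensibility with the strength of high-performance algorithms such as ensemble learning and support vector machines in generalization, in order to discover accurate and comprehensible patterns in psychological data. Experiments show that the twice-learning rule generation algorithm achieves significantly better generalization than traditional rule generation algorithms, and that in many cases its output rules are also more comprehensible.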
3.
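A Rough Set-Based Rule Inference Algorithm for Incomplete Information Systems (total citations: 6, self-citations: 0, citations by others: 6)
The paper defines approximation sets under a non-symmetric similarity relation and proposes a rough set-based algorithm for inferring certain rules, which uses these approximation sets together with attribute-value pairs. The algorithm does not need to change the structure of the initial incomplete information system and can handle missing data directly. Experimental results show that the certain decision rules obtained are concise and efficient, and are independent of the missing values.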
4.
This paper addresses an important problem related to the use of induction systems in analyzing real world data. The problem is the quality and reliability of the rules generated by the systems. We discuss the significance of having a reliable and efficient rule quality measure. Such a measure can provide useful support in interpreting, ranking and applying the rules generated by an induction system. A number of rule quality and statistical measures are selected from the literature and their performance is evaluated on four sets of semiconductor data. The primary goal of this testing and evaluation has been to investigate the performance of these quality measures based on: (i) accuracy, (ii) coverage, (iii) positive error ratio, and (iv) negative error ratio of the rule selected by each measure. Moreover, the sensitivity of these quality measures to different data distributions is examined. In conclusion, we recommend Cohen's statistic as being the best quality measure examined for the domain. Finally, we explain some future work to be done in this area.
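As a rough illustration of how such measures can be computed for a single rule, the sketch below derives accuracy, coverage, and Cohen's kappa from a 2x2 contingency table. The abstract does not give its exact definitions, so the formulas for accuracy and coverage, the counts, and the function names are assumptions; only the kappa formula is the standard one.

```python
from typing import NamedTuple


class RuleCounts(NamedTuple):
    """2x2 contingency table for a rule "IF conditions THEN class C"."""
    tp: int  # covered by the rule, class is C
    fp: int  # covered by the rule, class is not C
    fn: int  # not covered, class is C
    tn: int  # not covered, class is not C


def accuracy(c: RuleCounts) -> float:
    """Share of covered examples that belong to class C (assumed definition)."""
    return c.tp / (c.tp + c.fp)


def coverage(c: RuleCounts) -> float:
    """Share of class-C examples that the rule covers (assumed definition)."""
    return c.tp / (c.tp + c.fn)


def cohen_kappa(c: RuleCounts) -> float:
    """Cohen's kappa: agreement between "rule fires" and "class is C",
    corrected for the agreement expected by chance."""
    n = c.tp + c.fp + c.fn + c.tn
    p_observed = (c.tp + c.tn) / n
    p_expected = ((c.tp + c.fp) * (c.tp + c.fn)
                  + (c.fn + c.tn) * (c.fp + c.tn)) / (n * n)
    if p_expected == 1.0:
        return 0.0  # degenerate table: chance agreement is already perfect
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical counts, not taken from the paper's semiconductor data.
counts = RuleCounts(tp=40, fp=10, fn=10, tn=40)
print(accuracy(counts), coverage(counts), cohen_kappa(counts))
```

Unlike raw accuracy, kappa discounts agreement that would arise by chance from the class distribution alone.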
5.
6.
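The paper builds a concept hierarchy tree from databases and Web logs and, building on the ideas of the FP algorithm, proposes an algorithm that mines multi-level association rules, including cross-level rules, from the concept hierarchy tree. Experimental results show that the algorithm performs considerably better than traditional algorithms and can provide customers with multi-level association recommendations and personalized e-commerce services.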
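To make the notion of multi-level and cross-level itemsets concrete, here is a minimal sketch that extends each transaction with the ancestors of its items and counts pair supports by brute force. It is not the FP-based algorithm the abstract proposes; the hierarchy, the transactions, and the function names are hypothetical, and the hierarchy is assumed to be a tree with no cycles.

```python
from collections import Counter
from itertools import combinations
from typing import Dict, FrozenSet, Iterable, List, Set


def ancestors(item: str, parent: Dict[str, str]) -> List[str]:
    """Walk up the concept hierarchy from an item to the root."""
    chain = []
    while item in parent:
        item = parent[item]
        chain.append(item)
    return chain


def extend(transaction: Iterable[str], parent: Dict[str, str]) -> Set[str]:
    """Add every ancestor of every item, so higher levels can be counted."""
    extended: Set[str] = set()
    for item in transaction:
        extended.add(item)
        extended.update(ancestors(item, parent))
    return extended


def frequent_pairs(transactions: List[List[str]], parent: Dict[str, str],
                   min_support: int) -> Dict[FrozenSet[str], int]:
    """Count item pairs at any mix of levels; skip pairs in which one item
    is an ancestor of the other, since those always co-occur trivially."""
    counts: Counter = Counter()
    for transaction in transactions:
        items = sorted(extend(transaction, parent))
        for a, b in combinations(items, 2):
            if a in ancestors(b, parent) or b in ancestors(a, parent):
                continue
            counts[frozenset((a, b))] += 1
    return {pair: n for pair, n in counts.items() if n >= min_support}


# Hypothetical two-level hierarchy (child -> parent) and transactions.
parent = {"skim milk": "milk", "whole milk": "milk",
          "wheat bread": "bread", "white bread": "bread"}
transactions = [["skim milk", "wheat bread"],
                ["whole milk", "wheat bread"],
                ["skim milk", "white bread"]]
result = frequent_pairs(transactions, parent, min_support=2)
for pair, support in result.items():
    print(sorted(pair), support)
```

Here no pair of leaf items reaches the support threshold, while the higher-level pair of milk and bread and two cross-level pairs do, which is the kind of pattern a multi-level miner is meant to surface.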
7.
Our main objective was to compare two discretization techniques, both based on cluster analysis, with a new rule induction algorithm called MLEM2, in which discretization is performed simultaneously with rule induction. The MLEM2 algorithm is an extension of the existing LEM2 rule induction algorithm. The LEM2 algorithm works correctly only for symbolic attributes and is a part of the LERS data mining system. For the two strategies, based on cluster analysis, rules were induced by the LEM2 algorithm. Our results show that MLEM2 outperformed both strategies based on cluster analysis, in terms of complexity (size of rule sets) and, more importantly, error rates.
8.
Srinivasan Ragothaman, Bijayananda Naik, Kumoli Ramakrishnan. Information Systems Frontiers, 2003, 5(4): 401-412
Artificial Intelligence (AI)-based rule induction techniques such as IXL and ID3 are powerful tools that can be used to classify firms as acquisition candidates or not, based on financial and other data. The purpose of this paper is to develop an expert system that employs uncertainty representation and predicts acquisition targets. In this paper we outline the features of IXL, a machine learning technique that we use to induce rules. We also discuss how uncertainty is handled by IXL and describe the use of confidence factors. Rules generated by IXL are incorporated into a prototype expert system, ACQTARGET, which evaluates corporate acquisitions. The use of confidence factors in ACQTARGET allows investors to specifically incorporate uncertainties into the decision making process. A set of training examples comprising 65 acquired and 65 non-acquired real world firms is used to generate the rules, and a separate holdout sample containing 32 acquired and 32 non-acquired real world firms is used to validate the expert system results. The performance of the expert system is also compared with a conventional discriminant analysis model and a logit model using the same data. The results show that the expert system, ACQTARGET, performs as well as the statistical models and is a useful evaluation tool to classify firms into acquisition and non-acquisition target categories. This rule induction technique can be a valuable decision aid to help financial analysts and investors in their buy/sell decisions.
9.
This paper has two major parts. The first is an extensive analysis of the problem of induction, and the second part is a detailed study of selective induction. Throughout the paper we integrate a number of notions, mainly from artificial intelligence, but also from pattern recognition and cognitive psychology. The result is a synthetic view which exploits uncertainty, task-guidance, and biases such as language restriction. Some of the main themes and contributions are as follows. (1) Practical induction is really a problem of efficacy and efficiency (power). (2) Search in a space of hypothetical concepts is governed by a credibility function which combines various knowledge sources in a single subjective probability or belief measure. (3) The amount of knowledge supplied by various sources can often be quantified; these sources include various biases and the learning system itself. (4) Induction is equivalent to discovery of a utility function u, which captures the purpose or goal of induction. (5) The difficulty of induction may be characterized by the form of u. Smooth or coherent functions mean selective induction, which has had the most attention in machine learning. (6) Systems for selective induction are more similar than commonly understood. By juxtaposing them we can discover similarities and improvements. (7) Our analysis suggests a number of incipient principles for powerful induction.
10.
Competition-Based Induction of Decision Models from Examples (total citations: 5, self-citations: 0, citations by others: 5)
Symbolic induction is a promising approach to constructing decision models by extracting regularities from a data set of examples. The predominant type of model is a classification rule (or set of rules) that maps a set of relevant environmental features into specific categories or values. Classifying loan risk based on borrower profiles, consumer choice from purchase data, or supply levels based on operating conditions are all examples of this type of model-building task. Although current inductive approaches, such as ID3 and CN2, perform well on certain problems, their potential is limited by the incremental nature of their search. Genetic algorithms (GA) have shown great promise on complex search domains, and hence suggest a means for overcoming these limitations. However, effective use of genetic search in this context requires a framework that promotes the fundamental model-building objectives of predictive accuracy and model simplicity. In this article we describe COGIN, a GA-based inductive system that exploits the conventions of induction from examples to provide this framework. The novelty of COGIN lies in its use of training set coverage to simultaneously promote competition in various classification niches within the model and constrain overall model complexity. Experimental comparisons with NewID and CN2 provide evidence of the effectiveness of the COGIN framework and the viability of the GA approach.