Similar literature
 20 similar documents found; search time 31 ms
1.
Credit Assignment in Rule Discovery Systems Based on Genetic Algorithms   Total citations: 5, self-citations: 0, citations by others: 5
In rule discovery systems, learning often proceeds by first assessing the quality of the system's current rules and then modifying rules based on that assessment. This paper addresses the credit assignment problem that arises when long sequences of rules fire between successive external rewards. The focus is on the kinds of rule assessment schemes which have been proposed for rule discovery systems that use genetic algorithms as the primary rule modification strategy. Two distinct approaches to rule learning with genetic algorithms have been previously reported, each approach offering a useful solution to a different level of the credit assignment problem. We describe a system, called RUDI, that exploits both approaches. We present analytic and experimental results that support the hypothesis that multiple levels of credit assignment can improve the performance of rule learning systems based on genetic algorithms.

2.
Artificial Intelligence (AI)-based rule induction techniques such as IXL and ID3 are powerful tools that can be used to classify firms as acquisition candidates or not, based on financial and other data. The purpose of this paper is to develop an expert system that employs uncertainty representation and predicts acquisition targets. We outline the features of IXL, the machine learning technique that we use to induce rules. We also discuss how uncertainty is handled by IXL and describe the use of confidence factors. Rules generated by IXL are incorporated into a prototype expert system, ACQTARGET, which evaluates corporate acquisitions. The use of confidence factors in ACQTARGET allows investors to explicitly incorporate uncertainties into the decision-making process. A set of training examples comprising 65 acquired and 65 non-acquired real-world firms is used to generate the rules, and a separate holdout sample containing 32 acquired and 32 non-acquired real-world firms is used to validate the expert system results. The performance of the expert system is also compared with a conventional discriminant analysis model and a logit model using the same data. The results show that the expert system, ACQTARGET, performs as well as the statistical models and is a useful evaluation tool for classifying firms into acquisition and non-acquisition target categories. This rule induction technique can be a valuable decision aid to help financial analysts and investors in their buy/sell decisions.

3.
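In today's big-data era, assuring data quality is a prerequisite for realizing the value of big data, and data quality assessment is an important research topic within it. Building on a rule-base approach to data quality assessment, this paper proposes an overall assessment model comprising rules, a rule base, data quality assessment indicators, assessment templates, and assessment reports. A rule assessment template is designed that combines rules from the rule base and sets each rule's weight according to the importance of the corresponding data quality indicator. The assessment method combines the simple ratio method with the weighted average method to compute the assessment result and determine the data quality grade, and data visualization is used to present the results. The approach considers both the pass rate of each individual rule and the weight of each rule within the assessment template, so that data quality is assessed fairly and accurately and the results are presented concisely and intuitively.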
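The combination of a simple ratio per rule and a weighted average across rules can be sketched as follows. This is a minimal illustration rather than the paper's implementation: the `RuleResult` fields, the example weights, and the grade thresholds in `grade` are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class RuleResult:
    name: str      # hypothetical rule identifier
    passed: int    # records that satisfy the rule
    total: int     # records the rule was checked against
    weight: float  # assumed importance of the rule within the template


def pass_rate(result: RuleResult) -> float:
    """Simple ratio: share of checked records that satisfy the rule."""
    return result.passed / result.total if result.total else 0.0


def quality_score(results: list[RuleResult]) -> float:
    """Weighted average of the per-rule pass rates."""
    total_weight = sum(r.weight for r in results)
    if total_weight == 0:
        raise ValueError("at least one rule needs a non-zero weight")
    return sum(r.weight * pass_rate(r) for r in results) / total_weight


def grade(score: float) -> str:
    """Map a score to a grade; the thresholds are illustrative assumptions."""
    if score >= 0.9:
        return "good"
    if score >= 0.7:
        return "fair"
    return "poor"


results = [
    RuleResult("non_null_id", passed=980, total=1000, weight=3.0),
    RuleResult("valid_date", passed=900, total=1000, weight=1.0),
]
score = quality_score(results)
print(round(score, 4), grade(score))
```

Because the weights are normalized inside `quality_score`, they only need to express relative importance and do not have to sum to one.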

4.
Mining Informative Rule Set for Prediction   Total citations: 2, self-citations: 0, citations by others: 2
Mining transaction databases for association rules usually generates a large number of rules, most of which are unnecessary when used for subsequent prediction. In this paper we define a rule set for a given transaction database that is much smaller than the association rule set but makes the same predictions as the association rule set by the confidence priority. We call this rule set the informative rule set. The informative rule set is not constrained to particular target items, and it is smaller than the non-redundant association rule set. We characterise relationships between the informative rule set and the non-redundant association rule set. We present an algorithm that directly generates the informative rule set without first generating all frequent itemsets and that accesses the database less often than other direct methods. We show experimentally that the informative rule set is much smaller and can be generated more efficiently than both the association rule set and the non-redundant association rule set.

5.
Rule induction is an important part of learning in expert systems. Rules can help managers make more effective decisions and gain insight into the relationships between decision variables. We present a logic-based approach to rule induction in expert systems which is simple, robust and consistent. We also derive bounds on levels of certainty for combining rules. We apply our approach to the development of rules for the entry decisions of new products. We then discuss how the logic-based approach of rule induction can be used to create a decision support system and the methodology to create such a system.

6.
As today's manufacturing domain becomes more and more knowledge-intensive, knowledge-based systems (KBS) are widely applied in predictive maintenance to detect and predict anomalies in machines and machine components. Within a KBS, decision rules are a comprehensive and interpretable tool for classification and knowledge discovery from data. However, when the decision rules incorporated in a KBS are extracted from heterogeneous sources, they may suffer from several rule quality issues, which weaken the performance of the KBS. To address this issue, we propose a rule base refinement approach that takes rule quality measures into account. The proposed approach is based on a rule integration method for integrating expert rules with rules obtained from data mining. Within the integration process, rule accuracy, coverage, redundancy, conflict, and subsumption are the quality measures that we use to refine the rule base. A case study on a real-world data set shows the approach in detail.

7.
Efficient Adaptive-Support Association Rule Mining for Recommender Systems   Total citations: 25, self-citations: 0, citations by others: 25
Collaborative recommender systems allow personalization for e-commerce by exploiting similarities and dissimilarities among customers' preferences. We investigate the use of association rule mining as an underlying technology for collaborative recommender systems. Association rules have been used with success in other domains. However, most currently existing association rule mining algorithms were designed with market basket analysis in mind. Such algorithms are inefficient for collaborative recommendation because they mine many rules that are not relevant to a given user. Also, it is necessary to specify the minimum support of the mined rules in advance, often leading to either too many or too few rules; this negatively impacts the performance of the overall system. We describe a collaborative recommendation technique based on a new algorithm specifically designed to mine association rules for this purpose. Our algorithm does not require the minimum support to be specified in advance. Rather, a target range is given for the number of rules, and the algorithm adjusts the minimum support for each user in order to obtain a ruleset whose size is in the desired range. Rules are mined for a specific target user, reducing the time required for the mining process. We employ associations between users as well as associations between items in making recommendations. Experimental evaluation of a system based on our algorithm reveals performance that is significantly better than that of traditional correlation-based approaches.
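The idea of adjusting the minimum support until the rule count falls into a target range can be illustrated with a simple bisection. This sketch is not the algorithm from the paper, whose details the abstract does not give. It assumes a caller-supplied function `count_rules(min_support)` (hypothetical) whose result never increases as the support threshold rises; `toy_count` is a stand-in for a real miner.

```python
from typing import Callable, Optional


def adapt_min_support(
    count_rules: Callable[[float], int],
    target_min: int,
    target_max: int,
    max_iter: int = 30,
) -> Optional[float]:
    """Bisect on the support threshold until the rule count is in range.

    Returns a suitable threshold, or None if none is found within
    max_iter steps (for example when the range cannot be hit at all).
    """
    low, high = 0.0, 1.0
    for _ in range(max_iter):
        mid = (low + high) / 2
        n_rules = count_rules(mid)
        if n_rules > target_max:
            low = mid    # too many rules: raise the threshold
        elif n_rules < target_min:
            high = mid   # too few rules: lower the threshold
        else:
            return mid
    return None


def toy_count(min_support: float) -> int:
    """Stand-in for a real miner: fewer rules as the threshold rises."""
    return int(100 * (1 - min_support))


support = adapt_min_support(toy_count, target_min=10, target_max=20)
print(support, toy_count(support))
```

In a per-user setting such as the one described above, `count_rules` would mine rules for one target user, so each user can end up with a different threshold.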

8.
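This paper analyzes existing associative classification algorithms, summarizes the shortcomings of typical association-rule-based classification, and proposes a new method for building a classifier using association rule mining. The method addresses two problems of traditional algorithms: they generate too many rules, and the resulting classification models are hard to understand.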

9.
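Internet of Things (IoT) data is characterized by large volume and strong real-time requirements. Processing IoT data with complex event processing requires complex rules, and those rules often change as the business changes. The Drools rule engine allows rules to be defined in separate configuration files, so that data or things matching the rules can be filtered without modifying device data or management platform code. To address the redundancy caused by repeated matching when data is filtered under the DRL rule file architecture and the decision table file architecture, a data query method based on a corrected database is designed. The query performance of the DRL rule file architecture, the decision table file architecture, and the corrected database architecture is analyzed under query environments with different data volumes and different numbers of rules. Experimental results show that the corrected database architecture takes less time when querying large data volumes and effectively reduces redundancy.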

10.
This paper addresses an important problem related to the use of induction systems in analyzing real world data. The problem is the quality and reliability of the rules generated by the systems. We discuss the significance of having a reliable and efficient rule quality measure. Such a measure can provide useful support in interpreting, ranking and applying the rules generated by an induction system. A number of rule quality and statistical measures are selected from the literature and their performance is evaluated on four sets of semiconductor data. The primary goal of this testing and evaluation has been to investigate the performance of these quality measures based on: (i) accuracy, (ii) coverage, (iii) positive error ratio, and (iv) negative error ratio of the rule selected by each measure. Moreover, the sensitivity of these quality measures to different data distributions is examined. In conclusion, we recommend Cohen's statistic as being the best quality measure examined for the domain. Finally, we explain some future work to be done in this area.
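Measures of this kind are typically computed from a 2x2 contingency table of rule coverage against class membership. The sketch below is illustrative only: the abstract does not define its measures, so rule accuracy is taken here as the share of covered examples that belong to the class, coverage as the share of class examples the rule covers, and Cohen's statistic as the standard kappa formula. The counts are invented.

```python
from dataclasses import dataclass


@dataclass
class RuleTable:
    tp: int  # covered by the rule, in the target class
    fp: int  # covered by the rule, not in the class
    fn: int  # not covered, in the class
    tn: int  # not covered, not in the class

    @property
    def n(self) -> int:
        return self.tp + self.fp + self.fn + self.tn


def rule_accuracy(t: RuleTable) -> float:
    """Share of covered examples that belong to the class (assumed definition)."""
    covered = t.tp + t.fp
    return t.tp / covered if covered else 0.0


def rule_coverage(t: RuleTable) -> float:
    """Share of class examples that the rule covers (assumed definition)."""
    in_class = t.tp + t.fn
    return t.tp / in_class if in_class else 0.0


def cohen_kappa(t: RuleTable) -> float:
    """Agreement between rule and class, corrected for chance agreement."""
    observed = (t.tp + t.tn) / t.n
    expected = (
        (t.tp + t.fp) * (t.tp + t.fn) + (t.fn + t.tn) * (t.fp + t.tn)
    ) / (t.n * t.n)
    if expected == 1.0:
        return 0.0  # degenerate table: chance agreement is already total
    return (observed - expected) / (1 - expected)


table = RuleTable(tp=40, fp=10, fn=20, tn=30)  # invented counts
print(rule_accuracy(table), rule_coverage(table), cohen_kappa(table))
```

Unlike accuracy or coverage alone, kappa discounts the agreement a rule would achieve by chance given the class distribution, which is one reason a measure of this type can behave differently across data distributions.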

11.
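To build a real-time rule base for intelligent analysis of distribution network monitoring information, a machine-learning-based construction method is proposed. All distribution network monitoring rule headers in the rule base are sorted and set as the main chain, and the rules are loaded into linked lists to form a rule set, ensuring that every monitoring information data packet has a matching analysis rule. A machine-learning-based classification method for distribution network fault data identifies fault data in the monitoring information and extracts the frequent itemsets of the fault data. A MapReduce-based parallel incremental update algorithm for association rules updates the intelligent analysis rules in the rule base, keeping them up to date in real time. Experimental results show that the proposed method achieves mean recognition accuracy and detection rate above 0.97 with a false positive rate of 0.01, can promptly identify fault information detected in real time by the distribution network monitoring system, and keeps the updating of the intelligent analysis rules real-time.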

12.
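A New Multidimensional Association Rule Mining Algorithm   Total citations: 12, self-citations: 0, citations by others: 12
Association rule mining is an important topic in data mining. This paper presents a multidimensional association rule mining algorithm that combines a genetic algorithm with an ant algorithm. The new algorithm exploits the good global search ability shared by the two, and overcomes the weak local search ability of the genetic algorithm and the slow search speed of the ant algorithm. Experimental results show that the new algorithm performs well when mining sparse multidimensional association rules.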

13.
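Association rule pruning algorithms based on the support-confidence model mine many uninteresting rules. To address this problem, an association rule pruning algorithm guided by positive correlation is proposed. All-confidence and lift are used to construct a positive-correlation evaluation function, which is then used to prune frequent itemsets. Experimental results show that the algorithm reduces the number of uninteresting association rules, improves the quality of the mining results, and shortens mining time.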
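The two ingredients named above, all-confidence and lift, have standard definitions that are easy to compute from transaction data. The sketch below shows only those two measures; how the paper combines them into its evaluation function is not stated in the abstract, so no combined function is attempted here. The transactions are invented.

```python
from typing import Iterable


def support(itemset: Iterable[str], transactions: list[set[str]]) -> float:
    """Fraction of transactions that contain every item of the itemset."""
    items = frozenset(itemset)
    return sum(items <= t for t in transactions) / len(transactions)


def all_confidence(itemset: Iterable[str], transactions: list[set[str]]) -> float:
    """supp(X) divided by the largest single-item support within X."""
    items = frozenset(itemset)
    max_item_support = max(support({i}, transactions) for i in items)
    if max_item_support == 0:
        return 0.0
    return support(items, transactions) / max_item_support


def lift(antecedent: set[str], consequent: set[str],
         transactions: list[set[str]]) -> float:
    """supp(A and B) / (supp(A) * supp(B)); above 1 means positive correlation."""
    denominator = support(antecedent, transactions) * support(consequent, transactions)
    if denominator == 0:
        return 0.0
    return support(antecedent | consequent, transactions) / denominator


transactions = [{"a", "b"}, {"a", "b", "c"}, {"a"}, {"c"}]  # invented data
print(all_confidence({"a", "b"}, transactions))
print(lift({"a"}, {"b"}, transactions))
```

A pruning step built on these measures would discard itemsets whose scores fall below chosen thresholds; the threshold values would be a tuning decision and are not given in the source.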

14.
Logic programs can be evaluated bottom-up by repeatedly applying all rules, in "iterations", until the fixpoint is reached. However, it is often desirable (and in some cases, e.g. programs with stratified negation, even necessary to guarantee the semantics) to apply the rules in some order. We present two algorithms that apply rules in a specified order without repeating inferences. One of them (GSN) is capable of dealing with a wide range of rule orderings, but with a little more overhead than the well-known seminaive algorithm (which we call BSN). The other (PSN) handles a smaller class of rule orderings, but with no overheads beyond those in BSN. We also demonstrate that by choosing a good ordering, we can reduce the number of rule applications (and thus the number of joins). We present a theoretical analysis of rule orderings and identify orderings that minimize the number of rule applications (for all possible instances of the base relations) with respect to a class of orderings called fair orderings. We also show that though nonfair orderings may do a little better on some data sets, they can do much worse on others. The analysis is supplemented by performance results.
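For readers unfamiliar with the baseline, the following is a minimal sketch of basic seminaive evaluation (the algorithm the abstract calls BSN) for the textbook transitive-closure program `path(X,Z) :- edge(X,Z)` and `path(X,Z) :- path(X,Y), edge(Y,Z)`. It does not implement GSN or PSN and does not model rule orderings; the function and variable names are chosen for this example.

```python
def transitive_closure(edges: set[tuple[int, int]]) -> set[tuple[int, int]]:
    """Seminaive bottom-up evaluation of the path/edge program.

    Each iteration joins only the facts derived in the previous
    iteration (delta) with edge, so earlier inferences are not repeated.
    """
    # Index edge by its first argument to make the join cheap.
    successors: dict[int, set[int]] = {}
    for source, target in edges:
        successors.setdefault(source, set()).add(target)

    path = set(edges)   # facts from the non-recursive rule
    delta = set(edges)  # facts that are new in the latest iteration
    while delta:
        new_facts = set()
        for x, y in delta:
            for z in successors.get(y, ()):
                if (x, z) not in path:
                    new_facts.add((x, z))
        path |= new_facts
        delta = new_facts  # fixpoint is reached when nothing new appears
    return path


closure = transitive_closure({(1, 2), (2, 3), (3, 4)})
print(sorted(closure))
```

A naive evaluator would instead join all of `path` with `edge` in every iteration, rederiving the same facts repeatedly; restricting the join to `delta` is what removes that repetition.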

15.
Rule induction for uncertain data   Total citations: 1, self-citations: 1, citations by others: 0
Data uncertainty is common in real-world applications and can be caused by many factors such as imprecise measurements, network latency, outdated sources and sampling errors. When mining knowledge from these applications, data uncertainty needs to be handled with caution. Otherwise, unreliable or even wrong mining results would be obtained. In this paper, we propose a rule induction algorithm, called uRule, to learn rules from uncertain data. The key problem in learning rules is to efficiently identify the optimal cut points from training data. For uncertain numerical data, we propose an optimization mechanism which merges adjacent bins that have equal class distributions, and we prove its soundness. For uncertain categorical data, we also propose a new method to select cut points based on possible world semantics. We then present the uRule algorithm in detail. Our experimental results show that the uRule algorithm can generate rules from uncertain numerical data with potentially higher accuracies, and the proposed optimization method is effective in the cut point selection for both certain and uncertain numerical data. Furthermore, uRule has quite stable performance when mining uncertain categorical data.

16.
Association rule mining has contributed to many advances in the area of knowledge discovery. However, the quality of the discovered association rules is a big concern and has drawn more and more attention recently. One problem with the quality of the discovered association rules is the huge size of the extracted rule set. Often for a dataset, a huge number of rules can be extracted, but many of them can be redundant to other rules and thus useless in practice. Mining non-redundant rules is a promising approach to solve this problem. In this paper, we first propose a definition for redundancy, then propose a concise representation, called a Reliable basis, for representing non-redundant association rules. The Reliable basis contains a set of non-redundant rules which are derived using frequent closed itemsets and their generators instead of using frequent itemsets that are usually used by traditional association rule mining approaches. An important contribution of this paper is that we propose to use the certainty factor as the criterion to measure the strength of the discovered association rules. Using this criterion, we can ensure the elimination of as many redundant rules as possible without reducing the inference capacity of the remaining extracted non-redundant rules. We prove that the redundancy elimination, based on the proposed Reliable basis, does not reduce the strength of belief in the extracted rules. We also prove that all association rules, their supports and confidences, can be retrieved from the Reliable basis without accessing the dataset. Therefore the Reliable basis is a lossless representation of association rules. Experimental results show that the proposed Reliable basis can significantly reduce the number of extracted rules. We also conduct experiments on the application of association rules to the area of product recommendation. The experimental results show that the non-redundant association rules extracted using the proposed method retain the same inference capacity as the entire rule set. This result indicates that the non-redundant rules alone are sufficient to solve real problems, without needing the entire rule set.
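The certainty factor of a rule A → B is commonly defined in the association rule literature by comparing the rule's confidence with the prior support of the consequent. The sketch below uses that common definition as an assumption; the abstract does not spell out the exact formula the paper uses.

```python
def certainty_factor(confidence: float, consequent_support: float) -> float:
    """Certainty factor of A -> B from conf(A -> B) and supp(B).

    Positive when knowing A raises belief in B, negative when it lowers
    it, and zero when A and B are independent. This is the commonly
    used definition, assumed here rather than taken from the paper.
    """
    if confidence > consequent_support:
        return (confidence - consequent_support) / (1 - consequent_support)
    if confidence < consequent_support:
        return (confidence - consequent_support) / consequent_support
    return 0.0


print(certainty_factor(0.8, 0.5))   # A makes B more likely
print(certainty_factor(0.25, 0.5))  # A makes B less likely
print(certainty_factor(0.5, 0.5))   # independent
```

The value ranges from -1 to 1, so unlike raw confidence it separates rules that genuinely strengthen belief in the consequent from rules whose confidence is high only because the consequent is frequent.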

17.
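An Optimal Class Association Rule Algorithm   Total citations: 1, self-citations: 0, citations by others: 1
Classification and association rule discovery are two important areas of data mining. Using an association rule algorithm to mine classification rules, known as a class association rule algorithm, is a promising approach. This paper proposes an optimal class association rule algorithm, OCARA. The algorithm mines classification rules with an optimal association rule mining algorithm and sorts the optimal rule set to obtain a classifier with high classification accuracy. OCARA is compared experimentally with the traditional classification algorithm C4.5 and the general class association rule algorithms CBA and RMR on 8 UCI datasets. The results show that OCARA performs better, demonstrating that it is an effective class association rule mining algorithm.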

18.
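Knowledge rules for evaluating urban public transport service quality are an important basis on which urban public transport enterprises evaluate service quality; sound, well-founded evaluation rules make the evaluation fairer and more objective. Based on an analysis of the evaluation indicator system for urban public transport service quality, this paper applies an improved genetic algorithm to mining such knowledge rules, proposes a genetic-algorithm-based mining method for them, and describes how the algorithm is implemented. A worked example shows that the method is feasible and effective for knowledge rule mining.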

19.
We present ELEM2, a machine learning system that induces classification rules from a set of data based on a heuristic search over a hypothesis space. ELEM2 is distinguished from other rule induction systems in three aspects. First, it uses a new heuristic function to guide the heuristic search. The function reflects the degree of relevance of an attribute-value pair to a target concept and leads to selection of the most relevant pairs for formulating rules. Second, ELEM2 handles inconsistent training examples by defining an unlearnable region of a concept based on the probability distribution of that concept in the training data. The unlearnable region is used as a stopping criterion for the concept learning process, which resolves conflicts without removing inconsistent examples. Third, ELEM2 employs a new rule quality measure in its post-pruning process to prevent rules from overfitting the data. The rule quality formula measures the extent to which a rule can discriminate between the positive and negative examples of a class. We describe features of ELEM2, its rule induction algorithm and its classification procedure. We report experimental results that compare ELEM2 with C4.5 and CN2 on a number of datasets.

20.
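For recovering secure strings in identity authentication mechanisms, combining a dictionary with transformation rules is a common method. Applying transformation rules quickly generates a large number of targeted new strings for verification. However, rule processing is complex and places high demands on processing performance and system power consumption, and existing tools and research all process rules in software, which makes it hard to meet the needs of practical recovery systems. This paper therefore proposes a rule processor based on a heterogeneous computing platform. For the first time, reconfigurable FPGA hardware is used to accelerate rule processing, while an ARM general-purpose core handles configuration, management, and monitoring of the rule processing; the design is implemented on a Xilinx Zynq XC7Z030 chip. Experimental results show that, in typical cases, the hybrid-architecture rule processor improves performance by a factor of 214 over using the ARM general-purpose core alone and outperforms an Intel i7-6700 CPU; its performance per watt is a 1.4x to 2.1x improvement over an NVIDIA GeForce GTX 1080 Ti GPU and a 70x improvement over the CPU, effectively improving the speed and energy efficiency of rule processing. The experimental data show that a hardware-accelerated rule processor on a heterogeneous computing platform effectively addresses the speed and energy-efficiency problems of rule processing, can meet practical engineering needs, and lays the foundation for the design of a complete secure string recovery system.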
