Abstract: The concept of fuzzy sets is one of the most fundamental and influential tools in the development of computational intelligence. In this paper, the fuzzy pincer search algorithm is proposed. It generates fuzzy association rules by combining top-down and bottom-up approaches. A fuzzy grid representation is used to reduce the number of database scans, and the algorithm trims the number of candidate fuzzy grids at each level. It has been observed that fuzzy association rules provide a more realistic visualization of the knowledge extracted from databases.

Association rule mining is one of the most popular data analysis methods for discovering associations within data. Because of their practical usefulness, association rule mining algorithms have been applied to a wide variety of datasets. Little attention has been paid, however, to how association mining techniques can be applied to questionnaire data. This paper therefore first identifies the various data types that may appear in a questionnaire. We then introduce the questionnaire data mining problem and define the rule patterns that can be mined from questionnaire data. A unified approach based on fuzzy techniques is developed so that all the different data types can be handled in a uniform manner. After that, an algorithm is developed to discover fuzzy association rules from the questionnaire dataset. Finally, we evaluate the performance of the proposed algorithm; the results indicate that our method is capable of finding interesting association rules that would never have been found by previous mining algorithms.

Mining fuzzy association rules from uncertain data   Total citations: 3 (self-citations: 3, citations by others: 0)
Association rule mining is an important data analysis method that can discover associations within data. Numerous previous studies focus on finding fuzzy association rules from precise and certain data. Unfortunately, real-world data tends to be uncertain because of human errors, instrument errors, recording errors, and so on. The question that immediately arises is how to mine fuzzy association rules from uncertain data. To this end, this paper proposes a scheme for representing uncertain data. The representation is based on possibility distributions, because possibility theory establishes a close connection between the concepts of similarity and uncertainty and so provides an excellent framework for handling uncertain data. We then develop an algorithm to mine fuzzy association rules from uncertain data represented by possibility distributions. Experimental results on survey data show that the proposed approach can discover interesting and valuable patterns with high certainty.

Data mining is most commonly used in attempts to induce association rules from databases, which can help decision-makers analyze the data easily and make good decisions in the domains concerned. Different studies have proposed methods for mining association rules from databases with crisp values. However, the data in many real-world applications have a certain degree of imprecision. In this paper we address this problem and propose a new data-mining algorithm for extracting interesting knowledge from databases with imprecise data. The proposed algorithm integrates imprecise data concepts with the fuzzy apriori mining algorithm to find interesting fuzzy association rules in given databases. Experiments on diagnosing dyslexia in early childhood were carried out to verify the performance of the proposed algorithm.

In association rule mining, the trade-off between avoiding harmful spurious rules and preserving authentic ones is an ever-critical barrier to obtaining reliable and useful results. The statistically sound technique for evaluating the statistical significance of association rules is superior in preventing spurious rules, yet it can also cause a severe loss of true rules in the presence of data error. This study presents a new and improved method for statistical testing of association rules with uncertain, erroneous data. An original mathematical model was established to describe how data error propagates through the computational procedures of the statistical test. Based on the error model, a scheme combining analytic and simulative processes was designed to correct the statistical test for distortions caused by data error. Experiments on both synthetic and real-world data show that the method significantly recovers the true rules lost to data error in the original statistically sound method (that is, it reduces type-2 error). Meanwhile, the new method maintains effective control over the familywise error rate, which is the distinctive advantage of the original statistically sound technique. Furthermore, the method is robust against inaccurate information about data error probabilities and against situations that do not fulfil the commonly accepted assumption of independent error probabilities for different data items. The method is particularly effective for rules that are the most practically meaningful yet sensitive to data error. The method is promising for enhancing the value of association rule mining results and helping users make correct decisions.

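Mining classification association rules from Web logs   Total citations: 1 (self-citations: 0, citations by others: 1)
User classification is an important task in research on Web access pattern mining. This paper proposes a method that applies associative classification to classify Web users: the Web log files are first preprocessed to obtain a training transaction dataset, classification association rules are then mined from this transaction set, and the mined rule set is used to build a classifier, so that users are classified according to their access history.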

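With fuzzy concepts and fuzzy operations, time intervals are easy to describe. For a specified calendar pattern, different time intervals can carry different weights according to their membership degrees. Building on fuzzy calendar algebra and combining the ideas of incremental mining and progressive counting, this paper proposes a fuzzy-calendar-based method for mining fuzzy temporal association rules. Both theoretical analysis and experimental results indicate that the algorithm is efficient and feasible.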

In this work we develop techniques for discovering patterns with periodicity. Patterns with periodicity are those that occur at regular time intervals, so the problem has two aspects: finding the pattern and determining the periodicity. The difficulty of the task lies in discovering these regular time intervals, i.e., the periodicity. Periodicities in a database are usually not very precise, are subject to disturbances, and may occur at time intervals in multiple time granularities. To overcome these difficulties and discover patterns with fuzzy periodicity, we propose the fuzzy periodic calendar, which defines fuzzy periodicities. Furthermore, we develop algorithms for mining fuzzy periodicities and the fuzzy periodic association rules within them. Experimental results show that our method is effective in discovering fuzzy periodic association rules.

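Mining high-confidence association rules   Total citations: 3 (self-citations: 1, citations by others: 2)
Traditional association rules and utility-based association rules overlook some rules whose support or utility is not high but whose confidence is very high. Such high-confidence rules can help people meet their expectations of avoiding risk and raising success rates. To mine these low-support (or low-utility), high-confidence rules, the HCARM algorithm is proposed. HCARM uses a partitioning approach to handle large datasets and a new pruning strategy to compress the search space. In addition, by setting a length threshold minlen, HCARM is made suitable for long-pattern mining. Experimental results show that the method is effective for high-confidence long patterns.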

Temporal data mining is still an important research topic, since there are application areas that need knowledge from temporal data, such as sequential patterns, similar time sequences, and cyclic and temporal association rules. Although there are many studies on temporal data mining, they do not deal with discovering knowledge from temporal interval data such as patient histories, purchaser histories, and web logs. We propose a new temporal data mining technique that can extract temporal interval relation rules from temporal interval data by using Allen’s theory. It consists of a preprocessing algorithm designed for the generalization of temporal interval data and a temporal relation algorithm for mining temporal relation rules from the generalized temporal interval data. This technique can provide more useful knowledge than conventional data mining techniques.
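
The relations of Allen's interval algebra, on which the technique above builds, can be illustrated with a minimal Python sketch. It only classifies the relation between two closed intervals given as (start, end) pairs with start < end; the function name `allen_relation` and the relation labels are illustrative assumptions, and the sketch is not the paper's preprocessing or rule mining algorithm.

```python
def allen_relation(a, b):
    """Return the Allen relation of interval a with respect to interval b.

    Each interval is a (start, end) pair with start < end (an assumption
    of this sketch; degenerate intervals are not handled).
    """
    (s1, e1), (s2, e2) = a, b
    if e1 < s2:
        return "before"
    if e2 < s1:
        return "after"
    if e1 == s2:
        return "meets"
    if e2 == s1:
        return "met-by"
    if s1 == s2 and e1 == e2:
        return "equals"
    if s1 == s2:
        return "starts" if e1 < e2 else "started-by"
    if e1 == e2:
        return "finishes" if s1 > s2 else "finished-by"
    if s2 < s1 and e1 < e2:
        return "during"
    if s1 < s2 and e2 < e1:
        return "contains"
    # Only the two partial-overlap cases remain at this point.
    return "overlaps" if s1 < s2 else "overlapped-by"
```

For example, a hospital stay recorded as (1, 4) and a treatment recorded as (2, 6) would be classified as "overlaps", which is the kind of pairwise relation a temporal interval rule could then be built on.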

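Thanks to its many advantages, XML has rapidly become the standard for representing and exchanging information. The emergence of large volumes of XML data poses new challenges for data mining. Traditional association rule mining is based on relational databases, so many existing methods for mining association rules from XML data rely on a relational database to some degree, that is, they map the XML documents into a relational database. After a careful study of the access interfaces for XML data, this paper presents a class interface, based on the Apriori algorithm, that mines association rules directly from XML documents, implemented in C# on the .NET platform.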

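Research on algorithms for mining association rules in vector spatial databases   Total citations: 2 (self-citations: 0, citations by others: 2)
Based on the characteristics of vector spatial data and the requirements of spatial data mining, and using GIS spatial analysis and spatial data processing as tools, this paper discusses data processing methods for mining association rules in vector spatial databases and proposes an association rule mining algorithm. Finally, the approach is verified with an example.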

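Mining dynamic association rules in databases   Total citations: 7 (self-citations: 0, citations by others: 7)
Association rules can uncover interdependencies among variables, but they cannot reflect how the rules themselves change. This paper therefore proposes dynamic association rules. The whole dataset to be mined is first divided by time into several subsets; each rule mined from each subset yields one support value and one confidence value, so that over the full dataset each rule corresponds to a support vector and a confidence vector. Analyzing these two vectors makes it possible not only to find how a rule changes over time but also to predict its trend. The paper also proposes two algorithms for mining dynamic association rules and compares them, and gives two methods, histograms and time series, for analyzing the two vectors. Finally, an application example of mining dynamic association rules is given.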
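
The idea of a per-rule support vector and confidence vector over time-partitioned data can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not either of the paper's two algorithms: the function name `support_confidence_vectors` is hypothetical, and each subset's support is computed relative to that subset's own size, which is an assumption since the abstract does not give the exact definition.

```python
def support_confidence_vectors(partitions, antecedent, consequent):
    """Compute one support and one confidence value per time partition.

    partitions: list of time-ordered subsets, each a list of transactions
                (each transaction is an iterable of items).
    Returns (supports, confidences), two lists with one entry per partition.
    Assumption: support is measured relative to the size of each partition.
    """
    antecedent, consequent = set(antecedent), set(consequent)
    both = antecedent | consequent
    supports, confidences = [], []
    for transactions in partitions:
        n = len(transactions)
        n_ante = sum(1 for t in transactions if antecedent <= set(t))
        n_both = sum(1 for t in transactions if both <= set(t))
        supports.append(n_both / n if n else 0.0)
        confidences.append(n_both / n_ante if n_ante else 0.0)
    return supports, confidences


# Two hypothetical time slices of a transaction database.
partitions = [
    [{"a", "b"}, {"a"}],
    [{"a", "b"}, {"a", "b"}, {"c"}],
]
supports, confidences = support_confidence_vectors(partitions, {"a"}, {"b"})
```

Plotting `supports` and `confidences` against the partition index gives the kind of histogram or time-series view the abstract describes for following a rule's trend.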

Mining dynamic association rules with comments   Total citations: 2 (self-citations: 2, citations by others: 0)
In this paper, we study a new problem of mining dynamic association rules with comments (DAR-C for short). A DAR-C contains not only the rule itself but also comments that specify when to apply the rule. To formalize this problem, we first present a method for expressing candidate effective time slots and then propose several definitions concerning DAR-C. Subsequently, two algorithms, ITS2 and EFP-Growth2, are developed for mining DAR-C. ITS2 is an improved two-stage dynamic association rule mining algorithm, while EFP-Growth2 is based on the EFP-tree structure and is suitable for mining high-density mass data. Extensive experimental results demonstrate the efficiency and scalability of the two proposed algorithms on DAR-C mining tasks, and their practicability on a real retail dataset.

In this paper, we examine a new data mining issue: mining association rules from customer databases and transaction databases. The problem is decomposed into two subproblems: identifying all the large itemsets in the transaction database, and mining association rules from the customer database and the large itemsets identified. For the first subproblem, we propose an efficient algorithm to discover all the large itemsets in the transaction database. Experimental results show that our approach reduces the total execution time significantly. For the second subproblem, a relationship graph is constructed from the large itemsets identified in the transaction database and the priorities of the condition attributes in the customer database. Based on the relationship graph, we present an efficient graph-based algorithm to discover interesting association rules embedded in the transaction database and the customer database.

Data mining extracts implicit, previously unknown, and potentially useful information from databases. Many approaches have been proposed to extract information, and one of the most important is finding association rules. Although a large amount of research has been devoted to this subject, none of it finds association rules from directed acyclic graph (DAG) data. Without such a mining method, the hidden knowledge, if any, cannot be discovered from databases storing DAG data such as family genealogy profiles, product structures, XML documents, task precedence relations, and course structures. In this article, we define a new kind of association rule in DAG databases called the predecessor–successor rule, where a node x is a predecessor of another node y if we can find a path in the DAG on which x appears before y. Predecessor–successor rules enable us to observe how the characteristics of the predecessors influence the successors. A four-stage approach is proposed to discover the predecessor–successor rules. © 2006 Wiley Periodicals, Inc. Int J Int Syst 21: 621–637, 2006.
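
The predecessor definition above (x is a predecessor of y if some path in the DAG has x before y) amounts to a reachability test, which can be sketched in Python as follows. The adjacency-dictionary representation and the name `is_predecessor` are assumptions made for illustration; the sketch covers only the definition, not the four-stage mining approach.

```python
def is_predecessor(graph, x, y):
    """Return True if node x is a predecessor of node y in a DAG.

    graph: dict mapping each node to a list of its direct successors
           (a hypothetical representation chosen for this sketch).
    x is a predecessor of y when y is reachable from x along edges.
    """
    stack = list(graph.get(x, []))
    seen = set()
    while stack:
        node = stack.pop()
        if node == y:
            return True
        if node not in seen:
            seen.add(node)
            stack.extend(graph.get(node, []))
    return False


# A small hypothetical course structure: "intro" is required for both
# "algebra" and "logic", and each of those is required for "thesis".
courses = {
    "intro": ["algebra", "logic"],
    "algebra": ["thesis"],
    "logic": ["thesis"],
    "thesis": [],
}
```

Here `is_predecessor(courses, "intro", "thesis")` holds through either intermediate course, while "algebra" and "logic" are not predecessors of each other.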

Mining multiple-level association rules in large databases   Total citations: 2 (self-citations: 0, citations by others: 2)
A top-down progressive deepening method is developed for efficient mining of multiple-level association rules from large transaction databases based on the a priori principle. A group of variant algorithms is proposed based on the ways of sharing intermediate results, and their relative performance is tested and analyzed. The enforcement of different interestingness measurements to find more interesting rules, and the relaxation of rule conditions for finding “level-crossing” association rules, are also investigated. The study shows that efficient algorithms can be developed for discovering interesting and strong multiple-level association rules from large databases.

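Gong Yu (宫雨), Computer Engineering and Design (《计算机工程与设计》), 2007, 28(24): 5838-5840
Constrained association rules are an important problem in association rule research. Most current work concentrates on single-variable constraints, and two-variable constraints have received little study, although they also play an important role in practice. To address this, the problem of association rules with a lower-bound constraint among two-variable constraints is posed. On this basis, a definition of the lower-bound constraint is given, the properties of frequent itemsets satisfying the lower-bound constraint are analyzed, and the corresponding proofs are provided. Finally, an FP-Tree-based lower-bound constraint algorithm is proposed; it uses pre-testing to reduce the number of itemsets that must be tested and the computational cost. Experimental results show that the algorithm is efficient.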

Mining fuzzy association rules for classification problems   Total citations: 3 (self-citations: 0, citations by others: 3)
Effective data mining techniques for discovering knowledge from training samples for classification problems in industrial engineering are necessary in applications such as group technology. This paper proposes a learning algorithm, which can be viewed as a knowledge acquisition tool, to effectively discover fuzzy association rules for classification problems. The consequent part of each rule is one class label. The proposed learning algorithm consists of two phases: one generates large fuzzy grids from training samples by fuzzy partitioning of each attribute, and the other generates fuzzy association rules for classification problems from the large fuzzy grids. The algorithm scans the training samples stored in a database only once and applies a sequence of Boolean operations to generate fuzzy grids and fuzzy rules; it can therefore be easily extended to discover other types of fuzzy association rules. Simulation results on the iris data demonstrate that the proposed learning algorithm can effectively derive fuzzy association rules for classification problems.
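
To make the notion of a fuzzy grid concrete, the following Python sketch partitions each attribute into evenly spaced triangular fuzzy sets and computes a fuzzy support for one grid. It rests on assumptions the abstract does not state: the triangular membership shape, the product of memberships as a sample's degree of belonging to a grid, and the mean of those degrees as the support. The names `triangular_membership` and `fuzzy_support` are hypothetical, and this is not the paper's single-scan Boolean procedure.

```python
def triangular_membership(x, i, k, lo, hi):
    """Membership of x in the i-th (0-based) of k evenly spaced
    triangular fuzzy sets on the range [lo, hi]; requires k >= 2."""
    width = (hi - lo) / (k - 1)
    center = lo + i * width
    return max(0.0, 1.0 - abs(x - center) / width)


def fuzzy_support(samples, grid, k, ranges):
    """Fuzzy support of a grid over the training samples.

    grid:   dict mapping attribute index -> index of the fuzzy set
            (linguistic value) chosen for that attribute.
    ranges: list of (lo, hi) value ranges, one per attribute.
    Assumption: a sample's degree in the grid is the product of its
    memberships, and the support is the mean degree over all samples.
    """
    total = 0.0
    for sample in samples:
        degree = 1.0
        for attr, i in grid.items():
            lo, hi = ranges[attr]
            degree *= triangular_membership(sample[attr], i, k, lo, hi)
        total += degree
    return total / len(samples)


# Three hypothetical samples with two attributes, each on [0, 10],
# partitioned into k = 3 fuzzy sets (roughly "low", "medium", "high").
samples = [(0.0, 10.0), (5.0, 10.0), (10.0, 0.0)]
ranges = [(0.0, 10.0), (0.0, 10.0)]
support = fuzzy_support(samples, {0: 1, 1: 2}, 3, ranges)
```

A grid whose support reaches a user-chosen minimum would count as a large fuzzy grid, and rules with a class label as consequent could then be formed from such grids.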

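Mining association rules based on reference degree   Total citations: 1 (self-citations: 0, citations by others: 1)
To address problems with existing evaluation criteria for association rule mining, this paper proposes adding a reference degree to the criteria, and gives a definition of the reference degree and a reference-degree-based association rule mining algorithm. The reference degree divides association rules into positive rules, negative rules, and invalid rules, so the algorithm can mine association rules containing negated items. Finally, an experimental analysis of the new algorithm is given.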
