Similar literature
1.
Mining dynamic association rules with comments   Cited by: 2 (self-citations: 2, citations by others: 0)
In this paper, we study a new problem of mining dynamic association rules with comments (DAR-C for short). A DAR-C contains not only the rule itself but also comments that specify when to apply the rule. To formalize this problem, we first present a method for expressing candidate effective time slots, and then propose several definitions concerning DAR-C. Subsequently, two algorithms, ITS2 and EFP-Growth2, are developed for mining DAR-C. ITS2 is an improved two-stage dynamic association rule mining algorithm, while EFP-Growth2 is based on the EFP-tree structure and is suited to mining large, high-density data. Extensive experimental results demonstrate the efficiency and scalability of the two proposed algorithms on DAR-C mining tasks, and their practicality on a real retail dataset.

2.
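宫雨. 《计算机工程与设计》 (Computer Engineering and Design), 2007, 28(24): 5838-5840
Constrained association rules are an important topic in association rule research. Most current work concentrates on single-variable constraints and little addresses two-variable constraints, although two-variable constraints also matter in practice. To address this, the paper poses the problem of mining association rules under lower-bound constraints, a class of two-variable constraints. It defines the lower-bound constraint, analyzes the properties of frequent itemsets that satisfy it, and gives the corresponding proofs. Finally, it proposes an FP-Tree-based lower-bound constraint algorithm that uses pre-testing to reduce the number of itemsets to be tested and the computational cost. Experimental results show that the algorithm is efficient.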

3.
Mining optimized gain rules for numeric attributes   Cited by: 7 (self-citations: 0, citations by others: 7)
Association rules are useful for determining correlations between attributes of a relation and have applications in the marketing, financial, and retail sectors. Furthermore, optimized association rules are an effective way to focus on the most interesting characteristics involving certain attributes. Optimized association rules are permitted to contain uninstantiated attributes and the problem is to determine instantiations such that either the support, confidence, or gain of the rule is maximized. In this paper, we generalize the optimized gain association rule problem by permitting rules to contain disjunctions over uninstantiated numeric attributes. Our generalized association rules enable us to extract more useful information about seasonal and local patterns involving the uninstantiated attribute. For rules containing a single numeric attribute, we present an algorithm with linear complexity for computing optimized gain rules. Furthermore, we propose a bucketing technique that can result in a significant reduction in input size by coalescing contiguous values without sacrificing optimality. We also present an approximation algorithm based on dynamic programming for two numeric attributes. Using recent results on binary space partitioning trees, we show that the approximations are within a constant factor of the optimal optimized gain rules. Our experimental results with synthetic data sets for a single numeric attribute demonstrate that our algorithm scales up linearly with the attribute's domain size as well as the number of disjunctions. In addition, we show that applying our optimized rule framework to a real-life population survey data set enables us to discover interesting underlying correlations among the attributes.
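
As a rough illustration of the single-attribute case, the sketch below finds the one contiguous run of buckets with the highest gain. It is not the paper's algorithm, which handles several disjunctions; it covers only a single interval, using a standard maximum-sum scan. The abstract does not define gain, so the definition used here (matching tuples minus the minimum confidence times all tuples in the bucket) is an assumption taken from common usage in the optimized-rule literature, and the names `best_gain_interval`, `buckets`, and `min_conf` are hypothetical.

```python
def best_gain_interval(buckets, min_conf):
    """Return (gain, start, end) for the contiguous bucket range with maximal gain.

    buckets  -- list of (n, hits): tuples in the bucket, and tuples that also
                satisfy the rule's consequent (hypothetical input layout)
    min_conf -- minimum confidence threshold used in the assumed gain definition
    """
    best = (float("-inf"), 0, 0)
    running, start = 0.0, 0
    for i, (n, hits) in enumerate(buckets):
        gain = hits - min_conf * n  # assumed per-bucket gain
        if running <= 0:
            # A non-positive prefix cannot help, so restart the interval here.
            running, start = gain, i
        else:
            running += gain
        if running > best[0]:
            best = (running, start, i)
    return best


# Four buckets of ten tuples each, with 2, 8, 6, and 1 matching tuples.
example = [(10, 2), (10, 8), (10, 6), (10, 1)]
print(best_gain_interval(example, 0.5))
```

The scan visits each bucket once, so it is linear in the number of buckets; merging adjacent values into buckets before the scan shortens the input in the same spirit as the bucketing step the abstract mentions.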

4.
Information Systems, 2001, 26(6): 425-444
Mining association rules on large data sets has received considerable attention in recent years. Association rules are useful for determining correlations between attributes of a relation and have applications in the marketing, financial, and retail sectors. Furthermore, optimized association rules are an effective way to focus on the most interesting characteristics involving certain attributes. Optimized association rules are permitted to contain uninstantiated attributes and the problem is to determine instantiations such that either the support, confidence, or gain of the rule is maximized. In this paper, we generalize the optimized support association rule problem by permitting rules to contain disjunctions over uninstantiated numeric attributes. Our generalized association rules enable us to extract more useful information about seasonal and local patterns involving the uninstantiated attribute. For rules containing a single numeric attribute, we present a dynamic programming algorithm for computing optimized association rules. Furthermore, we propose a bucketing technique for reducing the input size, and a divide-and-conquer strategy that improves the performance significantly without sacrificing optimality. We also present approximation algorithms based on dynamic programming for two numeric attributes. Our experimental results for a single numeric attribute indicate that our bucketing and divide-and-conquer enhancements are very effective in reducing the execution times and memory requirements of our dynamic programming algorithm. Furthermore, they show that our algorithms scale up almost linearly with the attribute's domain size as well as the number of disjunctions.

5.
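With fuzzy concepts and fuzzy operations, time intervals are easy to describe. For a given calendar pattern, different time intervals can carry different weights according to their membership degrees. Building on fuzzy calendar algebra and combining the ideas of incremental mining and progressive counting, the paper proposes a fuzzy-calendar-based method for mining fuzzy temporal association rules. Theoretical analysis and experimental results both show that the algorithm is efficient and feasible.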

6.
In this work, we develop techniques for discovering patterns with periodicity. Patterns with periodicity are those that occur at regular time intervals, so the problem has two aspects: finding the pattern, and determining the periodicity. The difficulty of the task lies in discovering these regular time intervals, i.e., the periodicity. Periodicities in the database are usually imprecise and subject to disturbances, and might occur at time intervals in multiple time granularities. To overcome these difficulties and to discover patterns with fuzzy periodicity, we propose the fuzzy periodic calendar, which defines fuzzy periodicities. Furthermore, we develop algorithms for mining fuzzy periodicities and the fuzzy periodic association rules within them. Experimental results show that our method is effective in discovering fuzzy periodic association rules.

7.
Abstract: The concept of fuzzy sets is one of the most fundamental and influential tools in the development of computational intelligence. In this paper the fuzzy pincer search algorithm is proposed. It generates fuzzy association rules by combining top-down and bottom-up approaches. A fuzzy grid representation is used to reduce the number of database scans, and the algorithm trims the number of candidate fuzzy grids at each level. It has been observed that fuzzy association rules provide a more realistic visualization of the knowledge extracted from databases.

8.
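Mining high-confidence association rules   Cited by: 3 (self-citations: 1, citations by others: 2)
Traditional association rules and utility-based association rules overlook rules whose support or utility is low but whose confidence is very high. Such high-confidence rules help people avoid risk and raise their chances of success. To mine these low-support (or low-utility), high-confidence rules, the HCARM algorithm is proposed. HCARM uses partitioning to handle large data sets and a new pruning strategy to shrink the search space. In addition, a length threshold minlen makes HCARM suitable for mining long patterns. Experimental results show that the method is effective for high-confidence long patterns.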

9.
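To improve the efficiency of association rule mining and to obtain, along with the frequent itemsets, the sets of transactions that contain them, an association rule mining algorithm based on a character-weighted graph is proposed. The concept of the character-weighted graph is introduced first, and several of its properties are identified and proved. On this basis, an algorithm is given for mining frequent itemsets together with the transaction sets that contain them. Analysis of time and space complexity shows that the algorithm is sound and efficient.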

10.
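Research on association rules with grouped multiple supports   Cited by: 4 (self-citations: 1, citations by others: 3)
Association rule mining is one of the key tasks in data mining. Traditional association rule algorithms use a single minimum support, which assumes that all items occur with roughly the same frequency; in practice this is not the case, which gives rise to the multiple-support association rule problem. That problem assigns a different support to each item, whereas in real applications items can be divided into several groups, each with its own support. The paper therefore poses the grouped multiple-support association rule problem and gives a method for grouping items based on the properties of multiple supports. The method can reduce the number of candidate 2-itemsets. On this basis, a corresponding multiple-support association rule discovery algorithm is given, and experiments demonstrate its effectiveness.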
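
The idea of one minimum support per group of items can be sketched as follows. The abstract does not say how the threshold for an itemset that spans several groups is chosen; taking the lowest threshold among the groups involved is an assumption borrowed from the usual multiple-minimum-support convention, not something the paper confirms. The names `item_group`, `group_min_support`, and `is_frequent` are hypothetical.

```python
def itemset_min_support(itemset, item_group, group_min_support):
    """Threshold for an itemset: the lowest group threshold among its items (assumed rule)."""
    return min(group_min_support[item_group[item]] for item in itemset)


def is_frequent(itemset, transactions, item_group, group_min_support):
    """Check an itemset's relative support against its group-derived threshold."""
    itemset = frozenset(itemset)
    count = sum(1 for transaction in transactions if itemset <= transaction)
    support = count / len(transactions)
    return support >= itemset_min_support(itemset, item_group, group_min_support)


# Illustrative data: staples sell often, luxury goods rarely.
transactions = [{"bread", "milk"}, {"bread"}, {"bread", "caviar"}, {"milk"}]
item_group = {"bread": "staple", "milk": "staple", "caviar": "luxury"}
group_min_support = {"staple": 0.5, "luxury": 0.2}

print(is_frequent({"bread", "caviar"}, transactions, item_group, group_min_support))
print(is_frequent({"bread", "milk"}, transactions, item_group, group_min_support))
```

With a single global threshold of 0.5, the rare item would never appear in any frequent itemset; a per-group threshold lets it through while keeping the bar high for common items.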

11.
Data mining is the process of extracting desirable knowledge or interesting patterns from existing databases for specific purposes. Most conventional data-mining algorithms identify the relationships among transactions using binary values; however, transactions with quantitative values are commonly seen in real-world applications. This paper thus proposes a new data-mining algorithm for extracting interesting knowledge from transactions stored as quantitative values. The proposed algorithm integrates fuzzy set concepts and the apriori mining algorithm to find interesting fuzzy association rules in given transaction data sets. Experiments with student grades at I-Shou University were also conducted to verify the performance of the proposed algorithm.
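
To make the step from quantitative values to fuzzy support concrete, here is a minimal sketch. It assumes triangular membership functions and the minimum operator for combining memberships within a record, which are common choices in fuzzy association rule mining; the abstract does not specify either, and the function names and the grade data are hypothetical.

```python
def triangular(x, a, b, c):
    """Membership of x in a triangular fuzzy set with feet at a and c and peak at b."""
    if x == b:
        return 1.0
    if x <= a or x >= c:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def fuzzy_support(records, fuzzy_items):
    """Average, over all records, of the minimum membership among the fuzzy items.

    records     -- list of dicts mapping attribute name to a numeric value
    fuzzy_items -- list of (attribute, membership_function) pairs
    """
    total = 0.0
    for record in records:
        total += min(mu(record[attr]) for attr, mu in fuzzy_items)
    return total / len(records)


def high(score):
    # Hypothetical "high grade" fuzzy set peaking at 100.
    return triangular(score, 60, 100, 140)


grades = [{"math": 90, "physics": 80}, {"math": 70, "physics": 100}]
print(fuzzy_support(grades, [("math", high), ("physics", high)]))
```

An apriori-style search would then keep only fuzzy itemsets whose fuzzy support meets a minimum threshold, in the same level-wise manner as for binary transactions.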

12.
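Mining dynamic association rules in databases   Cited by: 7 (self-citations: 0, citations by others: 7)
Association rules can reveal dependencies among variables, but they cannot show how the rules themselves change. This paper therefore proposes dynamic association rules. The data set to be mined is first divided by time into several subsets; each rule mined from each subset yields one support and one confidence, so over the whole data set each rule corresponds to a support vector and a confidence vector. Analyzing these two vectors reveals how a rule changes over time and also allows its trend to be predicted. The paper also proposes two algorithms for mining dynamic association rules and compares them, and presents two ways to analyze the two vectors: bar charts and time series. Finally, an application example of mining dynamic association rules is given.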
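
The support vector and confidence vector of a single rule can be computed with a short sketch like the one below. It assumes each transaction is already tagged with the index of its time slice and that each slice's support is taken relative to that slice's own transaction count; the paper may normalize differently (for example, by the size of the whole data set). The name `rule_vectors` and the input layout are hypothetical.

```python
def rule_vectors(transactions, antecedent, consequent, n_periods):
    """Return (support_vector, confidence_vector) for antecedent -> consequent.

    transactions -- list of (period_index, set_of_items) pairs
    n_periods    -- number of time slices the data set was divided into
    """
    antecedent = frozenset(antecedent)
    both = antecedent | frozenset(consequent)
    totals = [0] * n_periods
    antecedent_counts = [0] * n_periods
    both_counts = [0] * n_periods
    for period, items in transactions:
        totals[period] += 1
        if antecedent <= items:
            antecedent_counts[period] += 1
            if both <= items:
                both_counts[period] += 1
    # Empty slices, or slices where the antecedent never occurs, yield 0.0.
    support = [b / t if t else 0.0 for b, t in zip(both_counts, totals)]
    confidence = [b / a if a else 0.0 for b, a in zip(both_counts, antecedent_counts)]
    return support, confidence


data = [
    (0, {"bread", "milk"}),
    (0, {"bread"}),
    (1, {"bread", "milk"}),
    (1, {"milk"}),
]
print(rule_vectors(data, {"bread"}, {"milk"}, 2))
```

Plotting the two returned lists as bar charts, or treating them as short time series, corresponds to the two analysis methods the abstract mentions.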

13.
This article presents a new differential evolution (DE) algorithm for mining optimized statistically significant fuzzy association rules that are abundant in number and high in rule interestingness measure (RIM) values, with strict control over the risk of spurious rules. The risk control over spurious rules, the most distinctive feature of the proposed DE compared with existing evolutionary algorithms (EAs) for association rule mining (ARM), is realized via two new statistically sound significance tests on the rules. The two tests, in the experimentwise and generationwise adjustment approaches, can respectively limit the familywise error rate (the probability that any spurious rules occur in the ARM result) and the percentage of spurious rules to the user-specified level. Experiments on variously sized data show that the proposed DE can keep the risk of spurious rules well below the user-specified level, which is beyond the ability of existing EA-based ARM. The new method also carries forward the advantages of EA-based ARM and the distinctive merits of DE in optimizing the rules: it can obtain several times as many rules and as high RIM values as conventional non-evolutionary ARM, and even more informative rules and better RIM values than genetic-algorithm-based ARM. Case studies on hotel room price determinants and wildfire risk factors demonstrate the practical usefulness of the proposed DE.

14.
15.
The Journal of Supercomputing - Association rule mining (ARM) is a data mining technique to discover interesting associations between datasets. The frequent pattern-growth (FP-growth) is an...

16.
Mining association rules using inverted hashing and pruning   Cited by: 2 (self-citations: 0, citations by others: 2)
In this paper, we propose a new algorithm named Inverted Hashing and Pruning (IHP) for mining association rules between items in transaction databases. The performance of the IHP algorithm was evaluated for various cases and compared with that of two well-known mining algorithms, the Apriori algorithm [Proc. 20th VLDB Conf., 1994, pp. 487-499] and the Direct Hashing and Pruning algorithm [IEEE Trans. on Knowledge Data Engrg. 9 (5) (1997) 813-825]. The results show that the IHP algorithm performs better for databases with long transactions.

17.
Association rule mining is one of the most popular data analysis methods for discovering associations within data. Association rule mining algorithms have been applied to various datasets because of their practical usefulness. Little attention has been paid, however, to how association mining techniques can be applied to questionnaire data. This paper therefore first identifies the various data types that may appear in a questionnaire. Then, we introduce the questionnaire data mining problem and define the rule patterns that can be mined from questionnaire data. A unified approach is developed based on fuzzy techniques so that all the different data types can be handled in a uniform manner. After that, an algorithm is developed to discover fuzzy association rules from the questionnaire dataset. Finally, we evaluate the performance of the proposed algorithm, and the results indicate that our method is capable of finding interesting association rules that previous mining algorithms would not have found.

18.
Mining multiple-level association rules in large databases   Cited by: 2 (self-citations: 0, citations by others: 2)
A top-down progressive deepening method is developed for efficient mining of multiple-level association rules from large transaction databases based on the a priori principle. A group of variant algorithms is proposed based on the ways of sharing intermediate results, with the relative performance tested and analyzed. The enforcement of different interestingness measurements to find more interesting rules, and the relaxation of rule conditions for finding "level-crossing" association rules, are also investigated. The study shows that efficient algorithms can be developed for the discovery of interesting and strong multiple-level association rules from large databases.

19.
Mining fuzzy association rules from uncertain data   Cited by: 3 (self-citations: 3, citations by others: 0)
Association rule mining is an important data analysis method that can discover associations within data. Numerous previous studies focus on finding fuzzy association rules from precise and certain data. Unfortunately, real-world data tends to be uncertain due to human errors, instrument errors, recording errors, and so on. The question that immediately arises is how to mine fuzzy association rules from uncertain data. To this end, this paper proposes a scheme for representing uncertain data. The representation is based on possibility distributions, because possibility theory establishes a close connection between the concepts of similarity and uncertainty and so provides an excellent framework for handling uncertain data. Then, we develop an algorithm to mine fuzzy association rules from uncertain data represented by possibility distributions. Experimental results on survey data show that the proposed approach can discover interesting and valuable patterns with high certainty.

20.
Data mining is most commonly used to induce association rules from databases, which can help decision-makers analyze the data and make good decisions in the domains concerned. Different studies have proposed methods for mining association rules from databases with crisp values. However, the data in many real-world applications have a certain degree of imprecision. In this paper we address this problem and propose a new data-mining algorithm for extracting interesting knowledge from databases with imprecise data. The proposed algorithm integrates imprecise data concepts and the fuzzy apriori mining algorithm to find interesting fuzzy association rules in given databases. Experiments on diagnosing dyslexia in early childhood were conducted to verify the performance of the proposed algorithm.
