Similar literature
 20 similar documents found
1.
A new approach to online generation of association rules   Cited by: 6 (self-citations: 0, citations by others: 6)
We discuss the problem of online mining of association rules in a large database of sales transactions. The online mining is performed by preprocessing the data effectively in order to make it suitable for repeated online queries. We store the preprocessed data in such a way that online processing may be done by applying a graph theoretic search algorithm whose complexity is proportional to the size of the output. The result is an online algorithm which is independent of the size of the transactional data and the size of the preprocessed data. The algorithm is almost instantaneous in the size of the output. The algorithm also supports techniques for quickly discovering association rules from large itemsets. The algorithm is capable of finding rules with specific items in the antecedent or consequent. These association rules are presented in a compact form, eliminating redundancy. The use of nonredundant association rules helps significantly in the reduction of irrelevant noise in the data mining process.

2.
李斌  马戈  孙志挥 《计算机应用》2004,24(12):105-107
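The incremental update problem for databases is considered from the perspective of changes in the item set. The well-known FUP algorithm is effectively improved, and FUPIC (Fast UPdate algorithm for Itemsets Changed), a fast incremental update algorithm for the case where the item set changes, is proposed.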

3.
韩涛  张春海  李华 《计算机工程与设计》2005,26(7):1842-1844,1899
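Association is an important research topic in data mining. Fuzzy association rule mining is studied here. To address the problem that ordinary association rules cannot precisely express the associations among fuzzy information in a database, a new fuzzy association rule mining algorithm, FARM_New, is proposed. The results show that the algorithm is effective and increases the speed of fuzzy mining.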

4.
A fast association rule mining algorithm based on the relation matrix   Cited by: 13 (self-citations: 0, citations by others: 13)
胡慧蓉  王周敬 《计算机应用》2005,25(7):1577-1579
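The paper first briefly reviews the association rule mining problem. Then, applying ideas from relation theory, it introduces item discernibility vectors and their AND operation and designs a fast mining algorithm, SLIG, which turns the generation of frequent itemsets into vector operations on the relation matrix of the itemsets. The algorithm needs to scan the database only once, overcoming the drawbacks of Apriori and related algorithms, which generate large numbers of candidate sets and require multiple database scans. Experiments show that SLIG improves mining efficiency compared with the Apriori algorithm.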
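
The idea of replacing repeated database scans with vector AND operations can be illustrated with a minimal sketch. This is not the SLIG algorithm itself, whose details the abstract does not give: the function names and the use of Python integers as bit vectors are assumptions made for illustration, and unlike SLIG the sketch still enumerates candidates level by level. It shows only the single-scan aspect, where each item gets one bit per transaction and the support of an itemset is the number of set bits in the AND of its item vectors.

```python
from itertools import combinations

def item_bitvectors(transactions):
    """Map each item to an int whose bit t is set when transaction t contains it."""
    vectors = {}
    for t, items in enumerate(transactions):
        for item in items:
            vectors[item] = vectors.get(item, 0) | (1 << t)
    return vectors

def support_count(itemset, vectors, n_transactions):
    """Count transactions containing every item, via bitwise AND of item vectors."""
    acc = (1 << n_transactions) - 1  # all-ones mask: every transaction is a candidate
    for item in itemset:
        acc &= vectors.get(item, 0)
    return bin(acc).count("1")

def frequent_itemsets(transactions, min_count):
    """Level-wise search; supports come from vector ANDs, not from rescanning the data."""
    vectors = item_bitvectors(transactions)  # the only pass over the transactions
    n = len(transactions)
    current = [frozenset([i]) for i in sorted(vectors)
               if support_count([i], vectors, n) >= min_count]
    result = {}
    while current:
        for s in current:
            result[s] = support_count(s, vectors, n)
        # join frequent k-itemsets that differ in one item into (k+1)-itemsets
        candidates = {a | b for a, b in combinations(current, 2)
                      if len(a | b) == len(a) + 1}
        current = [c for c in candidates
                   if support_count(c, vectors, n) >= min_count]
    return result
```

Calling `frequent_itemsets(transactions, min_count)` with a list of item sets returns a dictionary from each frequent itemset to its support count; the transactions themselves are read only once, inside `item_bitvectors`.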

5.
In order to efficiently trace the changes of association rules over an online data stream, this paper proposes a method of generating association rules directly over the changing set of currently frequent itemsets. While all of the currently frequent itemsets in an online data stream are monitored by the estDec method, all the association rules of every frequent itemset in the prefix tree of the estDec method are generated by the proposed method in this paper. For this purpose, a traversal stack is introduced to efficiently enumerate all association rules in the prefix tree. This online implementation can avoid the drawbacks of the conventional two-step approach. In addition, the prefix tree itself can be utilized as an index structure for finding the current support of the antecedent of an association rule. Finally, the performance of the proposed method is analyzed by a series of experiments to identify its various characteristics.
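
The role of an index for antecedent supports can be sketched as follows. This is an illustrative stand-in, not the estDec prefix tree or the traversal stack described above: a plain dictionary from itemsets to support counts plays the part of the index, and the function name `rules_from_itemsets` and its parameters are assumptions. The confidence of a rule is the support of the whole itemset divided by the support of its antecedent, and the latter is looked up in the same table.

```python
from itertools import combinations

def rules_from_itemsets(supports, min_conf):
    """Enumerate rules (antecedent, consequent, confidence) from itemset supports.

    supports: dict mapping frozenset -> support count; it is assumed to contain
    every non-empty subset of each itemset it holds.
    """
    rules = []
    for itemset, sup in supports.items():
        if len(itemset) < 2:
            continue  # a rule needs a non-empty antecedent and consequent
        for k in range(1, len(itemset)):
            for ante in combinations(sorted(itemset), k):
                ante = frozenset(ante)
                # antecedent support is looked up in the same table (the "index")
                conf = sup / supports[ante]
                if conf >= min_conf:
                    rules.append((ante, itemset - ante, conf))
    return rules
```

In a streaming setting the table would be updated as supports change; here it is a fixed snapshot, which is enough to show how the lookup avoids recounting the antecedent.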

6.
Researchers have realized the importance of integrating fuzziness into association rules mining in databases with binary and quantitative attributes. However, most of the earlier algorithms proposed for fuzzy association rules mining either assume that fuzzy sets are given or employ a clustering algorithm, like CURE, to decide on fuzzy sets; in both cases the number of fuzzy sets is pre-specified. In this paper, we propose an automated method to decide on the number of fuzzy sets and for the autonomous mining of both fuzzy sets and fuzzy association rules. We achieve this by developing an automated clustering method based on multi-objective Genetic Algorithms (GA); the aim of the proposed approach is to automatically cluster values of a quantitative attribute in order to obtain a large number of large itemsets in less time. We compare the proposed multi-objective GA based approach with two other approaches, namely: 1) the CURE-based approach, which is known as one of the most efficient clustering algorithms; 2) the Chien et al. clustering approach, which is an automatic interval partition method based on variation of density. Experimental results on 100 K transactions extracted from the adult data of the USA census in year 2000 showed that the proposed automated clustering method exhibits good performance over both the CURE-based approach and Chien et al.'s work in terms of runtime, number of large itemsets and number of association rules.

7.
贾桂霞  张永 《计算机工程与设计》2006,27(12):2175-2177,2186
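In data mining, association rule mining and the extraction of decision rules based on rough set theory are two quite different methods, yet in a statistical sense the rules they produce are essentially the same. Combining the advantages of both, an optimized algorithm based on Apriori is proposed that obtains decision rules meeting given support and confidence thresholds without producing redundancy, in order to improve the performance of rough-set attribute-value reduction algorithms.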

8.
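Research on mining algorithms for association rules in vector spatial databases   Cited by: 2 (self-citations: 0, citations by others: 2)
Based on the characteristics of vector spatial data and the requirements of spatial data mining, and using GIS spatial analysis and spatial data processing as tools, the paper discusses data processing methods for mining association rules in vector spatial databases and proposes a mining algorithm for association rules. Finally, the approach is verified with an example.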

9.
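A fast association rule mining algorithm based on half-spaces and GA   Cited by: 2 (self-citations: 0, citations by others: 2)
A method is proposed for fast mining of association rules using a half-space model and a genetic algorithm (GA). Traditional association rule mining algorithms are often constrained by factors such as data types and the practical meaning of the rules, which greatly limits their capacity for knowledge acquisition. The proposed method is free of these constraints, can mine rules of interest to the user, and also performs quite well on large-scale sample sets.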

10.
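In this paper, we propose two effective algorithms for the dynamic association rule mining problem, EIM-A and EIM-G. They update association rules efficiently as the database changes. By maintaining a knowledge database, the required frequent itemsets can be obtained with at most one scan of the original database, which effectively reduces the cost of updating association rules.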

11.
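The shortcomings of existing parallel association rule mining algorithms are analyzed, and an improved multi-core parallel optimization algorithm for association rule mining is proposed. The algorithm modifies the compressed matrix of the Apriori algorithm and, on a multi-core platform, uses OpenMP and TBB to parallelize the loops and the task allocation of the serial program, so that association rule mining is parallelized to the greatest possible extent.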

12.
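A survey of research on intrusion detection algorithms based on association rules   Cited by: 1 (self-citations: 0, citations by others: 1)
Association rules and their characteristics are introduced, and the classic Apriori algorithm and its improved variants are described. Research on association-rule-based algorithms involving fuzzy sets, concept lattices, rough sets, genetic algorithms, weighting, and directed graphs is surveyed. Considering the shortcomings of intrusion detection algorithms and their research applications in intrusion detection, the paper analyzes and suggests improvements for problems such as high false-positive and false-negative rates and low detection speed and matching efficiency. Finally, open research problems and development trends in the field are pointed out.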

13.
This paper defines a new kind of rule, the probability functional dependency rule. The degree of functional dependency can be described by this kind of rule. Five algorithms, from the simple to the complex, are presented to mine this kind of rule under different conditions. The related theorems are proved to ensure the high efficiency and the correctness of these algorithms.

14.

The paper addresses the problem of e-customer behavior characterization based on Web server log data. We describe user sessions with a number of session features and aim to identify the features indicating a high probability of making a purchase for two customer groups: traditional customers and innovative customers. We discuss our approach aimed at assessing a purchase probability in a user session depending on categories of viewed products and session features. We apply association rule mining to real online bookstore data. The results show differences in factors indicating a high purchase probability in session for both customer types. The discovered association rules allow us to formulate some predictions for the online store, e.g. that a logged user who has viewed only traditional, printed books, has been staying in the store from 10 to 25 min, and has opened between 30 and 75 pages, will decide to confirm a purchase with the probability of more than 92 %.


15.
16.
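Because the traditional implication-style representation of association rules and graphical visualization methods are hard for non-expert users to understand, a new association rule visualization method based on natural language generation is proposed. The method introduces natural language generation into association rule visualization: using explanation patterns from a domain knowledge base, each item in an association rule is turned into a simple natural-language sentence, and after sentence planning and surface realization, fluent natural-language sentences are produced. The final experimental results are easy for ordinary users to understand and apply, helping them obtain more valuable information.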

17.
Since the dot-com bubble burst in 2002, countless small-sized online shopping malls have emerged every day owing to the many attractive characteristics of the online marketplace, including significantly reduced search costs and menu costs for products or services and easy access to products or services worldwide. However, not all online shopping malls have continued to flourish. Many of them have even vanished because they lacked customer relationship management (CRM) strategies that fit them. The objective of this paper is to propose CRM strategies for a small-sized online shopping mall based on association rules and sequential patterns obtained by analyzing the transaction data of the shop. We first defined the VIP customers in terms of recency, frequency and monetary (RFM) value. Then, we developed a model which classifies customers into VIP or non-VIP, using various data mining techniques such as decision tree, artificial neural network, logistic regression and bagging with each of these as a base classifier. Last, we identified association rules and sequential patterns from the transactions of VIPs, and then these rules and patterns were utilized to propose CRM strategies for the online shopping mall.

18.
19.
Shah  D. Gupta  P. 《Micro, IEEE》2001,21(1):36-47
One popular hardware device for performing fast routing lookups and packet classification is a ternary content-addressable memory (TCAM). This paper proposes two algorithms to manage the TCAM such that incremental update times remain small in the worst case.

20.
Fast algorithms for low-level vision   Cited by: 9 (self-citations: 0, citations by others: 9)
A recursive filtering structure is proposed that drastically reduces the computational effort required for smoothing, performing the first and second directional derivatives, and carrying out the Laplacian of an image. These operations are done with a fixed number of multiplications and additions per output point independently of the size of the neighborhood considered. The key to the approach is, first, the use of an exponentially based filter family and, second, the use of the recursive filtering. Applications to edge detection problems and multiresolution techniques are considered, and an edge detector allowing the extraction of zero-crossings of an image with only 14 operations per output element at any resolution is proposed. Various experimental results are shown.
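
Why recursion makes the cost independent of the neighborhood size can be seen in a small one-dimensional sketch. It is a first-order illustration only, not the filter family of the paper: the function name `exp_smooth`, the parameter `a`, and the boundary handling (the signal is treated as constant beyond its ends) are all assumptions. A forward and a backward recursive pass together realize smoothing with a symmetric kernel proportional to `a**abs(k)`, using a fixed handful of multiplications and additions per sample however slowly the kernel decays.

```python
def exp_smooth(signal, a):
    """Smooth a 1-D sequence with the symmetric kernel proportional to a**abs(k).

    a must satisfy 0 <= a < 1; values closer to 1 give a wider effective
    neighborhood at exactly the same per-sample cost.
    """
    n = len(signal)
    if n == 0:
        return []
    fwd = [0.0] * n  # causal pass: depends on current and past samples
    bwd = [0.0] * n  # anticausal pass: depends on current and future samples
    fwd[0] = signal[0]
    for i in range(1, n):
        fwd[i] = (1 - a) * signal[i] + a * fwd[i - 1]
    bwd[-1] = signal[-1]
    for i in range(n - 2, -1, -1):
        bwd[i] = (1 - a) * signal[i] + a * bwd[i + 1]
    # both passes include the centre sample, so subtract it once, then
    # divide by (1 + a) so that the kernel weights sum to one
    return [(f + b - (1 - a) * x) / (1 + a)
            for f, b, x in zip(fwd, bwd, signal)]
```

A direct convolution with the same kernel truncated to a window of width w would cost on the order of w operations per sample; the recursive form does not depend on w at all.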
