Similar literature
 20 similar documents found
1.
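An efficient association rule mining algorithm   Total citations: 2, self-citations: 0, citations by others: 2
The Apriori algorithm scans the database many times and generates a large number of candidate itemsets. To address these shortcomings, a database optimization strategy is proposed and combined with frequent-set pruning and join optimization to obtain a new association rule mining algorithm, NApriori. The algorithm reduces both the size of the database and the number of candidate itemsets, and avoids repeated comparison of identical items during the join step. Experiments show that the method performs better than Apriori.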
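For context, the two costs named in this abstract (one database scan per itemset size, and a large candidate set) are visible in a minimal sketch of classic Apriori. This is the baseline algorithm only, not NApriori, whose optimizations the abstract does not spell out; the function name `apriori` and the absolute-count `min_support` parameter are illustrative assumptions.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {frozenset(itemset): support_count} for all frequent itemsets.

    Classic Apriori baseline (illustrative sketch, not NApriori).
    min_support is an absolute transaction count.
    """
    transactions = [frozenset(t) for t in transactions]

    # Pass 1: count single items.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    frequent = {s: c for s, c in counts.items() if c >= min_support}
    result = dict(frequent)

    k = 2
    while frequent:
        # Join step: combine pairs of frequent (k-1)-itemsets into k-itemsets.
        candidates = set()
        for a, b in combinations(list(frequent), 2):
            union = a | b
            if len(union) != k:
                continue
            # Prune step: every (k-1)-subset must itself be frequent.
            if all(frozenset(sub) in frequent
                   for sub in combinations(union, k - 1)):
                candidates.add(union)

        # One full database scan per level k: the repeated-scan cost.
        counts = {c: 0 for c in candidates}
        for t in transactions:
            for c in candidates:
                if c <= t:
                    counts[c] += 1

        frequent = {s: c for s, c in counts.items() if c >= min_support}
        result.update(frequent)
        k += 1
    return result
```

With five transactions over the items a, b, c and `min_support=3`, the sketch keeps all single items and pairs but rejects {a, b, c}, and only after counting it in a third scan.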

2.
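An effective method for updating mined association rules   Total citations: 1, self-citations: 0, citations by others: 1
王新 (Wang Xin), 《计算机应用》 (Journal of Computer Applications), 2005, 25(6): 1360-1361, 1372
When mining association rules, users often need to adjust (raise or lower) the minimum support several times before they obtain useful rules. This paper presents the PSI algorithm, which uses previously stored information to generate new candidate itemsets efficiently. Results show that it effectively reduces the number of candidate itemsets in each database scan.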
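The easy half of this problem can be sketched without any knowledge of PSI itself: if support counts from an earlier run are kept, raising the threshold needs no database scan at all, while lowering it does, because itemsets below the old threshold were never stored. The function `refilter` and its parameters are hypothetical names for illustration, not part of the PSI algorithm.

```python
def refilter(stored_counts, old_min_support, new_min_support):
    """Reuse stored support counts when the minimum support is raised.

    stored_counts maps itemsets to the support counts found at
    old_min_support. Illustrative sketch only, not the PSI algorithm.
    """
    if new_min_support < old_min_support:
        # Itemsets below the old threshold were discarded, so new
        # candidates must be generated and counted against the database.
        raise ValueError("lowering the threshold requires a rescan")
    return {s: c for s, c in stored_counts.items() if c >= new_min_support}
```

The lowering case is the one where a method like PSI has real work to do, since it must decide which new candidates are worth counting.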

3.
Efficient mining of association rules in distributed databases   Total citations: 14, self-citations: 0, citations by others: 14
Many sequential algorithms have been proposed for mining association rules, but very little work has been done on mining association rules in distributed databases. Applying sequential algorithms directly to distributed databases is not effective, because it incurs a large communication overhead. In this study, an efficient algorithm called DMA (Distributed Mining of Association rules) is proposed. It generates a small number of candidate sets and requires only O(n) messages for support-count exchange for each candidate set, where n is the number of sites in the distributed database. The algorithm has been implemented on an experimental testbed, and its performance is studied. The results show that DMA outperforms the direct application of a popular sequential algorithm in distributed databases.

4.
Mining class association rules (CARs) is an essential but time-intensive task in Associative Classification (AC). A number of algorithms have been proposed to speed up the mining process. However, sequential algorithms are not efficient for mining CARs in large datasets, while existing parallel algorithms require communication and collaboration among computing nodes, which introduces a high synchronization cost. This paper addresses these drawbacks by proposing three efficient approaches, based on parallel computing, for mining CARs in large datasets. To date, this is the first study to implement an algorithm for parallel mining of CARs on a computer with a multi-core processor architecture. The proposed parallel algorithm is theoretically proven to be faster than existing parallel algorithms. The experimental results also show that it outperforms a recent sequential algorithm in mining time.

5.
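Fuzzy association rules are used to handle imprecise information in databases and provide a good representation for knowledge discovery. Using the theory of representation by restriction levels, the GUHA model is generalized to fuzzy association rules; fuzzy rules are managed through restriction levels, and an extended procedure for validity measures is given. A mining algorithm based on this formal approach parallelizes the mining process across the different restriction levels and then summarizes the results. Complexity analysis and experimental results show that the formal approach is effective and feasible, thereby establishing a logical foundation for representing and evaluating fuzzy association rules.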

6.
Information Systems, 1999, 24(1): 25-46
Discovering association rules is one of the most important tasks in data mining. Many efficient algorithms have been proposed in the literature. The most notable are Apriori, Mannila's algorithm, Partition, Sampling and DIC, which are all based on the Apriori mining method: pruning the subset lattice (itemset lattice). In this paper we propose an efficient algorithm, called Close, based on a new mining method: pruning the closed set lattice (closed itemset lattice). This lattice, which is a sub-order of the subset lattice, is closely related to Wille's concept lattice in formal concept analysis. Experiments comparing Close to an optimized version of Apriori showed that Close is very efficient for mining dense and/or correlated data such as census-style data, and performs reasonably well for market-basket-style data.
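The closed itemset lattice rests on a closure operator, which can be sketched in a few lines under the standard definition: the closure of an itemset is the set of items shared by every transaction that contains it, and an itemset is closed when it equals its own closure. This shows the operator only, not the Close algorithm; the function names are illustrative assumptions.

```python
def closure(itemset, transactions):
    """Items common to every transaction that contains `itemset`.

    Standard closure operator (illustrative sketch, not the Close
    algorithm). If no transaction contains the itemset, the closure is
    taken to be the set of all items, the usual convention.
    """
    itemset = frozenset(itemset)
    covering = [frozenset(t) for t in transactions if itemset <= frozenset(t)]
    if not covering:
        return frozenset().union(*transactions)
    return frozenset.intersection(*covering)


def is_closed(itemset, transactions):
    """True when the itemset equals its own closure."""
    return closure(itemset, transactions) == frozenset(itemset)
```

In a database where every transaction containing `a` also contains `b`, the itemset {a} is not closed and {a, b} is, so a closed-set miner only needs to keep {a, b}. Dense or correlated data has many such redundant itemsets, which is consistent with the experimental result reported above.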

7.
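Constrained association mining restricts items or itemsets to one or more user-specified conditions. It is an important type of association mining with many practical applications. Because most existing algorithms handle only a single type of constraint, a multi-constraint association mining algorithm is proposed. Based on FP-growth, the algorithm builds conditional databases for itemsets. It exploits the properties of non-monotone and monotone constraints and applies several pruning strategies to locate constraint points quickly. Experiments show that the algorithm mines association rules under multiple constraints effectively and scales well.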

8.
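Privacy preservation is a meaningful research direction in data mining. M. Kantarcioglu et al. proposed a privacy-preserving association rule mining algorithm for horizontally partitioned data. This paper examines how to run data mining algorithms on the joint sample set of two vertically partitioned private databases while ensuring that neither party reveals to the other any database data unrelated to the result. For association rule mining, which is very widely used among data classification algorithms, it applies a secure two-party computation protocol to give a privacy-preserving association rule mining protocol.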

9.
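An effective distributed parallel algorithm for mining association rules   Total citations: 2, self-citations: 2, citations by others: 2
A fast and effective association rule mining algorithm based on a distributed architecture is proposed. The nodes compute in parallel, and compared with related algorithms it effectively reduces both communication volume and the number of candidate itemsets. The algorithm scales well and is simple to implement.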

10.
Some recent studies have shown that association rules can reveal interactions between genes that might not be revealed by traditional analysis methods such as clustering. However, the existing studies consider only association rules among individual genes. In this paper, we propose a new data mining method named MAGO for discovering multilevel gene association rules from gene microarray data and the concept hierarchy of Gene Ontology (GO). The proposed method can efficiently find the relations between GO terms by analyzing gene expression together with the GO hierarchy. For example, with the biological process ontology in GO, rules such as Process A (up) → Process B (up) can be discovered, which indicates that the genes involved in Process B are likely to be up-regulated whenever those involved in Process A are up-regulated. Moreover, we also propose a constrained mining method named CMAGO for discovering multilevel gene expression rules with user-specified constraints. Through empirical evaluation, the proposed methods are shown to have excellent performance in discovering hidden multilevel gene association rules.

11.
Parallel mining of association rules   Total citations: 15, self-citations: 0, citations by others: 15
We consider the problem of mining association rules on a shared-nothing multiprocessor. We present three algorithms that explore a spectrum of trade-offs between computation, communication, memory usage, synchronization, and the use of problem-specific information. The best algorithm exhibits near-perfect scaleup behavior, yet requires only minimal overhead compared to the current best serial algorithm.

12.
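A study of time validity in association mining   Total citations: 1, self-citations: 0, citations by others: 1
Traditional association mining algorithms use support and confidence as the criteria for judging whether a rule is valuable. This approach, however, cannot reflect the time-sensitive nature of data such as Web data and data accumulated over long periods. This paper establishes, for the first time, a new time-based model for re-estimating the value of rules and introduces time validity as a new measure of rule value. Finally, a new parallel algorithm based on this time-based model is presented. The algorithm supports incremental mining and lets users optimize the mining process interactively.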

13.
A scaling-up algorithm for multi-scale association rule mining   Total citations: 1, self-citations: 0, citations by others: 1

14.
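Research on image association rule mining   Total citations: 3, self-citations: 0, citations by others: 3
This paper introduces the concepts related to image association rules and describes how the traditional dual-population genetic algorithm operates. Because fixed chromosome crossover and mutation probabilities tend to cause premature convergence and slow convergence, adaptive crossover and mutation operators are designed. The improved dual-population genetic algorithm is then applied to Landsat satellite remote sensing images to extract image association rules, providing strong support for decisions on returning farmland to forest.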

15.
This paper introduces a new algorithm for mining association rules. The algorithm, RP, divides the database into m partitions and counts itemsets of different sizes in the same scanning pass over the database. The total number of passes over the database is only (k + 2m - 2)/m, where k is the size of the longest itemset. This is much less than k.
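A quick calculation shows the size of the saving. It assumes the pass count reads (k + 2m - 2)/m, with a plus sign between k and 2m; that reading is an assumption, chosen because it is the only one that gives a positive count smaller than k. The function name `rp_passes` is hypothetical.

```python
from fractions import Fraction


def rp_passes(k, m):
    """Pass count (k + 2m - 2)/m for longest itemset size k and m partitions.

    Assumes the formula has a plus sign between k and 2m.
    """
    return Fraction(k + 2 * m - 2, m)
```

For k = 10 and m = 4 this gives 16/4 = 4 passes instead of the 10 that level-wise counting needs. With m = 1 the formula reduces to k, and for any m > 1 it is below k exactly when k > 2.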

16.
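A comparative study of association rule mining algorithms in data mining   Total citations: 27, self-citations: 12, citations by others: 15
This paper analyzes the state of research on association rule mining algorithms in data mining, and proposes a new way to measure the value of association rules along with directions for further research. Taking the core Apriori algorithm as the starting point, it surveys typical association rule mining algorithms through literature review and comparative analysis: (1) even when optimized, Apriori has inherent defects that cannot be overcome and that require further study; (2) future research will focus on improving the efficiency of algorithms for very large and unstructured data, integrating with OLAP, and visualizing the results.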

17.
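Research on association rule mining algorithms   Total citations: 2, self-citations: 0, citations by others: 2
Association rule mining is an important research area in data mining. The classic Apriori algorithm scans the transaction database repeatedly, which makes it inefficient. To address this, an improved association rule mining algorithm based on a relation matrix is proposed, building on existing association rule mining algorithms. Theoretical analysis and experimental results both show that the proposed algorithm is efficient and practical.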
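The abstract does not define its relation matrix, so the following is only a sketch of one common matrix-based approach, assumed here for illustration: a single scan builds a Boolean transaction-by-item matrix, after which the support of any itemset is computed from the matrix alone. The names `build_matrix` and `support` are hypothetical.

```python
def build_matrix(transactions):
    """Build a Boolean transaction-by-item matrix in one database scan.

    Returns (matrix, index), where index maps each item to its column.
    Assumed representation; the paper's own matrix may differ.
    """
    items = sorted(set().union(*transactions))
    index = {item: j for j, item in enumerate(items)}
    matrix = [[item in t for item in items] for t in transactions]
    return matrix, index


def support(matrix, index, itemset):
    """Count the rows in which every column of `itemset` is True."""
    cols = [index[i] for i in itemset]
    return sum(all(row[j] for j in cols) for row in matrix)
```

Once the matrix is in memory, candidate itemsets of any size can be counted without touching the transaction database again, which is the property that removes the repeated scans.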

18.
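Advances in association rule mining techniques   Total citations: 3, self-citations: 2, citations by others: 3
To support in-depth research on association rule mining, this paper summarizes classification and evaluation methods for association rules and the latest progress in related techniques, describes the main association rule algorithms in detail, and discusses future directions. The survey is systematic and comprehensive, and offers guidance for further analysis of association rule mining techniques.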

19.
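A new association rule mining method   Total citations: 1, self-citations: 0, citations by others: 1
Association rule mining is one of the main tasks of data mining. To further improve the cognitive properties and computational performance of association rule mining algorithms, a new approach is proposed and used to construct an algorithm based on rule-based fuzzy cognitive maps. The algorithm uses rule-based fuzzy cognitive maps for knowledge representation and performs reachability-based fuzzy inference on each mined association rule, which reduces the number of interactions with the database. Experiments show that, compared with the Apriori association rule algorithm, the method improves mining efficiency and is more intelligent.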

20.
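Research on secure mining algorithms for association rules in distributed databases   Total citations: 1, self-citations: 0, citations by others: 1
In a distributed environment, mining association rules from distributed databases without disclosing users' private information is an important problem. This paper proposes PPDMA (Privacy Preserving Distributed Mining Algorithms), a secure mining algorithm for association rules in distributed databases. It applies cryptographic methods to encrypt the constrained subtrees and other information sent between sites for mining global frequent itemsets, and decrypts that information at the receiving site, so that user information is not disclosed and user privacy is protected during association rule mining. Analysis shows that the algorithm is correct and feasible.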
