Similar literature
 20 similar documents found (search time: 260 ms)
1.
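Research on Association Rule Algorithms in Data Mining   Total citations: 1, self-citations: 0, citations by others: 1
Many algorithms for mining association rules, along with their variants, have been proposed. The best known is the Apriori algorithm, but the traditional algorithms are too inefficient. To address these problems, this paper proposes a fast-updating association mining algorithm.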

2.
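An algorithm named MQAR for mining quantitative association rules is proposed. When mining association rules, the algorithm needs to scan the transaction database only once, which improves mining efficiency. The memory occupied by its auxiliary information is far smaller than that of existing mining algorithms. In addition, the algorithm not only mines associated itemsets but also finds the quantitative relationships among those itemsets.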

3.
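Research on Association Rule Data Mining Methods   Total citations: 3, self-citations: 0, citations by others: 3
The paper first briefly introduces the concepts of data mining and association rules, and the basic principles and types of association rules. It then describes the state of research on association rule mining in detail, discusses the basic principle of the Apriori algorithm, and points out some shortcomings of the Apriori algorithm. Solutions to these shortcomings are proposed and several improved algorithms are described. Finally, future research directions for association rule mining are outlined.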

4.
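Mining Association Patterns in Multiple Time Series Based on the Inter-Relevant Successive Tree   Total citations: 3, self-citations: 1, citations by others: 3
Time series are one of the common forms of data in real life, and discovering frequent patterns in time series is an important task in analyzing how they change. This paper proposes an algorithm for mining association patterns in multiple time series based on the inter-relevant successive tree. The algorithm first uses Allen's logical position relations to describe the relations between sequence states and, according to how these relations occur sequentially or in parallel within a time window, obtains a special sequence composed of these relations. On this basis, a new mining model based on the inter-relevant successive tree is proposed, which mines association patterns between sequences. Compared with other methods, the algorithm is simple and intuitive, and the whole mining process does not need to generate candidate patterns, which greatly improves mining efficiency.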

5.
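A Fast Algorithm for Discovering Frequent Itemsets with Item Constraints of the First Kind   Total citations: 3, self-citations: 0, citations by others: 3
Compared with Apriori-like algorithms, the algorithms proposed by Zaki, which are based on a vertical database layout and on lattice theory, speed up association rule mining by an order of magnitude, and they are well suited to mining association rules with low support and long patterns. Taking the Eclat algorithm as a prototype, the paper discusses how to introduce item constraints into the association rule mining process, proves theoretically that the Eclat+ algorithm with constraints can greatly improve efficiency and speed, and compares the related algorithms.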
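
To make the vertical-database idea concrete, the following is a minimal sketch of Eclat-style mining by intersecting transaction-id sets (tidsets). It is not the Eclat+ algorithm from the abstract and it omits item constraints entirely; the function name `eclat`, the absolute `min_support` count, and the input format are assumptions made for illustration.

```python
from typing import Dict, FrozenSet, Iterable, List, Set, Tuple


def eclat(transactions: Iterable[Set[str]], min_support: int) -> Dict[FrozenSet[str], int]:
    """Return every frequent itemset with its support count (illustrative sketch)."""
    # Vertical layout: map each item to the set of transaction ids containing it.
    tidsets: Dict[str, Set[int]] = {}
    for tid, items in enumerate(transactions):
        for item in items:
            tidsets.setdefault(item, set()).add(tid)

    frequent: Dict[FrozenSet[str], int] = {}

    def extend(prefix: FrozenSet[str], candidates: List[Tuple[str, Set[int]]]) -> None:
        for i, (item, tids) in enumerate(candidates):
            itemset = prefix | {item}
            frequent[itemset] = len(tids)
            # Support of a longer itemset is the size of the tidset intersection,
            # so no further database scan is needed.
            suffix = []
            for other, other_tids in candidates[i + 1:]:
                shared = tids & other_tids
                if len(shared) >= min_support:
                    suffix.append((other, shared))
            if suffix:
                extend(itemset, suffix)

    start = sorted(
        ((item, tids) for item, tids in tidsets.items() if len(tids) >= min_support),
        key=lambda pair: pair[0],
    )
    extend(frozenset(), start)
    return frequent
```

In this sketch an item constraint could be applied by filtering `start` or `suffix` before recursing, but how Eclat+ actually does this is not stated in the abstract.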

6.
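An Association Rule Updating Algorithm Based on the Most Recent Mining Results   Total citations: 3, self-citations: 0, citations by others: 3
Apriori is a well-known association rule mining algorithm, but it must traverse the database many times. To address the problem of maintaining association rules, this paper proposes an updating algorithm that uses the results of the most recent association rule mining run. It needs only two passes over the database, which improves the efficiency of updating association rules.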

7.
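An In-Depth Analysis of the Apriori Algorithm for Association Rules   Total citations: 1, self-citations: 0, citations by others: 1
Many algorithms for mining association rules have been proposed, the best known being the Apriori algorithm and its variants. Most of these traditional algorithms suffer from an itemset-generation bottleneck and from the difficulty of choosing a suitable support threshold, and they do not take into account the differing importance of the items being analyzed in the database. The paper analyzes the Apriori algorithm for association rules in depth and presents several improved algorithms.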

8.
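Research on Association Rule Mining   Total citations: 1, self-citations: 0, citations by others: 1
Association rule mining is an important data mining technique. Since the technique emerged from the "beer and diapers" problem, many researchers have proposed a variety of association rule mining algorithms. These algorithms fall mainly into two paradigms: the "generate-and-test" paradigm represented by Apriori, and the paradigm represented by FP-growth, which uses complex data structures to compress storage space. The article compares and analyzes these two representative algorithms.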
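
The "generate-and-test" paradigm can be illustrated with a minimal Apriori-style sketch: candidates of size k are generated from frequent itemsets of size k-1, then tested against the database by counting. The function name `apriori`, the absolute `min_support` count, and the input format are assumptions for illustration, not details from the article.

```python
from itertools import combinations
from typing import Dict, FrozenSet, Iterable, Set


def apriori(transactions: Iterable[Set[str]], min_support: int) -> Dict[FrozenSet[str], int]:
    """Return every frequent itemset with its support count (illustrative sketch)."""
    database = [frozenset(t) for t in transactions]
    # Level 1 candidates: every single item that occurs in the database.
    current = {frozenset([item]) for t in database for item in t}
    frequent: Dict[FrozenSet[str], int] = {}
    k = 1
    while current:
        # Test step: one pass over the database counts each candidate.
        counts = {c: sum(1 for t in database if c <= t) for c in current}
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        # Generate step: join frequent k-itemsets into (k+1)-candidates and
        # prune any candidate that has an infrequent k-subset.
        k += 1
        current = set()
        for a, b in combinations(list(level), 2):
            union = a | b
            if len(union) == k and all(
                frozenset(sub) in level for sub in combinations(union, k - 1)
            ):
                current.add(union)
    return frequent
```

Each loop iteration rescans the database, which is the repeated-traversal cost that FP-growth-style methods try to avoid by compressing the data into a tree.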

9.
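Research on Association Rule Mining in Distributed Systems   Total citations: 5, self-citations: 0, citations by others: 5
How to mine association rules in distributed systems is an important research topic in data mining. This paper explores the problem of distributed association rule mining, presents the architecture of DAMINER, a distributed association rule mining system, and proposes ARDM, a distributed association rule mining algorithm based on DAMINER. The algorithm has the advantages of low communication cost and low time overhead.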

10.
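Redundancy Removal and Clustering of Association Rules   Total citations: 9, self-citations: 0, citations by others: 9
Association rule mining often produces a large number of rules, which makes it very difficult for users to analyze and use them; the problem is more pronounced when the attributes in the database are highly correlated. To help users carry out exploratory analysis, various techniques can be used to reduce the number of rules effectively, such as constrained association rule mining and clustering or generalizing rules. This paper proposes ADRR, an algorithm for removing redundant association rules, and ACAR, an algorithm for clustering association rules. Based on properties of sets, it proves that the mined association rules contain a large number of redundant rules that can be removed, which leads to the ADRR algorithm. The ACAR algorithm uses a new method that defines the distance between rules by the correlation between items, and clusters association rules following the idea of the DBSCAN algorithm. Finally, the proposed algorithms are implemented, and experimental results show that they are effective, feasible, and highly efficient.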

11.
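Xiong Yuqing (熊玉庆), 《计算机科学》 (Computer Science), 2015, 42(11): 101-103
Reduction algorithms are widely used in parallel computing, and many reduction algorithms exist for different situations. These reduction algorithms differ from one another, and the logical topology is the key to their differences. To describe reduction algorithms uniformly and reveal what they have in common, a definition of logical topology and its properties is given. On this basis, a unified description of reduction algorithms is given, which aids the understanding of reduction algorithms and thus the design of reduction algorithms suited to different applications and environments. The description can also be regarded as a framework of reduction algorithms into which different semantics can be integrated, which helps in designing reduction algorithms with new semantics. In essence, the unified description is a formal definition of reduction algorithms and helps in verifying their correctness.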

12.
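The performance of a VPN depends on the encryption algorithm, the authentication algorithm, and the network environment. Calculations show that the throughput of encryption algorithms is only 10%-35% of the throughput of authentication algorithms, so the encryption algorithm affects VPN performance more than the authentication algorithm does. Improving the throughput of the encryption and authentication algorithms has a marked effect on the performance of VPNs over high-bandwidth networks.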

13.
In this paper, we analyze network recovery algorithms, which allow computer networks to function properly in spite of failures. In this analysis, we use methods and tools of the theory of super-recursive algorithms. The concept of an algorithm of the second level is introduced and studied. It is demonstrated that although the main components of various check-point/recovery algorithms are recursive algorithms, check-point/recovery algorithms, as a whole, are super-recursive second-level algorithms. Treating network recovery algorithms as second-level algorithms is aimed at developing more powerful algorithms by combining existing ones in a common schema.

14.
We describe parallel algorithms for computing maximal cardinality matching in a bipartite graph on distributed-memory systems. Unlike traditional algorithms that match one vertex at a time, our algorithms process many unmatched vertices simultaneously using a matrix-algebraic formulation of maximal matching. This generic matrix-algebraic framework is used to develop three efficient maximal matching algorithms with minimal changes. The newly developed algorithms have two benefits over existing graph-based algorithms. First, unlike existing parallel algorithms, cardinality of matching obtained by the new algorithms stays constant with increasing processor counts, which is important for predictable and reproducible performance. Second, relying on bulk-synchronous matrix operations, these algorithms expose a higher degree of parallelism on distributed-memory platforms than existing graph-based algorithms. We report high-performance implementations of three maximal matching algorithms using hybrid OpenMP-MPI and evaluate the performance of these algorithms using more than 35 real and randomly generated graphs. On real instances, our algorithms achieve up to 200× speedup on 2048 cores of a Cray XC30 supercomputer. Even higher speedups are obtained on larger synthetically generated graphs where our algorithms show good scaling on up to 16,384 cores.

15.
A new taxonomy of sublinear (multiple) keyword pattern matching algorithms is presented. Based on an earlier taxonomy by the second and third authors, this new taxonomy includes not only suffix-based algorithms, but also factor- and factor-oracle-based algorithms. In particular, we show how suffix-based (Commentz-Walter like), factor- and factor-oracle-based sublinear keyword pattern matching algorithms can be seen as instantiations of a general sublinear algorithm skeleton. During processing, such algorithms shift or jump through the text in a forward or left-to-right direction, and read backward or right-to-left starting from positions in the text, i.e. they read suffixes of certain prefixes of the text. They use finite automata for efficient computation of string membership in a certain language. In addition, we show shift functions defined for the suffix-based algorithms to be reusable for factor- and factor-oracle-based algorithms. The taxonomy is based on deriving the algorithms from a common starting point by adding algorithm and problem details, to arrive at efficient or well-known algorithms. Such a presentation provides correctness arguments for the algorithms as well as clarity on how the algorithms are related to one another. In addition, it is helpful in the construction of a toolkit of the algorithms.

16.
Computing the centroid of an interval type-2 fuzzy set is an important operation in a type-2 fuzzy logic system, and is usually implemented by Karnik–Mendel (KM) iterative algorithms. By connecting KM algorithms and continuous KM algorithms together, this paper gives theoretical explanations of the initialization methods of KM and Enhanced Karnik–Mendel (EKM) algorithms, proposes exact methods for centroid computation of an interval type-2 fuzzy set, and extends the EKM algorithms to three different forms of weighted EKM (WEKM) algorithms. It shows that EKM algorithms become a special case of the WEKM algorithms when the weights of the latter are constant. It also shows that, in general, the WEKM algorithms have smaller absolute error and faster convergence speed than the EKM algorithms, which makes them very attractive for real-time applications of fuzzy logic systems. Four numerical examples are used to illustrate and analyze the performance of WEKM algorithms.
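
As background for the abstract above, here is a minimal sketch of the basic (unweighted, non-enhanced) KM iteration on a discretized domain. It is not the EKM or WEKM variant the paper proposes. The function name `km_centroid`, the sampled inputs `x`, `lower`, `upper`, and the assumption that all lower membership grades are strictly positive are illustrative choices.

```python
from typing import List, Tuple


def km_centroid(
    x: List[float],
    lower: List[float],
    upper: List[float],
    tol: float = 1e-12,
    max_iter: int = 100,
) -> Tuple[float, float]:
    """Return (c_left, c_right), the centroid interval of a sampled interval type-2 set.

    Assumes 0 < lower[i] <= upper[i] for every sample (illustrative restriction).
    """

    def weighted_mean(weights: List[float]) -> float:
        return sum(xi * wi for xi, wi in zip(x, weights)) / sum(weights)

    def solve(left: bool) -> float:
        # Start from the midpoint of the lower and upper membership grades.
        c = weighted_mean([(lo + up) / 2 for lo, up in zip(lower, upper)])
        for _ in range(max_iter):
            if left:
                # Left endpoint: heavy (upper) weights left of c, light (lower) right of c.
                weights = [up if xi <= c else lo for xi, lo, up in zip(x, lower, upper)]
            else:
                # Right endpoint: light weights left of c, heavy weights right of c.
                weights = [lo if xi <= c else up for xi, lo, up in zip(x, lower, upper)]
            new_c = weighted_mean(weights)
            if abs(new_c - c) <= tol:
                return new_c
            c = new_c
        return c

    return solve(True), solve(False)
```

For a membership footprint that is symmetric about the middle of the domain, the two endpoints come out symmetric about that middle as well, which makes a convenient sanity check.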

17.
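Focusing on the scalability of packet classification algorithms, the paper analyzes in depth the time and space complexity of typical scalable packet classification algorithms. An evaluation system for scalable packet classification algorithms is developed on top of the ClassBench tool suite, and is used to evaluate typical algorithms under different simulated scenarios and to analyze systematically the performance differences and applicable conditions of each algorithm. Finally, future trends in scalable packet classification algorithms are outlined.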

18.
Only a few classes of quantum algorithms are known which provide a speed-up over classical algorithms. However, these and any new quantum algorithms provide important motivation for the development of quantum computers. In this article new quantum algorithms are given which are based on quantum state tomography. These include an algorithm for the calculation of several quantum mechanical expectation values and an algorithm for the determination of polynomial factors. These quantum algorithms are important in their own right. However, it is remarkable that these quantum algorithms are immune to a large class of errors. We describe these algorithms and provide conditions for immunity.

19.
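A Survey of Knowledge Graph Partitioning Algorithms   Total citations: 6, self-citations: 0, citations by others: 6
Knowledge graphs are an important cornerstone of artificial intelligence and have attracted wide attention because they contain rich graph structure and attribute information. A knowledge graph can describe the entities of the real world and their relationships with precise semantics: vertices represent entities and edges represent the relationships between entities. Knowledge graph partitioning is the first task in distributed processing of large-scale knowledge graphs, and it underpins distributed storage, querying, reasoning, and mining of knowledge graphs. As the scale of knowledge graph data and the demand for distributed processing keep growing, how to partition knowledge graphs has become a hot research topic. Starting from the definitions of knowledge graphs and graph partitioning, this survey systematically introduces the current classes of algorithms for partitioning knowledge graph data: basic, multilevel, streaming, distributed, and other types of graph partitioning algorithms. First, four basic graph partitioning algorithms are introduced: spectral partitioning, geometric partitioning, branch and bound, and KL and its derivatives; these are typically used on small-scale graph data or as components of other partitioning algorithms. Next, multilevel graph partitioning algorithms are introduced; they coarsen the graph, partition it, and then project the partition back onto the original graph, and are divided by coarsening process into matching-based and aggregation-based algorithms. Then, three streaming graph partitioning algorithms are described, which load vertices or edges as a sequence and then partition them: the Hash algorithm, the greedy algorithm, and the Fennel algorithm, together with their derivatives. After that, distributed graph partitioning algorithms represented by KaPPa, JA-BE-JA, and lightweight repartitioning are introduced along with their derivatives. Among the other types, two recently emerging graph partitioning algorithms are introduced: label propagation and query-workload-based algorithms. Through extensive experiments on synthetic and real knowledge graph datasets, the performance differences of representative algorithms from the five classes are compared in terms of partitioning quality, query processing, and graph data mining; the experimental results are analyzed and extended to the reasoning level, yielding experiment-based conclusions on the performance of knowledge graph partitioning algorithms. Finally, building on the analysis and comparison of existing methods, the main challenges currently facing knowledge graph data partitioning are summarized, corresponding research problems are posed, and future research directions are outlined.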
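
The streaming setting described in this survey, where vertices arrive as a sequence and each is assigned on arrival, can be sketched as follows. This is an illustrative sketch only: the function names, the use of CRC32 as the hash, and the capacity-weighted scoring rule (in the spirit of linear deterministic greedy) are assumptions and are not taken from the surveyed paper. Fennel is not shown.

```python
import zlib
from typing import Dict, Hashable, Iterable, List, Sequence, Tuple


def hash_partition(vertex: Hashable, k: int) -> int:
    """Hash partitioning: place a vertex by a deterministic hash of its id."""
    return zlib.crc32(str(vertex).encode("utf-8")) % k


def greedy_partition(
    stream: Iterable[Tuple[Hashable, Sequence[Hashable]]], k: int, capacity: int
) -> Dict[Hashable, int]:
    """Greedy streaming partitioning over (vertex, neighbors) pairs.

    Each arriving vertex goes to the partition that already holds the most of
    its neighbors, discounted by how full that partition is.
    """
    assignment: Dict[Hashable, int] = {}
    sizes: List[int] = [0] * k
    for vertex, neighbors in stream:
        best, best_score = None, 0.0
        for p in range(k):
            if sizes[p] >= capacity:
                continue  # partition is full
            placed = sum(1 for n in neighbors if assignment.get(n) == p)
            score = placed * (1 - sizes[p] / capacity)
            # Break ties in favor of the emptier partition.
            if best is None or score > best_score or (
                score == best_score and sizes[p] < sizes[best]
            ):
                best, best_score = p, score
        if best is None:
            raise ValueError("all partitions are full")
        assignment[vertex] = best
        sizes[best] += 1
    return assignment
```

Hash partitioning ignores edges and so balances load but cuts many edges, while the greedy rule trades some balance for locality; that trade-off is the kind of difference the survey's experiments compare.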

20.
Routing algorithms are one of the key factors that determine the ultimate performance of a Network on Chip (NoC) architecture, so several things have to be considered when developing new ones. These include examining the strengths, capabilities, and weaknesses of the commonly proposed algorithms as a starting point.
Most newly presented algorithms are based on the well-known algorithms that are studied and evaluated in this research. We first describe the existing algorithms, including XY, YX, Odd-Even, and DyAD. We then evaluate each of the routing algorithms, which naturally have their own strengths and weaknesses under different conditions. In the first scenario, we compare the deterministic and adaptive routing algorithms in terms of average latency, average throughput, and average energy consumption, the criteria that determine the final performance of the network on chip. In the second scenario, we evaluate the algorithms based on the network size and the number of cores on the chip. Based on the results produced under these different conditions, better decisions can be made when using these algorithms as well as when presenting new routing algorithms.
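
For readers unfamiliar with the deterministic schemes named above, here is a minimal sketch of dimension-order routing on a 2D mesh: XY routing moves a packet along the X dimension until the column matches, then along Y, and YX does the reverse. The coordinate convention `(x, y)`, the function names, and returning the full hop list are assumptions for illustration; Odd-Even and DyAD are adaptive and are not shown.

```python
from typing import List, Tuple

Node = Tuple[int, int]


def xy_route(src: Node, dst: Node) -> List[Node]:
    """Return the hops from src to dst, travelling along X first, then Y."""
    x, y = src
    path = [(x, y)]
    while x != dst[0]:
        x += 1 if dst[0] > x else -1
        path.append((x, y))
    while y != dst[1]:
        y += 1 if dst[1] > y else -1
        path.append((x, y))
    return path


def yx_route(src: Node, dst: Node) -> List[Node]:
    """Return the hops from src to dst, travelling along Y first, then X."""
    # Swap the coordinates, reuse xy_route, then swap back.
    swapped = xy_route((src[1], src[0]), (dst[1], dst[0]))
    return [(x, y) for (y, x) in swapped]
```

Both routes have the same hop count (the Manhattan distance), so any performance difference between XY and YX in an evaluation comes from how traffic patterns load the links, not from path length.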
