Similar literature
 Found 20 similar documents; search took 0 ms
1.
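聂成林  王浩  胡学钢 《计算机工程》 (Computer Engineering) 2003, 29(20): 60-62, 79
This paper studies a sequential pattern mining method based on the concept lattice (CL) and gives the corresponding algorithm. Compared with classical algorithms, it reduces the number of scans of the transaction database and improves mining efficiency.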

2.
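Sequential pattern mining is an important problem in data mining. Traditional sequential patterns reveal only which items occur frequently and in what order; they cannot reveal when a subsequent item occurs given that the preceding items have occurred. This paper introduces a new multi-time-granularity sequential pattern, in which the transition time between adjacent items is annotated with the minimal bounded time interval and the average time, both derived from the original data set at multiple time granularities. A mining model for multi-time-granularity sequential patterns is established, and a new mining algorithm, MG-PrefixSpan, is proposed. Experiments show that the algorithm is effective.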

3.
In this paper, we deal with mining sequential patterns in multiple time sequences. Building on a state-of-the-art sequential pattern mining algorithm PrefixSpan for mining transaction databases, we propose MILE (MIning in muLtiple sEquences), an efficient algorithm to facilitate the mining process. MILE recursively utilizes the knowledge of existing patterns to avoid redundant data scanning, and therefore can effectively speed up the new patterns’ discovery process. Another unique feature of MILE is that it can incorporate prior knowledge of the data distribution in time sequences into the mining process to further improve the performance. Extensive empirical results show that MILE is significantly faster than PrefixSpan. As MILE consumes more memory than PrefixSpan, we also present a solution to trade time efficiency in memory constrained environments.
Xingquan Zhu

4.
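Privacy-preserving data mining combines information security with knowledge discovery. This paper proposes PP-SPM, a privacy-preserving sequential pattern mining algorithm. Its principle is to lower the support of restricted sequential patterns by modifying sensitive data in the original database. It first builds a SPAM sequence tree and obtains the sensitive sequences from it according to certain heuristic rules, then locates the sensitive data in the original database and applies Boolean operations to them, thereby sanitizing the database. Experiments show that, with privacy fully protected, on the D6C10T2.5S4I4 data set the algorithm loses 2% of the sequential patterns after modifying 3.5% of the original data.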

5.
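Sequential pattern mining is an important data mining task, and the Apriori algorithm is an effective method for mining association rules. This paper describes how to apply the Apriori algorithm to sequential pattern mining.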
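As a rough illustration of the idea in this abstract, the sketch below applies Apriori's level-wise search to sequences of single items. It is not the paper's algorithm: the function names (`is_subsequence`, `apriori_sequences`), the single-item sequence model, and the join rule are assumptions made for the example.

```python
def is_subsequence(pattern, sequence):
    """True if pattern occurs in sequence in order (gaps allowed)."""
    it = iter(sequence)
    # `item in it` advances the iterator, so order is respected.
    return all(item in it for item in pattern)


def apriori_sequences(sequences, min_support):
    """Return {pattern_tuple: support} for all frequent patterns.

    Hypothetical helper for illustration; sequences are lists of single items.
    """
    def support(pattern):
        # Number of database sequences that contain the pattern.
        return sum(is_subsequence(pattern, s) for s in sequences)

    # Level 1: every distinct item is a candidate 1-sequence.
    items = sorted({item for s in sequences for item in s})
    level = {}
    for item in items:
        count = support((item,))
        if count >= min_support:
            level[(item,)] = count

    frequent = {}
    while level:
        frequent.update(level)
        prev = list(level)
        # Join step: a and b combine when a without its first item
        # equals b without its last item.
        candidates = {a + (b[-1],) for a in prev for b in prev if a[1:] == b[:-1]}
        # Count step: each candidate is checked against every sequence.
        level = {}
        for cand in sorted(candidates):
            count = support(cand)
            if count >= min_support:
                level[cand] = count
    return frequent
```

Candidates of length k are built only from frequent patterns of length k-1, which is the Apriori property carried over to ordered patterns; the cost is a fresh scan of the database for each level of candidates.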

6.
7.
8.
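Research on parallel algorithms for sequential pattern mining   Total citations: 1, self-citations: 0, citations by others: 1
马传香  简钟 《计算机工程》 (Computer Engineering) 2005, 31(6): 16-17, 136
Sequential patterns have important applications in many fields, and the large volumes of data and patterns call for efficient, scalable parallel algorithms. To address problems common to current sequential pattern mining algorithms, this paper proposes PMSP, an algorithm suited to shared-nothing parallel environments, which effectively addresses the problems of limited storage and timeliness. PMSP is compared with HPSPM, a relatively strong current parallel algorithm, and experiments show that PMSP is effective.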

9.
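Mining sequential patterns with quantitative attributes on the basis of fuzzy set theory is called fuzzy sequential pattern mining. Fuzzy sequential pattern mining algorithms derived from AprioriAll must scan the database many times. To address this shortcoming, this paper proposes MFSPM, an algorithm that is based on a sequence matrix representation and scans the database only once. Experiments show a marked improvement in efficiency.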

10.
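To discover regularities in network traffic, this paper introduces an effective network traffic mining algorithm. A network traffic pattern is a sequential pattern that reflects regularities in network access frequency. An extended PrefixSpan algorithm is introduced that uses these sequences as prefixes for recursive mining and constructs a projected database. The algorithm improves the efficiency of candidate subsequence generation, and prefix projection reduces the size of the projected database, thereby improving processing efficiency.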
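The prefix-projection step that this abstract builds on can be sketched with a minimal version of plain PrefixSpan for sequences of single items. The extensions described in the abstract are not reproduced here; the function name `prefixspan` and the (sequence index, offset) representation of the projected database are assumptions for the example.

```python
from collections import defaultdict


def prefixspan(sequences, min_support):
    """Return {pattern_tuple: support}; sequences are lists of single items."""
    results = {}

    def mine(prefix, projected):
        # `projected` is a list of (sequence index, start offset) pairs;
        # each pair stands for the suffix sequences[index][start:].
        counts = defaultdict(int)
        for sid, start in projected:
            # Count each item at most once per projected suffix.
            for item in set(sequences[sid][start:]):
                counts[item] += 1

        for item, support in sorted(counts.items()):
            if support < min_support:
                continue
            pattern = prefix + (item,)
            results[pattern] = support
            # Project on the new prefix: keep what follows the first
            # occurrence of `item` in every suffix that contains it.
            new_projected = []
            for sid, start in projected:
                try:
                    pos = sequences[sid].index(item, start)
                except ValueError:
                    continue
                new_projected.append((sid, pos + 1))
            mine(pattern, new_projected)

    mine((), [(sid, 0) for sid in range(len(sequences))])
    return results
```

Storing offsets instead of copied suffixes keeps each projected database small, and each recursion level only scans suffixes that already contain the current prefix.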

11.
Sequential mining is the process of applying data mining techniques to a sequential database in order to discover the correlations that exist among an ordered list of events. An important application of sequential mining techniques is web usage mining, which mines web log accesses, where the sequences of web page accesses made by different web users over a period of time, through a server, are recorded. Web access pattern tree (WAP-tree) mining is a sequential pattern mining technique for web log access sequences, which first stores the original web access sequence database on a prefix tree, similar to the frequent pattern tree (FP-tree) for storing non-sequential data. The WAP-tree algorithm then mines the frequent sequences from the WAP-tree by recursively reconstructing intermediate trees, starting with suffix sequences and ending with prefix sequences. This paper proposes a more efficient approach for using the WAP-tree to mine frequent sequences, which totally eliminates the need for numerous reconstructions of intermediate WAP-trees during mining. The proposed algorithm builds the frequent header node links of the original WAP-tree in a pre-order fashion and uses the position code of each node to identify the ancestor/descendant relationships between nodes of the tree. It then finds each frequent sequential pattern through progressive prefix sequence search, starting with its first prefix subsequence event. Experiments show a huge performance gain over the WAP-tree technique.
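The abstract does not say how position codes are assigned, so the sketch below assumes one possible binary scheme: the leftmost child takes its parent's code plus "1", and every other child takes its left sibling's code plus "0". Under that assumption, an ancestor test reduces to a string prefix check. The names `assign_position_codes` and `is_ancestor`, and the dictionary form of the tree, are hypothetical.

```python
def assign_position_codes(tree, root):
    """Map each node to a binary position code.

    `tree` maps a node to the ordered list of its children; node names
    must be unique. The coding scheme is an assumption for illustration.
    """
    codes = {root: ""}
    stack = [root]
    while stack:
        node = stack.pop()
        previous = None
        for child in tree.get(node, []):
            if previous is None:
                # Leftmost child: parent's code followed by "1".
                codes[child] = codes[node] + "1"
            else:
                # Other children: left sibling's code followed by "0".
                codes[child] = codes[previous] + "0"
            previous = child
            stack.append(child)
    return codes


def is_ancestor(code_a, code_b):
    """True if the node coded code_a is a proper ancestor of the node coded code_b."""
    return code_b.startswith(code_a + "1")
```

With codes like these, ancestor/descendant questions can be answered from the two codes alone, without walking the tree or rebuilding intermediate trees.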

12.
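孙蕾  朱玉全 《计算机工程》 (Computer Engineering) 2006, 32(11): 95-96, 99
How to determine candidate frequent sequential patterns and how to compute their support counts are two key problems in sequential pattern mining. This paper proposes a binary-form method for generating candidate frequent sequential patterns and computing the corresponding support counts. The method requires only logical operations such as OR, AND, and XOR on the data being mined, which markedly reduces implementation difficulty. Combining it with frequent sequential pattern mining and updating algorithms can further improve execution efficiency.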
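The paper's binary encoding is not given in the abstract. As a simplified illustration of how AND, OR, and XOR can drive candidate generation and support counting, the sketch below handles the unordered itemset case only, with one bit per item. The bit assignment and the names `popcount`, `join_candidates`, and `support` are assumptions for the example.

```python
def popcount(mask):
    """Number of set bits, i.e. number of items in the bitmask."""
    return bin(mask).count("1")


def join_candidates(frequent_k):
    """Join k-itemsets (as bitmasks) into (k+1)-itemset candidates."""
    ordered = sorted(frequent_k)
    candidates = set()
    for i, a in enumerate(ordered):
        for b in ordered[i + 1:]:
            # XOR leaves the items not shared; two equal-sized sets that
            # differ in exactly one item each leave exactly two bits.
            if popcount(a ^ b) == 2:
                # OR merges the two sets into one larger candidate.
                candidates.add(a | b)
    return candidates


def support(candidate, transactions):
    """Count transactions (bitmasks) that contain every item of candidate."""
    # AND keeps the shared items; containment means nothing was lost.
    return sum(1 for t in transactions if (t & candidate) == candidate)
```

For example, with items a, b, c mapped to bits 1, 2, 4, the transaction {a, c} is the integer 5, and testing whether it contains {a, b} (integer 3) is a single AND and comparison.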

13.
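Discovering contiguous sequential patterns (CSP) is an important problem in network traffic pattern mining. This paper proposes a new tree data structure for network traffic analysis. To efficiently store all sequences that contain a specified item, the tree combines a prefix tree and a suffix tree, and this special structure ensures efficient CSP detection. Experiments show that, compared with existing methods, the structure improves both the time and the space performance of CSP mining.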

14.
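The key to improving the efficiency of sequential pattern mining algorithms is reducing the time needed to find frequent sequences. Based on the CTID concept, this paper proposes SPM, an improved frequent sequential pattern mining algorithm. It makes full use of frequent itemsets and intermediate mining results to obtain more valid sequential patterns and simplifies the pruning step, thereby improving efficiency. Experiments show that the algorithm is feasible.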

15.
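The PrefixSpan algorithm overcomes the shortcomings of Apriori-like algorithms, but the projected databases it generates cost considerable storage space and scanning time. Based on PrefixSpan, this paper proposes the PSD algorithm, which does not store infrequent items and does not scan projected databases whose number of projected sequences is below the minimum support count, reducing unnecessary storage and speeding up lookups. Experiments show that PSD has better time and space performance than PrefixSpan.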

16.
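Analyzing network services in order to evaluate and optimize network performance is becoming increasingly important. This paper presents a new method for network service analysis, the path-constrained sequential pattern mining algorithm (PRSP). Exploiting the properties of frequent itemsets, the algorithm obtains the support of candidate frequent itemsets at the same time as it finds them, and it also reduces the number of candidate frequent sequences when generating them, greatly improving mining efficiency and speed. Experimental results show that the algorithm is effective.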

17.
We have implemented a technique for execution of formal, model-based specifications. The specifications we can execute are written at a level of abstraction that is close to that used in nonexecutable specifications. The specification abstractions supported by our execution technique include using quantified assertions to directly construct post-state values, and indirect definitions of post-state values (definitions that do not use equality). Our approach is based on translating specifications to the concurrent constraint programming language AKL. While there are, of course, expressible assertions that are not executable, our technique is amenable to any formal specification language based on a finite number of intrinsic types and pre- and postcondition assertions.

18.
Given a large spatio-temporal database of events, where each event consists of the fields event ID, time, location, and event type, mining spatio-temporal sequential patterns identifies significant event-type sequences. Such spatio-temporal sequential patterns are crucial to the investigation of spatial and temporal evolutions of phenomena in many application domains. Recent research literature has explored sequential patterns on transaction data and trajectory analysis on moving objects. However, these methods cannot be directly applied to mining sequential patterns from a large number of spatio-temporal events. Two major research challenges remain: 1) the definition of significance measures for spatio-temporal sequential patterns to avoid spurious ones and 2) the algorithmic design under the significance measures, which may not guarantee the downward closure property. In this paper, we propose a sequence index as the significance measure for spatio-temporal sequential patterns, which is meaningful due to its interpretability using spatial statistics. We propose a novel algorithm called Slicing-STS-Miner to tackle the algorithmic design challenge using the spatial sequence index, which does not preserve the downward closure property. We compare the proposed algorithm with a simple algorithm called STS-Miner that utilizes the weak monotone property of the sequence index. Performance evaluations using both synthetic and real-world data sets show that Slicing-STS-Miner is an order of magnitude faster than STS-Miner for large data sets.

19.
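Incremental sequential pattern mining algorithm based on projected data sets   Total citations: 1, self-citations: 0, citations by others: 1
This paper proposes Inc_SPM, an incremental sequence update algorithm that is based on projected data sets and built on the PrefixSpan algorithm. It first uses existing knowledge to obtain the frequent 1-sequences, then generates projected data sets to iteratively produce the frequent k-sequences. To control the size of the projected data sets, it uses equivalent projected data sets to improve the projection termination condition.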

20.
The authors survey the research and development in Sweden in constraint programming, which is rapidly becoming the method of choice for some kinds of constraint problems, such as scheduling and configuration.
