Similar literature
20 similar articles found.
1.
Unil Yun 《ETRI Journal》2007,29(3):336-352
Sequential pattern mining has become an essential task with broad applications. Most sequential pattern mining algorithms use a minimum support threshold to prune the combinatorial search space. This strategy provides basic pruning; however, it cannot mine correlated sequential patterns with similar support and/or weight levels. If the minimum support is low, many spurious patterns having items with different support levels are found; if the minimum support is high, meaningful sequential patterns with low support levels may be missed. We present a new algorithm, weighted interesting sequential (WIS) pattern mining based on a pattern growth method in which new measures, sequential s‐confidence and w‐confidence, are suggested. Using these measures, weighted interesting sequential patterns with similar levels of support and/or weight are mined. The WIS algorithm gives a balance between the measures of support and weight, and considers correlation between items within sequential patterns. A performance analysis shows that WIS is efficient and scalable in weighted sequential pattern mining.

2.
In this paper, we explore a new data mining capability for a mobile commerce environment. To better reflect customer usage patterns in this environment, we propose an innovative mining model, called mining mobile sequential patterns, which takes both the moving patterns and the purchase patterns of customers into consideration. Striking a compromise among the various kinds of knowledge that can be used to mine mobile sequential patterns is a challenging issue. We devise three algorithms (TJLS, TJPT, and TJPF) for determining the frequent sequential patterns, termed large sequential patterns in this paper, from the mobile transaction sequences. Algorithm TJLS is devised in light of the concept of association rules and is used as the basic scheme. Algorithm TJPT takes both association rules and path traversal patterns into consideration and gains performance through path trimming. Algorithm TJPF uses the pattern family technique, which is developed to exploit the relationship between moving and purchase behaviors, and is thus able to generate the large sequential patterns very efficiently. A simulation model for the mobile commerce environment is developed, and a synthetic workload is generated for performance studies. Our experimental results show that algorithm TJPF significantly outperforms the others in both execution efficiency and memory saving, indicating the usefulness of the pattern family technique. The results also show that taking both moving and purchase patterns into consideration yields a better model for a mobile commerce system and makes it possible to exploit the intrinsic relationship between these two factors for the efficient mining of mobile sequential patterns.

3.
Anup Bhat B  Harish SV  Geetha M 《ETRI Journal》2021,43(6):1024-1037
Mining high utility itemsets (HUIs) from transaction databases considers such factors as the unit profit and quantity of purchased items. Two-phase tree-based algorithms transform a database into compressed tree structures and generate candidate patterns through a recursive pattern-growth procedure. This procedure requires a lot of memory and time to construct conditional pattern trees. To address this issue, this study employs two compressed tree structures, namely, Utility Count Tree and String Utility Tree, to enumerate valid patterns and thus promote fast utility computation. Furthermore, the study presents an algorithm called single-phase utility computation (SPUC) that leverages these two tree structures to mine HUIs in a single phase by incorporating novel pruning strategies. Experiments conducted on both real and synthetic datasets demonstrate the superior performance of SPUC compared with IHUP, UP-Growth, and UP-Growth+ algorithms.
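
To make the notion of utility in this abstract concrete, the following is a minimal sketch of how an itemset's utility can be computed from unit profits and purchased quantities. It is a brute-force illustration only, not the SPUC algorithm or its tree structures; the data and the names `UNIT_PROFIT`, `TRANSACTIONS`, `itemset_utility`, and `high_utility_itemsets` are hypothetical.

```python
from itertools import combinations

# Hypothetical toy data (not from the paper): unit profit per item, and
# transactions that map each purchased item to its quantity.
UNIT_PROFIT = {"a": 5, "b": 2, "c": 1}
TRANSACTIONS = [
    {"a": 1, "b": 2},
    {"a": 2, "c": 3},
    {"a": 1, "b": 1, "c": 4},
]


def itemset_utility(itemset, transactions, unit_profit):
    """Sum quantity * unit profit over every transaction containing the whole itemset."""
    total = 0
    for transaction in transactions:
        if all(item in transaction for item in itemset):
            total += sum(transaction[item] * unit_profit[item] for item in itemset)
    return total


def high_utility_itemsets(transactions, unit_profit, min_utility):
    """Enumerate every itemset and keep those meeting the threshold (no pruning)."""
    items = sorted(unit_profit)
    found = {}
    for size in range(1, len(items) + 1):
        for itemset in combinations(items, size):
            utility = itemset_utility(itemset, transactions, unit_profit)
            if utility >= min_utility:
                found[itemset] = utility
    return found
```

On this toy data, the itemset {a, c} has a higher utility than its subset {a}, which illustrates why utility, unlike support, cannot be pruned with a simple downward-closure argument.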

4.
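Mining maximal frequent sequences is one of the important topics in data mining. Based on an in-depth analysis of the characteristics of frequent sequences and of existing sequence mining algorithms, a new maximal sequence mining algorithm, Huffman-MaxSeq, is proposed. Unlike the traditional approach of generating the set of candidate maximal frequent sequences and then testing it, the algorithm tests candidate sequences as they are generated, which effectively reduces the number of candidates produced. Following the method used to construct a Huffman tree (optimal tree), the algorithm assigns a weight to each sequence, selects sequences by weight, joins them to generate new candidate frequent sequences, and then produces the maximal frequent sequences.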

5.
Bayesian methods for multiaspect target tracking in image sequences   Cited by: 2 (self-citations: 0, citations by others: 2)
In this paper, we introduce new algorithms for automatic tracking of multiaspect targets in cluttered image sequences. We depart from the conventional correlation filter/Kalman filter association approach to target tracking and propose instead a nonlinear Bayesian methodology that enables direct tracking from the image sequence incorporating the statistical models for the background clutter, target motion, and target aspect change. Proposed algorithms include 1) a batch hidden Markov model (HMM) smoother and a sequential HMM filter for joint multiframe target detection and tracking and 2) two mixed-state sequential importance sampling trackers based on the sampling/importance resampling (SIR) and the auxiliary particle filtering (APF) techniques. Performance studies show that the proposed algorithms outperform the association of a bank of template correlators and a Kalman filter in adverse scenarios of low target-to-clutter ratio and uncertainty in the true target aspect.

6.
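A sequential pattern mining algorithm with multiple time intervals is proposed. It sets variable time intervals according to the actual mining situation, adopts an effective pruning strategy, and presents the multi-time-interval sequential pattern mining results precisely, interval by interval. Experiments show that the algorithm achieves high mining performance.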

7.
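A non-tree network motif discovery algorithm   Cited by: 1 (self-citations: 0, citations by others: 1)
Qin Guimin, Gao Lin, Zhou Xiaofeng 《Acta Electronica Sinica》2009,37(11):2420-2426
Most existing network motif discovery algorithms find exact motifs in a network. However, because biological data are incomplete and noisy, and because life processes are dynamic, probabilistic network motifs are of more practical significance. This paper proposes a non-tree network motif discovery algorithm that searches for probabilistic network motifs composed of a group of similar subgraphs. The algorithm first uses a proposed subgraph mining algorithm, ESN, to mine all non-tree subgraphs of a given size in the network, then performs multiple graph alignment, and finally uses simulated annealing, based on a statistical model and a corresponding scoring function, to obtain the network motifs. Simulation experiments on the gene regulatory networks of E. coli and yeast show that the algorithm can efficiently discover probabilistic motifs in biological networks.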

8.
Sequential and parallel image restoration algorithms and their implementations on neural networks are proposed. For images degraded by linear blur and contaminated by additive white Gaussian noise, maximum a posteriori (MAP) estimation and regularization theory lead to the same high dimension convex optimization problem. The commonly adopted strategy (in using neural networks for image restoration) is to map the objective function of the optimization problem into the energy of a predefined network, taking advantage of its energy minimization properties. Departing from this approach, we propose neural implementations of iterative minimization algorithms which are first proved to converge. The developed schemes are based on modified Hopfield (1985) networks of graded elements, with both sequential and parallel updating schedules. An algorithm supported on a fully standard Hopfield network (binary elements and zero autoconnections) is also considered. Robustness with respect to finite numerical precision is studied, and examples with real images are presented.

9.
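A page recommendation technique based on maximal frequent sequential patterns is proposed. Because it takes into account the order in which pages are accessed within a user session, it achieves higher accuracy than recommendation techniques that ignore that order. A tree structure is introduced that stores all maximal frequent sequences in compressed form. Because sequences with the same prefix share tree nodes, storage space is greatly reduced. The recommendation engine takes the subsequence of most recently visited pages from the user's active session and matches it against partial paths of the tree, without searching the whole pattern base for identical or similar patterns. This speeds up pattern matching and better meets the real-time requirements of page recommendation. Experiments show that the method is effective.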

10.
Frequency synchronization is of great importance in preserving the performance of underwater acoustic (UWA) orthogonal frequency division multiplexing (OFDM) systems. Carrier frequency offset (CFO) estimation can be blind or data‐aided. In this paper, Zadoff‐Chu (ZC) sequences are used for OFDM synchronization in UWA communications, and they are compared with different data‐aided algorithms. We propose a low‐complexity algorithm for CFO estimation based on ZC sequences. A joint equalization and CFO compensation scheme for UWA‐OFDM communication systems is also presented. Simulation results demonstrate that the proposed algorithm estimates the CFO accurately with a simpler implementation than the traditional schemes, and that the performance of the UWA‐OFDM system can be preserved in the presence of frequency offsets.
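
As a rough illustration of why ZC sequences suit synchronization, the sketch below generates an odd-length ZC sequence with the standard formula and checks its constant amplitude and zero cyclic autocorrelation. The estimator `estimate_cfo` is a generic repeated-preamble phase estimator included only for illustration; it is an assumption here and not the low-complexity algorithm proposed in the paper. All function names are hypothetical.

```python
import cmath
from math import gcd


def zadoff_chu(root, length):
    """ZC sequence for odd length: x[n] = exp(-j * pi * root * n * (n + 1) / length)."""
    if length % 2 == 0 or gcd(root, length) != 1:
        raise ValueError("this sketch assumes an odd length and gcd(root, length) == 1")
    return [cmath.exp(-1j * cmath.pi * root * n * (n + 1) / length) for n in range(length)]


def cyclic_autocorrelation(seq, lag):
    """Periodic autocorrelation of the sequence at the given lag."""
    size = len(seq)
    return sum(seq[(k + lag) % size] * seq[k].conjugate() for k in range(size))


def estimate_cfo(first_copy, second_copy):
    """Generic estimator (assumption): the phase drift between two received copies
    of the same preamble, converted to cycles per sample. Only unambiguous while
    |CFO| < 1 / (2 * preamble length)."""
    corr = sum(b * a.conjugate() for a, b in zip(first_copy, second_copy))
    return cmath.phase(corr) / (2 * cmath.pi * len(first_copy))
```

For example, transmitting `zadoff_chu(5, 63)` twice in a row and rotating sample `n` by `exp(2j * pi * 0.002 * n)` lets `estimate_cfo` recover the 0.002 cycles-per-sample offset from the two halves in a noiseless setting.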

11.
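Huang Kun, Wu Yujia, Li Jing 《Acta Electronica Sinica》2018,46(8):1804-1814
High utility itemset mining has become a hot research topic in association rule mining. Several algorithms based on a vertical structure have been used to mine high utility itemsets. Their main advantage is that the transaction and utility information of an itemset is stored in utility lists, so the transactions containing a superset of an itemset can be obtained with a single intersection over its subsets. Such algorithms are very effective on sparse datasets. On dense datasets, however, the lists store too many transactions, so computing the utility upper bound used for pruning consumes a large amount of storage and also slows execution. Moreover, existing work lacks high utility itemset mining algorithms designed for dense datasets, and a very high minimum utility threshold often has to be set, which affects running efficiency. To address this problem, a new algorithm, D-HUI (mining High Utility Itemsets using Diffsets), and a new data structure, the itemset list, are proposed, introducing the concept of diffsets into high utility itemset mining for the first time. The diffsets of transactions are used to compute the utility upper bound of an itemset, which reduces computation and storage and thereby improves running efficiency. Experimental results show that the proposed algorithm runs faster and consumes less memory on dense datasets.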
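
The diffset idea mentioned in this abstract can be sketched in its original setting, support counting in vertical frequent itemset mining: instead of storing the transaction IDs an itemset occurs in, one stores the IDs it loses relative to its prefix. The sketch below shows only that bookkeeping, under the assumption of a tiny hypothetical dense dataset `TIDSETS`; it does not reproduce D-HUI's itemset lists or its utility upper bounds.

```python
# Hypothetical vertical database: item -> set of transaction IDs containing it.
TIDSETS = {
    "a": {1, 2, 3, 4, 5, 6},
    "b": {1, 2, 3, 4, 5},
    "c": {1, 2, 3, 4, 6},
}


def support_by_intersection(items, tidsets):
    """Classic vertical approach: intersect the tidsets of all items."""
    return len(set.intersection(*(tidsets[item] for item in items)))


def diffset_from_tidsets(x_tids, y_tids):
    """d(XY) = t(X) - t(Y): transactions that contain X but not Y."""
    return x_tids - y_tids


def extend_with_diffsets(support_px, diff_px, diff_py):
    """For a shared prefix P: d(PXY) = d(PY) - d(PX) and
    support(PXY) = support(PX) - |d(PXY)|."""
    diff_pxy = diff_py - diff_px
    return support_px - len(diff_pxy), diff_pxy
```

On this data the tidset of {a, b} has five entries while its diffset relative to {a} has one, which is the kind of saving that makes diffsets attractive when most items occur in most transactions.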

12.
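Detection of hidden information for a class of steganographic algorithms   Cited by: 7 (self-citations: 1, citations by others: 6)
Zhang Tao, Ping Xijian 《Journal on Communications》2002,23(5):123-129
For a class of steganographic methods based on spatial-domain LSB replacement, this paper proposes an effective technique for detecting hidden information. The technique is based on analyzing a randomness measure of the image's LSB bit sequence. Under a stego-only attack, test information is embedded into the image to build a logistic regression model that relates the embedding ratio of the secret information to the randomness measure of the LSB bit sequence, which makes it possible to detect whether hidden information is present in the image. Experiments show that high detection reliability can still be obtained for grayscale images even at a hiding capacity of 0.4 bits per pixel. The method also applies to true-color images.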

13.
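Because of the complexity of software code, it is hard for newcomers unfamiliar with a framework to use code from open-source communities to develop software. Mining programming patterns from existing code with data mining techniques has therefore become a research hotspot. This paper introduces the Apriori algorithm for frequent itemset mining and proposes a software development assistance method based on source-code patterns. Given keywords entered by the user, the method matches a specific parent class in the class library, mines the programming patterns based on that parent class, and reports the methods that should be overridden first, along with association rules. Experimental results show that newcomers can use these coding suggestions to learn a new framework quickly and develop more efficiently.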

14.
Liao Jiyong, Wu Sheng, Liu Ailian 《Wireless Personal Communications》2021,116(3):1639-1657
High utility itemset mining has become a hot research topic in association rule mining. However, many algorithms mine datasets directly, which is a problem on dense datasets, where too many itemsets are stored in each transaction. Mining association rules then takes a lot of storage space and reduces the running efficiency of the algorithm. Existing work lacks efficient itemset mining algorithms for dense datasets. To address this problem, a high utility itemset mining algorithm based on a divide-and-conquer strategy is proposed. An improved silhouette coefficient is used to select the best number of K-means clusters, and the dataset is divided into many smaller subclasses. Association rule mining is then performed on each subclass by a Boolean matrix compression operation, and the results are iteratively merged to obtain the final mining results. We also analyze the time complexity of our method and of the Apriori algorithm. Finally, experiments on several well-known real-world datasets show that the improved algorithm runs faster and consumes less memory on dense datasets, which effectively improves computational efficiency.

15.
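In differentially private time series pattern mining, the maximum sequence length and the amount of Laplace noise added directly constrain the utility of the mining results. To address the excessively high global sensitivity and the low utility of results in existing time series pattern mining methods, a sequence-lattice-based method for time series pattern mining under differential privacy, PrivTSM (Differentially Private Time Series Pattern Mining), is proposed. The method first truncates the original database using a longest-path strategy. On this basis, it uses table join operations to generate a sequence lattice that satisfies differential privacy. It then exploits the properties of the lattice structure to allocate the privacy budget sensibly, improving the utility of the output patterns. Theoretical analysis shows that PrivTSM satisfies ε-differential privacy, and experiments on real databases show that PrivTSM clearly outperforms the N-gram and Prefix-Hybrid methods in accuracy, measured by TPR (True Positive Rate), and in ARE (Average Relative Error).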
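
The link this abstract draws between sensitivity, Laplace noise, and utility can be sketched with the standard Laplace mechanism: a count is released with noise of scale sensitivity/ε, so a larger sensitivity or a smaller privacy budget means noisier output. This is a minimal sketch of that textbook mechanism only; the function names are hypothetical, and PrivTSM's truncation, sequence lattice, and budget allocation are not reproduced.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponential variates."""
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)


def noisy_support(true_support, epsilon, sensitivity=1.0, rng=None):
    """Laplace mechanism: release a support count with noise of scale sensitivity / epsilon.

    The sensitivity is how much one individual's data can change the count; it is
    passed in here as an assumption rather than derived from a database.
    """
    if epsilon <= 0 or sensitivity <= 0:
        raise ValueError("epsilon and sensitivity must be positive")
    rng = rng or random.Random()
    return true_support + laplace_noise(sensitivity / epsilon, rng)
```

Calling `noisy_support(100, epsilon=0.5)` perturbs the count with noise of scale 2, whereas the same call with `sensitivity=10.0` uses scale 20, which is why bounding sensitivity before adding noise matters for utility.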

16.
M-ary sequential detection algorithms are discussed in terms of the advantage measure A, the number of possible signals M, and the signal-to-noise ratio SNR. The advantage measure A is defined as the logarithm of the ratio of the error probability of the optimum conventional detector to that of the optimum sequential detector with the same values of M and SNR. It increases as the SNR increases but decreases when M increases. For M → ∞, the advantage measure A diminishes to zero, so that application of the sequential approach is useless for large values of M. Thus, the sequential approach is most useful for binary detection with high signal-to-noise ratios. These results are demonstrated for detection of equiprobable orthogonal signals received in white Gaussian noise.

17.
In many applications, it is desirable to sort the data. Most previous work on sorting is key-based; however, time-series data have no apparent keys, so the classic sorting algorithms may fail to sort them. We propose a novel technique, called TS-Sort, to sort time-series sequences in a massive set. The proposed method first extracts the maximum and minimum boundaries of the set, then calculates the distances from the sequences to the boundaries, and finally sorts these values to determine the relative order of the sequences in the set. As an improvement, we propose a partition-based version of the algorithm, which puts the sequences into small groups and sorts the groups to obtain the final sorted set. Extensive experiments on both synthetic and real datasets show that our approach can put a time-series set in order, and that the improved version of the method achieves a speedup of up to 26.3%.
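
The three steps described in this abstract (extract boundaries, compute distances to them, sort by those values) can be sketched as follows. The abstract does not specify the boundary construction or the distance, so this sketch assumes equal-length sequences, pointwise minimum and maximum envelopes as the boundaries, and Euclidean distance; these choices and all function names are assumptions, and the partition-based variant is not shown.

```python
import math


def boundaries(series_set):
    """Assumed boundaries: pointwise minimum and maximum over all sequences."""
    lower = [min(column) for column in zip(*series_set)]
    upper = [max(column) for column in zip(*series_set)]
    return lower, upper


def euclidean(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def ts_sort_sketch(series_set):
    """Order sequences by distance to the lower boundary; ties are broken by
    placing sequences farther from the upper boundary first."""
    lower, upper = boundaries(series_set)
    return sorted(
        series_set,
        key=lambda s: (euclidean(s, lower), -euclidean(s, upper)),
    )
```

For instance, `ts_sort_sketch([[3, 3, 3], [1, 1, 1], [2, 2, 2]])` places the sequence closest to the lower envelope first.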

18.
In this work, we propose a highly efficient binary tree‐based anti‐collision algorithm for radio frequency identification (RFID) tag identification. The proposed binary splitting modified dynamic tree (BS‐MDT) algorithm employs a binary splitting tree to achieve accurate tag estimation and a modified dynamic tree algorithm for rapid tag identification. We mathematically evaluate the performance of the BS‐MDT algorithm in terms of the system efficiency and the time system efficiency based on the ISO/IEC 18000‐6 Type B standard. The derived mathematical model is validated using computer simulations. Numerical results show that the proposed BS‐MDT algorithm achieves a system efficiency of 46% and a time system efficiency of 74%, outperforming all other well‐performing algorithms.

19.
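This paper proposes a supervised keyphrase extraction algorithm, KEING (Keyphrase Extraction using sequentIal patterns with oNe-off and General gaps condition). First, each document is treated as a sequence database; the SPING (Sequential Patterns mIning with oNe-off and General gaps condition) algorithm is used to capture the relationships between words and their many variant forms, and candidate keyphrases are described by statistical pattern features. Then, a naive Bayes classification algorithm is trained on a large amount of labeled training data to build a classifier. Finally, the classifier is used to identify keyphrases in test documents. Experiments verify the completeness of the SPING algorithm and the effectiveness of the KEING algorithm.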

20.
At present, most association rule algorithms are based on Boolean attributes and single-level association rule mining. Real-world data, however, come in various types, and multi-level and quantitative attributes are receiving more and more attention. The most important step is mining frequent sets. In this paper, we propose an algorithm called fuzzy multiple-level association (FMA) rules to mine frequent sets. It is based on an improved Eclat algorithm, unlike the algorithms proposed by many researchers, which use the Apriori algorithm. We analyze the frequent sets of quantitative data by using fuzzy theory, dividing the concept hierarchy, and softening the boundaries of attribute values and frequency. We describe the proposed method using vertical-format data and the improved Eclat algorithm, and we use the algorithm to analyze Beijing logistics route data. Experiments show that the algorithm performs well, with better effectiveness and high efficiency.
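
For readers unfamiliar with the vertical data format that this abstract contrasts with Apriori, the following is a minimal sketch of the basic Eclat idea: each item is stored with the set of transaction IDs containing it, and larger itemsets are found by intersecting those sets depth-first. It shows plain Boolean Eclat only; the fuzzy membership, concept hierarchy, and improvements of the FMA algorithm are not reproduced, and the function names are hypothetical.

```python
def to_vertical(transactions):
    """Convert a list of item sets into item -> set of transaction IDs."""
    vertical = {}
    for tid, transaction in enumerate(transactions):
        for item in transaction:
            vertical.setdefault(item, set()).add(tid)
    return vertical


def eclat(prefix, items, min_support, found):
    """Depth-first search; `items` holds (item, tidset) pairs that are frequent
    when appended to `prefix`."""
    for index, (item, tids) in enumerate(items):
        itemset = prefix + (item,)
        found[itemset] = len(tids)
        extensions = []
        for other, other_tids in items[index + 1:]:
            common = tids & other_tids  # support counting by intersection
            if len(common) >= min_support:
                extensions.append((other, common))
        if extensions:
            eclat(itemset, extensions, min_support, found)


def mine_frequent_itemsets(transactions, min_support):
    """Return a dict mapping each frequent itemset (as a tuple) to its support."""
    vertical = to_vertical(transactions)
    frequent = sorted(
        ((item, tids) for item, tids in vertical.items() if len(tids) >= min_support),
        key=lambda pair: pair[0],
    )
    found = {}
    eclat((), frequent, min_support, found)
    return found
```

Because support is obtained from set intersections, no repeated scan of the transaction list is needed once the vertical layout has been built.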
