Mining top-K frequent itemsets from data streams
Authors:Raymond Chi-Wing Wong  Ada Wai-Chee Fu
Affiliation:(1) Department of Computer Science and Engineering, The Chinese University of Hong Kong, Ho Sin-Hang Engineering Building, Shatin N.T., Hong Kong, P. R. China
Abstract:Frequent pattern mining on data streams has recently attracted interest. However, it is not easy for users to determine a proper frequency threshold; it is more reasonable to ask them to set a bound on the result size. We study the problem of mining the top K frequent itemsets in data streams. We introduce a method based on the Chernoff bound that guarantees the output quality and bounds the memory usage. We also propose an algorithm based on the Lossy Counting Algorithm. In most experiments with the two proposed algorithms, we obtain perfect solutions, and the memory occupied by the algorithms is very small. We also adapt both algorithms to handle the case of mining data in a sliding window. The experiments show that the results are accurate.
Contact Information:Ada Wai-Chee Fu
Keywords:Data mining algorithm  Data stream  Top K frequent itemset mining  Sliding window  Chernoff bound  Probabilistic algorithm
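
The abstract names the Lossy Counting Algorithm as the basis of one of the two proposed methods. As a rough illustration of that underlying idea only, the sketch below applies classical Lossy Counting to single items (not itemsets) and then reads off the K largest surviving counts. It is not the authors' algorithm: the function names `lossy_count` and `top_k`, the error parameter `epsilon`, and the tie-breaking rule are all assumptions made for this example.

```python
import math


def lossy_count(stream, epsilon):
    """Classical Lossy Counting over single items (illustrative sketch).

    Returns (counts, n), where counts maps item -> [observed_count, max_error]
    and n is the number of stream elements seen.
    """
    width = math.ceil(1 / epsilon)  # bucket width
    counts = {}
    n = 0
    for item in stream:
        n += 1
        bucket = math.ceil(n / width)  # id of the current bucket
        if item in counts:
            counts[item][0] += 1
        else:
            # A new entry may have been missed in up to bucket - 1 earlier buckets.
            counts[item] = [1, bucket - 1]
        if n % width == 0:
            # At each bucket boundary, drop entries that cannot be frequent.
            counts = {k: v for k, v in counts.items() if v[0] + v[1] > bucket}
    return counts, n


def top_k(counts, k):
    """Return the k items with the largest observed counts (ties broken by name)."""
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1][0], str(kv[0])))
    return [(item, c) for item, (c, _) in ranked[:k]]


if __name__ == "__main__":
    stream = list("aaabaaab" * 3) + ["c"]
    counts, n = lossy_count(stream, epsilon=0.125)
    print(n, top_k(counts, 2))
```

In this sketch the memory held in `counts` shrinks at every bucket boundary, which mirrors the small memory footprint the abstract reports; how the paper extends the idea to itemsets and to sliding windows is not covered here.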
This article is indexed in SpringerLink and other databases.