Approximate mining of maximal frequent itemsets in data streams with different window models
Authors: Hua-Fu Li (a), Suh-Yin Lee (b)
Affiliation: (a) Department of Computer Science, Kainan University, No.1 Kainan Road, Luzhu Shiang, Taoyuan 338, Taiwan, ROC; (b) Department of Computer Science, National Chiao-Tung University, 1001 Ta-Hsueh Road, Hsinchu 300, Taiwan, ROC
Abstract: A data stream is a massive, open-ended sequence of data elements continuously generated at a rapid rate. Mining data streams is more difficult than mining static databases because of the huge volume, high speed, and continuous nature of streaming data. In this paper, we propose a new one-pass algorithm called DSM-MFI (Data Stream Mining for Maximal Frequent Itemsets), which mines the set of all maximal frequent itemsets in landmark windows over data streams. A new summary data structure called the summary frequent itemset forest (SFI-forest) is developed for incrementally maintaining the essential information about maximal frequent itemsets embedded in the stream so far. Theoretical analysis and experimental studies show that the proposed algorithm is efficient and scalable for mining the set of all maximal frequent itemsets over the entire history of the data streams.
Keywords: Data mining; Data streams; Maximal frequent itemsets; One-pass mining; Approximate mining
This article is indexed in ScienceDirect and other databases.
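
Illustrative note: the abstract's central notion, a maximal frequent itemset over a landmark window, can be made concrete with a small brute-force sketch. This is not DSM-MFI and does not use an SFI-forest; it simply enumerates every sub-itemset of every transaction seen since the landmark, which is only feasible for tiny inputs. The function name, the `min_count` parameter, and the sample transactions are assumptions made for illustration and do not come from the paper.

```python
from itertools import combinations


def maximal_frequent_itemsets(transactions, min_count):
    """Brute-force illustration (NOT the paper's DSM-MFI algorithm).

    `transactions` stands for everything seen since the landmark;
    `min_count` is a hypothetical absolute support threshold.
    """
    counts = {}
    for transaction in transactions:
        items = sorted(set(transaction))
        # Count every non-empty sub-itemset of this transaction.
        for size in range(1, len(items) + 1):
            for combo in combinations(items, size):
                key = frozenset(combo)
                counts[key] = counts.get(key, 0) + 1

    frequent = [s for s, c in counts.items() if c >= min_count]
    # An itemset is maximal if no frequent proper superset exists.
    return {s for s in frequent if not any(s < other for other in frequent)}


# Hypothetical landmark window of five transactions.
window = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "c"], ["a", "b", "c"]]
result = maximal_frequent_itemsets(window, min_count=3)
```

With these sample transactions, each pair occurs three times and the triple only twice, so the maximal frequent itemsets are the three pairs. A one-pass streaming method such as the one the abstract describes avoids this exhaustive enumeration by maintaining a compact summary instead.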