Streaming association rule (SAR) mining with a weighted order-dependent representation of Web navigation patterns
Authors: YongSeog Kim
Affiliation: 1. Department of Computer Engineering, Sejong University, Seoul, Republic of Korea; 2. Department of Electronics Engineering, Konkuk University, Seoul, Republic of Korea; 3. Faculty of Software and Information Science, Iwate Prefectural University (IPU), Iwate, Japan
Abstract: This paper considers the problem of finding predictive and useful association rules with a new Web mining algorithm, the streaming association rule (SAR) model. We first adopt a weighted order-dependent scheme (assigning more weight to pages visited earlier) rather than the traditional Boolean scheme (assigning 1 to visited and 0 to non-visited pages). In this way, we aim to overcome the limited representation of navigation patterns in previous association rule mining (ARM) algorithms. We also note that most traditional association rule models are not scalable because they require multiple scans of all records to re-calibrate a predictive model whenever the original database is updated. The proposed SAR model takes a "divide-and-conquer" approach and requires only a single scan of the data set to avoid the curse of dimensionality. Through comparative experiments on a real-world data set, we show that prediction models based on a weighted order-dependent representation are more accurate in predicting the next moves of Web navigators than models based on a Boolean representation. In particular, when combined with several heuristics developed to eliminate redundant association rules, SAR models show comparable prediction accuracy while maintaining only a small fraction of the association rules kept by traditional ARM models. Finally, we quantify and graphically show the significance or contribution of each page to forming unique rule sets in each database segment.
Keywords: Web mining; Association rule; Streaming association rule; Weighted order-dependent representation
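
The contrast between the Boolean and the weighted order-dependent representations described in the abstract can be illustrated with a minimal Python sketch. The abstract does not state the weighting formula, so the linear decay used here (the first distinct page gets weight 1, later pages get proportionally less) is an assumption made only for illustration, and the function names `boolean_vector` and `weighted_order_vector` are hypothetical, not taken from the paper.

```python
from typing import Dict, List


def boolean_vector(session: List[str], pages: List[str]) -> Dict[str, float]:
    """Boolean scheme: 1 for a visited page, 0 for a non-visited page."""
    visited = set(session)
    return {p: 1.0 if p in visited else 0.0 for p in pages}


def weighted_order_vector(session: List[str], pages: List[str]) -> Dict[str, float]:
    """Order-dependent scheme: pages visited earlier get larger weights.

    ASSUMPTION: linear decay over the order of first visits. With n distinct
    pages in the session, the page first visited at rank k (0-based) gets
    weight (n - k) / n. The paper's actual formula may differ.
    """
    # Keep only the first visit of each page, preserving visit order.
    first_visits: List[str] = []
    for page in session:
        if page not in first_visits:
            first_visits.append(page)

    n = len(first_visits)
    weights = {p: 0.0 for p in pages}
    for rank, page in enumerate(first_visits):
        if page in weights:  # ignore pages outside the known page list
            weights[page] = (n - rank) / n
    return weights


if __name__ == "__main__":
    catalog = ["home", "products", "cart", "help"]
    clicks = ["home", "products", "home", "cart"]
    print(boolean_vector(clicks, catalog))
    print(weighted_order_vector(clicks, catalog))
```

Under the Boolean scheme the three visited pages are indistinguishable, whereas the weighted scheme preserves the order in which they were first reached, which is the extra information the abstract credits for the improved next-page prediction.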
This article is indexed in ScienceDirect and other databases.