
Clickstream Clustering Based on Closed Frequent Gapped Subsequences (基于闭合有间隔频繁子序列的点击流聚类)
Cite as: MA Chao, SHEN Wei. Clickstream Clustering Based on Closed Frequent Gapped Subsequence[J]. Computer Engineering (计算机工程), 2010, 36(23): 72-75.
Authors: MA Chao (马超), SHEN Wei (沈微)
Affiliation: College of Engineering and Technology, Northeast Forestry University, Harbin 150040, China
Abstract: Clustering the clickstream sequences recorded in Web site log files can reveal usage patterns and thereby group users into categories. Traditional clustering methods, however, have difficulty extracting representative feature vectors from clickstreams, and both the clickstreams and their feature vectors are sparse. To address these problems, a clickstream clustering method based on mining closed frequent gapped subsequence patterns is proposed. The method extracts the repetitive (frequent) support of subsequence patterns from the clickstreams to construct feature vectors, and judges the similarity between vectors with a fuzzy dissimilarity measure based on bidirectional projected Euclidean distance, which improves the clustering quality of the BIRCH algorithm on clickstream data.

Keywords: clickstream; clustering; frequent subsequence pattern; Web usage mining
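
The feature-construction step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not define how support is counted, so the sketch assumes a simplified count in which each pattern is matched greedily from left to right, gaps between matched pages are allowed, and matched occurrences do not interleave. The names `gapped_support` and `feature_vector`, the page labels, and the pattern list are all hypothetical, and the pattern-mining and BIRCH steps are left out.

```python
from typing import List, Sequence


def gapped_support(pattern: Sequence[str], stream: Sequence[str]) -> int:
    """Count greedy, non-interleaving occurrences of `pattern` in `stream`.

    Pages of the pattern must appear in order, but other pages may sit
    between them (gaps). This is an assumed simplification, not the
    support definition used in the paper.
    """
    if not pattern:
        return 0
    count = 0
    matched = 0  # number of pattern items matched in the current occurrence
    for page in stream:
        if page == pattern[matched]:
            matched += 1
            if matched == len(pattern):
                count += 1
                matched = 0  # start looking for the next occurrence
    return count


def feature_vector(stream: Sequence[str],
                   patterns: Sequence[Sequence[str]]) -> List[int]:
    """Represent one clickstream by its support for each pattern."""
    return [gapped_support(p, stream) for p in patterns]


# Hypothetical patterns, standing in for the output of a pattern miner.
patterns = [
    ("home", "cart"),
    ("home", "search", "cart"),
    ("search", "item"),
]

# Hypothetical clickstreams, one list of page labels per session.
session_a = ["home", "search", "item", "cart", "home", "cart"]
session_b = ["search", "item", "search", "item"]

vector_a = feature_vector(session_a, patterns)
vector_b = feature_vector(session_b, patterns)
print(vector_a, vector_b)
```

Each session becomes a fixed-length vector with one component per pattern, which is the form a vector-based clusterer such as BIRCH expects. Components are zero whenever a session never exhibits a pattern, which reflects the sparsity problem the abstract mentions.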