Research on Application of Fuzzy C-Means Algorithm in Web Usage Mining (模糊C均值聚类算法在Web使用挖掘上的应用研究)

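Cite this article:	WU Ying (吴瑛), WANG Qiu-sheng (王秋生). Research on Application of Fuzzy C-Means Algorithm in Web Usage Mining [J]. Microcomputer Development (微机发展), 2008(6): 32-35.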

Authors:	WU Ying (吴瑛); WANG Qiu-sheng (王秋生)

Affiliation:	School of Automation Science and Electrical Engineering, Beihang University (北京航空航天大学), Beijing 100083

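Abstract:	Web logs contain a large amount of user browsing information, and clustering similar users and related pages from these logs is a necessary prerequisite for building adaptive web sites. Basic preprocessing is applied to the logs to perform data cleaning, user identification, session identification, and data reduction, producing a sequence database of the pages each user visited; a discretization technique is then used to compute how frequently users visit each page. On the basis of this data preparation, a user-page association matrix is constructed and used as the input to an improved fuzzy C-means (FCM) clustering algorithm, which clusters similar users and related pages. Experiments show that the improved FCM algorithm is effective.
Keywords:	fuzzy C-means clustering; Web log preprocessing; association matrix; user clustering; page clustering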
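
The preprocessing chain named in the abstract (cleaning, user identification, session identification, then a user-page matrix) can be illustrated with a minimal Python sketch. The paper's actual rules are not given on this page, so everything below is an assumption made for illustration: the record layout `(ip, timestamp, url)`, identifying users by IP address, the 30-minute session gap, the list of skipped file suffixes, and the use of raw visit counts in place of the paper's discretized frequencies.

```python
from collections import defaultdict

# Hypothetical, already-parsed log records: (client IP, time in seconds, requested URL).
RECORDS = [
    ("10.0.0.1", 0, "/index.html"),
    ("10.0.0.1", 5, "/logo.gif"),
    ("10.0.0.1", 60, "/news.html"),
    ("10.0.0.1", 4000, "/index.html"),
    ("10.0.0.2", 10, "/news.html"),
    ("10.0.0.2", 20, "/news.html"),
]

SKIP_SUFFIXES = (".gif", ".jpg", ".png", ".css", ".js")  # assumed cleaning rule
SESSION_TIMEOUT = 30 * 60  # assumed maximum gap inside one session, in seconds


def clean(records):
    """Data cleaning: drop requests for embedded resources such as images."""
    return [r for r in records if not r[2].lower().endswith(SKIP_SUFFIXES)]


def sessions_by_user(records):
    """User identification by IP, then session identification by time gap."""
    per_user = defaultdict(list)
    for ip, ts, url in sorted(records, key=lambda r: (r[0], r[1])):
        per_user[ip].append((ts, url))
    result = {}
    for ip, hits in per_user.items():
        sessions, current, last_ts = [], [], None
        for ts, url in hits:
            # A long pause closes the current session and starts a new one.
            if last_ts is not None and ts - last_ts > SESSION_TIMEOUT:
                sessions.append(current)
                current = []
            current.append(url)
            last_ts = ts
        sessions.append(current)
        result[ip] = sessions
    return result


def user_page_matrix(sessions):
    """Rows are users, columns are pages, entries are raw visit counts."""
    users = sorted(sessions)
    pages = sorted({url for s in sessions.values() for sess in s for url in sess})
    col = {p: j for j, p in enumerate(pages)}
    matrix = [[0] * len(pages) for _ in users]
    for i, user in enumerate(users):
        for sess in sessions[user]:
            for url in sess:
                matrix[i][col[url]] += 1
    return users, pages, matrix


users, pages, matrix = user_page_matrix(sessions_by_user(clean(RECORDS)))
```

Each row of `matrix` describes one user and each column one page, so the same table can be read row-wise for user clustering or column-wise for page clustering.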
Research on Application of Fuzzy C-Means Algorithm in Web Usage Mining

WU Ying, WANG Qiu-sheng. Research on Application of Fuzzy C-Means Algorithm in Web Usage Mining[J]. Microcomputer Development, 2008(6): 32-35.

Authors:	WU Ying; WANG Qiu-sheng

Affiliation:	School of Automation Science and Electrical Engineering, Beihang University, Beijing 100083, China

Abstract:	Web logs contain a large amount of user browsing information. Clustering similar users and related pages is a necessary step toward building adaptive web sites. Preprocessing is used to perform log cleaning, user identification, session identification, and data reduction, which yields a user-page sequence database. The frequency of each user's page visits is also added to the database. From this prepared data, an association matrix is built and used as the input to an improved fuzzy c-means algorithm, which clusters similar users and related pages. Experimental results show that the algorithm is effective.

Keywords:	fuzzy c-means algorithm; Web log data preparation; association matrix; user clustering; page clustering
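
To make the clustering step concrete, here is a minimal sketch of the standard fuzzy c-means iteration applied to a small user-page matrix. It is the textbook algorithm, not the improved variant the paper reports, whose modifications are not described on this page. The matrix values, the function name `fcm`, and the parameters `m`, `iters`, `tol`, and `seed` are all assumptions chosen for illustration.

```python
import random


def fcm(data, c, m=2.0, iters=100, tol=1e-6, seed=0):
    """Standard fuzzy c-means. Returns (centers, memberships).

    memberships[k][i] is the degree to which row k belongs to cluster i;
    every row of memberships sums to 1.
    """
    rng = random.Random(seed)
    n, dim = len(data), len(data[0])
    # Random initial memberships, each row normalised to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([v / total for v in row])
    centers = []
    for _ in range(iters):
        # Center update: mean of the data weighted by membership**m.
        centers = []
        for i in range(c):
            w = [u[k][i] ** m for k in range(n)]
            total = sum(w)
            centers.append(
                [sum(w[k] * data[k][j] for k in range(n)) / total for j in range(dim)]
            )
        # Membership update from squared distances to every center.
        new_u = []
        for k in range(n):
            d2 = [
                sum((data[k][j] - centers[i][j]) ** 2 for j in range(dim))
                for i in range(c)
            ]
            if min(d2) == 0.0:
                # The point sits exactly on a center: give it full membership there.
                zeros = [i for i in range(c) if d2[i] == 0.0]
                row = [1.0 / len(zeros) if i in zeros else 0.0 for i in range(c)]
            else:
                inv = [d ** (-1.0 / (m - 1.0)) for d in d2]
                total = sum(inv)
                row = [v / total for v in inv]
            new_u.append(row)
        change = max(abs(new_u[k][i] - u[k][i]) for k in range(n) for i in range(c))
        u = new_u
        if change < tol:
            break
    return centers, u


# Hypothetical user-page matrix: 4 users (rows) by 4 pages (columns), visit counts.
MATRIX = [
    [5, 4, 0, 0],
    [4, 5, 0, 1],
    [0, 0, 5, 4],
    [0, 1, 4, 5],
]

centers, memberships = fcm(MATRIX, c=2)
# Harden the fuzzy result by taking each user's strongest cluster.
labels = [row.index(max(row)) for row in memberships]
```

Because memberships are fractional, a user who browses two topic areas can belong partly to both clusters, which a hard clustering would hide. Clustering pages instead of users could plausibly reuse the same routine on the transposed matrix, for example `fcm([list(col) for col in zip(*MATRIX)], c=2)`, though whether the paper does it this way is not stated here.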
This article is indexed in CNKI, VIP, and other databases.
