Web Document Clustering Based on Web Log Mining (基于Web日志挖掘的Web文档聚类)

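Cite this article:	高哲, 魏海平, 王福威, 赵晓碧. Web document clustering based on Web log mining (基于Web日志挖掘的Web文档聚类) [J]. 计算机工程与设计 (Computer Engineering and Design), 2008, 29(18).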

Authors:	高哲, 魏海平, 王福威, 赵晓碧

Affiliation:	School of Computer and Communication Engineering, Liaoning Shihua University (辽宁石油化工大学), Fushun, Liaoning 113001

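Abstract:	Web log mining is a branch of Web mining. This paper introduces the general process of Web log mining, studies the k-means clustering algorithm, and analyzes its shortcomings. In every iteration, k-means must compute the distance from each data object to the cluster centroids, which makes clustering inefficient. To address this problem, an improved k-means algorithm is proposed that avoids repeatedly computing the distances from data objects to the cluster centroids, and both algorithms are used to cluster Web documents. Experimental results show that the improved algorithm increases clustering efficiency.
Keywords:	log mining; Web log; document clustering; log preprocessing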
Web document clustering based on web-log mining

GAO Zhe, WEI Hai-ping, WANG Fu-wei, ZHAO Xiao-bi. Web document clustering based on web-log mining [J]. Computer Engineering and Design, 2008, 29(18).

Authors:	GAO Zhe, WEI Hai-ping, WANG Fu-wei, ZHAO Xiao-bi

Affiliation:	School of Computer and Communication Engineering, Liaoning Shihua University, Fushun 113001, China

Abstract:	Web log mining is a branch of Web mining. The general process of Web log mining and the k-means algorithm are introduced, and the shortcomings of k-means are analyzed. In each iteration, k-means must compute the distance between every data object and the cluster centers, which lowers its efficiency. To address this problem, an enhanced k-means algorithm is put forward that avoids repeatedly computing the distance between every data object and the cluster centers. Web document clustering is implemented with ...

Keywords:	k-means
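
The abstract does not say how the enhanced algorithm avoids the repeated distance computations. As an illustration only, the Python sketch below contrasts standard k-means with one commonly used shortcut, which is an assumption here and not necessarily the authors' method: each object remembers the distance to its assigned center, and if the updated center of its own cluster is no farther away than that, the object keeps its cluster and the distances to the other centers are skipped. The function names, the `skip` flag, and the sample points are hypothetical. The shortcut is a heuristic and is not guaranteed to produce the same partition as standard k-means on every data set.

```python
import math

def dist(a, b):
    """Euclidean distance between two equal-length tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def mean(points):
    """Component-wise mean of a non-empty list of tuples."""
    n = len(points)
    return tuple(sum(c) / n for c in zip(*points))

def kmeans(points, centers, skip=False, max_iter=100):
    """Return (labels, centers, number_of_distance_computations).

    skip=False: standard k-means, every object is compared with all centers.
    skip=True:  assumed shortcut; an object whose own center did not move
                farther away keeps its label without further comparisons.
    """
    centers = [tuple(c) for c in centers]
    k = len(centers)
    labels = [None] * len(points)
    best = [math.inf] * len(points)  # distance to the assigned center
    count = 0
    for _ in range(max_iter):
        changed = False
        for i, p in enumerate(points):
            cur = labels[i]
            if skip and cur is not None:
                d = dist(p, centers[cur])
                count += 1
                if d <= best[i]:
                    best[i] = d
                    continue  # keep the label, skip the other k - 1 distances
                # Reuse d and compute only the remaining k - 1 distances.
                ds = [d if j == cur else dist(p, centers[j]) for j in range(k)]
                count += k - 1
            else:
                ds = [dist(p, c) for c in centers]
                count += k
            j = min(range(k), key=ds.__getitem__)
            if j != cur:
                changed = True
            labels[i], best[i] = j, ds[j]
        if not changed:
            break  # assignments are stable, so the centers are too
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous center
                centers[j] = mean(members)
    return labels, centers, count

# Hypothetical 2-D feature vectors standing in for Web documents.
points = [(1, 1), (1.5, 2), (1, 0.5), (8, 8), (9, 9), (8, 9.5)]
plain_labels, _, plain_count = kmeans(points, [(1, 1), (8, 8)])
fast_labels, _, fast_count = kmeans(points, [(1, 1), (8, 8)], skip=True)
```

On this small example both variants reach the same clusters, and the shortcut performs fewer distance computations. How large the saving is on real Web-log data would have to be measured.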
This article is indexed in CNKI, VIP (维普), Wanfang Data (万方数据), and other databases.
