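Weblog mining based on MapReduce

Cite this article:	LI Bin, LIU Lili. Weblog mining based on MapReduce[J]. Computer Engineering and Applications, 2012, 48(22): 95-98.

Authors:	LI Bin, LIU Lili

Affiliation:	School of Computer Science and Technology, China University of Mining and Technology, Xuzhou, Jiangsu 221116, China

Abstract:	Web data mining systems that run on a single CPU node hit a computational bottleneck when mining massive Web data sources. To address this, the paper combines the distributed processing and virtualization advantages of cloud computing with the parallelism of the ant colony algorithm to design a Web log mining algorithm based on the Map/Reduce architecture. To further verify the algorithm's efficiency, a Hadoop platform is set up and the algorithm is used to mine users' preferred access paths from Web logs. Experimental results show that making full use of the cluster's distributed computing capacity to process large volumes of Web log files can greatly improve the efficiency of Web data mining.
Keywords:	cloud computing; Map/Reduce; Hadoop platform; Web log mining; ant colony algorithm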
Weblog mining based on MapReduce

LI Bin, LIU Lili. Weblog mining based on MapReduce[J]. Computer Engineering and Applications, 2012, 48(22): 95-98.

Authors:	LI Bin, LIU Lili

Affiliation:	School of Computer Science and Technology, China University of Mining and Technology, Xuzhou, Jiangsu 221116, China

Abstract:	Web data mining systems that run on a single CPU node hit a computational bottleneck when processing massive Web data. Exploiting the distributed processing and virtualization of cloud computing together with the parallelism of the ant colony algorithm, this paper presents a Web log mining algorithm based on the Map/Reduce framework. To verify its efficiency, the algorithm is run on a Hadoop platform to mine users' preferred access paths from Web logs. Experimental results show that using the cluster's distributed computing capacity to process large numbers of Web log files can significantly improve the efficiency of Web data mining.

Keywords:	cloud computing; Map/Reduce; Hadoop platform; Web log mining; ant colony algorithm
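
The abstract does not spell out the algorithm, so the following is only an illustrative sketch of how a Map/Reduce-style pass over Web logs can surface frequently followed page transitions. It is not the authors' ant colony method and does not use Hadoop; the map, shuffle, and reduce phases are simulated in-process in Python. The record layout `(user_id, timestamp, url)`, the sample data, and all function names are assumptions made for illustration. Grouping visits by user inside the mapper is a simplification; on a real cluster that grouping would itself be done by the shuffle or by an earlier job.

```python
from collections import defaultdict
from itertools import groupby

# Hypothetical log records: (user_id, timestamp, url). Not from the paper.
LOG = [
    ("u1", 1, "/home"), ("u1", 2, "/news"), ("u1", 3, "/sports"),
    ("u2", 1, "/home"), ("u2", 2, "/news"),
    ("u3", 1, "/home"), ("u3", 2, "/shop"),
]

def map_phase(records):
    """Emit ((from_url, to_url), 1) for each consecutive page pair of a user."""
    ordered = sorted(records, key=lambda r: (r[0], r[1]))
    for _, visits in groupby(ordered, key=lambda r: r[0]):
        urls = [url for _, _, url in visits]
        for src, dst in zip(urls, urls[1:]):
            yield (src, dst), 1

def shuffle(pairs):
    """Group emitted values by key, as the framework would between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Sum the counts for each (from_url, to_url) transition."""
    return {key: sum(values) for key, values in groups.items()}

def preferred_next(counts):
    """For each source page, keep the most frequently followed next page."""
    best = {}
    for (src, dst), n in counts.items():
        if src not in best or n > best[src][1]:
            best[src] = (dst, n)
    return best

counts = reduce_phase(shuffle(map_phase(LOG)))
preferred = preferred_next(counts)
```

With the sample records above, `counts` holds the number of times each page-to-page transition occurs, and `preferred` maps each page to its most common successor. Chaining such successors gives a crude stand-in for a "preferred access path"; the paper's approach relies on an ant colony algorithm instead.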
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.