
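A new session identification method based on MapReduce (基于MapReduce的新会话识别方法)
Cite this article: HUANG Wei-jian, SONG Yuan-yuan. A new session identification method based on MapReduce [J]. Computer Engineering & Science (计算机工程与科学), 2016, 38(3): 425-430.
Authors: HUANG Wei-jian (黄伟建), SONG Yuan-yuan (宋园园)
Affiliation: 1. School of Information and Electrical Engineering, Hebei University of Engineering
Funding: Natural Science Foundation of Hebei Province (F2015402077); Key Science and Technology Research Project of Higher Education Institutions of Hebei Province (ZD2014054)
Abstract: Web log preprocessing has drawn growing attention because of the importance of its output. At the same time, Hadoop-based distributed processing of massive data has been widely studied and applied, so using MapReduce for Web log preprocessing has become an inevitable trend. To improve the accuracy of session identification, and building on an analysis of current research on session identification algorithms, this paper proposes a new session identification method that combines network topology with a dynamic threshold, and discusses its advantages. The new method is then implemented as a distributed process with the MapReduce model. Finally, comparative experiments verify that the MapReduce implementation of the new algorithm is efficient and highly accurate.

Keywords: Web log preprocessing; session identification; MapReduce; distributed processing
Received: 2015-04-27
Revised: 2016-03-25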

A new session identification method based on MapReduce
HUANG Wei-jian, SONG Yuan-yuan. A new session identification method based on MapReduce [J]. Computer Engineering & Science, 2016, 38(3): 425-430.
Authors: HUANG Wei-jian, SONG Yuan-yuan
Affiliation: School of Information and Electrical Engineering, Hebei University of Engineering, Handan 056038, China
Abstract: Web log preprocessing is attracting growing attention because of the importance of its output. Meanwhile, Hadoop-based distributed processing of massive data is being widely studied and applied, so using MapReduce for Web log preprocessing has become an inevitable trend. To improve the accuracy of session identification, we analyze the current state of research on session identification algorithms and propose a new method for identifying user sessions that combines network topology with a dynamic threshold, and we discuss its advantages. We then use the MapReduce model to implement distributed processing of the new method. Experimental results demonstrate the high efficiency and accuracy of the proposed method.
Keywords:Web log preprocessing  session identification  MapReduce  distributed processing  
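
Illustrative sketch (not from the paper): the abstract names three ingredients, a per-user grouping suited to MapReduce, a dynamic time threshold, and a check against the site's link topology, but gives no formulas. The plain-Python sketch below shows one way these pieces could fit together. Everything in it is an assumption made for illustration: the record layout `(user, timestamp, url)`, the `links` adjacency map, the function names, and in particular the threshold rule (a clamped multiple of the user's mean gap between requests), which is a hypothetical stand-in and not the authors' rule. An in-memory dictionary stands in for Hadoop's shuffle phase.

```python
from collections import defaultdict
from statistics import mean


def map_record(record):
    """Map step: key each log record by user so one reducer sees that user's visits."""
    user, timestamp, url = record
    return user, (timestamp, url)


def dynamic_threshold(gaps, factor=3.0, floor=60.0, ceiling=1800.0):
    """Hypothetical rule: a multiple of the user's mean request gap, clamped to [floor, ceiling]."""
    if not gaps:
        return ceiling
    return max(floor, min(ceiling, factor * mean(gaps)))


def reduce_user(user, visits, links):
    """Reduce step: sort one user's visits by time and split them into sessions."""
    visits = sorted(visits)
    gaps = [b[0] - a[0] for a, b in zip(visits, visits[1:])]
    threshold = dynamic_threshold(gaps)
    sessions = []
    for timestamp, url in visits:
        if sessions:
            current = sessions[-1]
            timed_out = timestamp - current[-1][0] > threshold
            # Topology check: the page is already in the session or is linked
            # from a page in the session (assumed interpretation).
            reachable = any(url == p or url in links.get(p, ()) for _, p in current)
            if not timed_out and reachable:
                current.append((timestamp, url))
                continue
        sessions.append([(timestamp, url)])
    return user, sessions


def identify_sessions(records, links):
    """Run map, group by key (standing in for the shuffle), then reduce per user."""
    grouped = defaultdict(list)
    for record in records:
        user, value = map_record(record)
        grouped[user].append(value)
    return dict(reduce_user(user, visits, links) for user, visits in grouped.items())
```

Keying by user in the map step means each reducer can process one user's history independently, which is what makes this kind of session splitting straightforward to distribute.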