
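A Session Identification Method Based on a Dynamic Time Threshold
Cite this article: Dai Zhili, Wang Xinyu. A Session Identification Method Based on a Dynamic Time Threshold[J]. Computer Applications and Software, 2010, 27(2): 244-246
Authors: Dai Zhili, Wang Xinyu
Affiliation: College of Information Science and Engineering, Yanshan University, Qinhuangdao, Hebei 066004
Abstract: Session identification is a key step in Web log mining, and its quality directly affects the accuracy of subsequent mining. Building on the fixed time threshold of the Timeout method, this paper proposes a dynamic time threshold: sample logs are analysed to obtain a time threshold for each period of the day. When a log file is processed, the threshold is selected according to the access time of the record that starts the current session. Experiments show that the method identifies sessions with noticeably higher quality than the Timeout method.

Keywords: Web log mining; data preprocessing; session identification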

METHOD OF SESSIONS IDENTIFICATION BASED ON DYNAMIC THRESHOLD OF TIME
Dai Zhili, Wang Xinyu. METHOD OF SESSIONS IDENTIFICATION BASED ON DYNAMIC THRESHOLD OF TIME[J]. Computer Applications and Software, 2010, 27(2): 244-246
Authors: Dai Zhili, Wang Xinyu
Affiliation: College of Information Science and Engineering, Yanshan University, Qinhuangdao 066004, Hebei, China
Abstract: Session identification is a key step in Web log mining, and its quality directly affects the accuracy of subsequent mining. Building on the fixed time threshold of the Timeout method, this paper proposes a dynamic time threshold. Thresholds for different time periods are obtained by analysing sample logs. When a log file is processed to identify sessions, the time threshold is selected according to the access time of the first record in the current session. The quality of sessions iden...
Keywords: Web log mining; data pre-processing; session identification
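
The idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that a session is split when the gap between two consecutive requests from one user exceeds the threshold, and that the threshold is fixed when the session starts, based on the hour of its first record. The period boundaries and threshold values in `PERIOD_THRESHOLDS`, and the function names, are hypothetical; the paper derives its own values from sample logs, and the abstract does not list them.

```python
from datetime import datetime, timedelta

# Hypothetical thresholds per time period: (start hour, end hour, threshold).
# These numbers are placeholders, not values from the paper.
PERIOD_THRESHOLDS = [
    (0, 8, timedelta(minutes=10)),
    (8, 18, timedelta(minutes=30)),
    (18, 24, timedelta(minutes=20)),
]


def threshold_for(session_start):
    """Pick the threshold for the period containing the session's first record."""
    for start_hour, end_hour, threshold in PERIOD_THRESHOLDS:
        if start_hour <= session_start.hour < end_hour:
            return threshold
    raise ValueError("no period covers hour %d" % session_start.hour)


def identify_sessions(timestamps):
    """Split one user's request timestamps into sessions.

    Assumption: a new session starts when the gap to the previous request
    exceeds the threshold chosen for the current session.
    """
    sessions = []
    current = []
    threshold = None
    for ts in sorted(timestamps):
        if current and ts - current[-1] <= threshold:
            current.append(ts)
        else:
            if current:
                sessions.append(current)
            current = [ts]
            # The threshold depends on when the new session begins.
            threshold = threshold_for(ts)
    if current:
        sessions.append(current)
    return sessions


if __name__ == "__main__":
    day = datetime(2010, 1, 1)
    requests = [
        day.replace(hour=7, minute=55),
        day.replace(hour=8, minute=10),
        day.replace(hour=8, minute=35),
        day.replace(hour=9, minute=10),
    ]
    for session in identify_sessions(requests):
        print([t.strftime("%H:%M") for t in session])
```

Replacing `PERIOD_THRESHOLDS` with a single entry covering hours 0 to 24 reduces the sketch to a fixed-threshold Timeout variant, which makes the two approaches easy to compare on the same log.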
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.